Issues with structured output
Hi, I am running the model with vllm-openai and trying to generate structured outputs using the Responses endpoint.
from pydantic import BaseModel

class ResponseFormat(BaseModel):
    pros: str
    cons: str

res = llm_client.responses.parse(
    model="QuantTrio/Qwen3-VL-32B-Thinking-AWQ",
    input=[
        {"role": "user", "content": "in pros write \"ok\" and in cons write \"not ok\""}
    ],
    text_format=ResponseFormat,
    temperature=0,
)
I consistently get errors saying that the JSON is invalid:
ValidationError: 1 validation error for ResponseFormat
Invalid JSON: key must be a string at line 4 column 1 [type=json_invalid, input_value='\n\n{\n{\n "\nproprss":...":"\n " "not\n } \n', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/json_invalid
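The position in the error (line 4, column 1) lines up with the second `{` in the truncated `input_value`, so the text already fails as JSON before any schema check happens. A minimal sketch for testing a raw reply string locally, using only the standard-library `json` module as a stand-in for the Pydantic validation (its error wording differs from Pydantic's). The helper name `check_raw_output` and the sample strings are hypothetical; the second sample is just the visible prefix of the `input_value` above.

```python
import json
from typing import Optional, Tuple

# Field names taken from the ResponseFormat model in the post.
REQUIRED_KEYS = ("pros", "cons")


def check_raw_output(raw: str) -> Tuple[Optional[dict], Optional[str]]:
    """Return (data, None) if raw is a JSON object with the required keys,
    otherwise (None, reason)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # lineno/colno are 1-based positions of the first parse failure.
        return None, f"invalid JSON at line {exc.lineno} column {exc.colno}: {exc.msg}"
    if not isinstance(data, dict):
        return None, "top-level value is not an object"
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        return None, f"missing keys: {missing}"
    return data, None


# Hypothetical well-formed reply.
print(check_raw_output('{"pros": "ok", "cons": "not ok"}'))
# Visible prefix of the malformed reply from the error message.
print(check_raw_output("\n\n{\n{\n"))
```

Running this on the raw text returned by the server (rather than on the parsed object) separates "the model emitted broken JSON" from "the JSON was fine but did not match the schema".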
My vLLM command looks like this:
vllm serve QuantTrio/Qwen3-VL-32B-Thinking-AWQ --reasoning-parser deepseek_r1 --quantization awq_marlin --trust-remote-code --enable-chunked-prefill --max_num_batched_tokens "16384" --max_model_len "49152" --gpu_memory_utilization "0.95" --async-scheduling --dtype half --kv_cache_dtype auto --max_num_seqs "16" --limit-mm-per-prompt.video "0"
Thank you in advance for your help.
llm_client is an OpenAI client connected to the vLLM container.
According to the vLLM guide:
from pydantic import BaseModel
from enum import Enum

class CarType(str, Enum):
    sedan = "sedan"
    suv = "SUV"
    truck = "Truck"
    coupe = "Coupe"

class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: CarType

json_schema = CarDescription.model_json_schema()

completion = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "user",
            "content": "Generate a JSON with the brand, model and car_type of the most iconic car from the 90's",
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "car-description",
            "schema": CarDescription.model_json_schema()
        },
    },
)
print(completion.choices[0].message.content)
"responses" is the new OpenAI endpoint that simplifies casting responses to Pydantic objects. The latest versions of vLLM support it.
It does not seem to work with the chat completions endpoint either:
class ResponseFormat(BaseModel):
    pros: str
    cons: str

completion = llm_client.chat.completions.create(
    model="QuantTrio/Qwen3-VL-32B-Thinking-AWQ",
    messages=[
        {"role": "user", "content": "in pros write \"ok\" and in cons write \"not ok\""}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "ResponseFormat",
            "schema": ResponseFormat.model_json_schema()
        }
    },
    temperature=0,
)
reply:
reasoning_content:
print(completion.choices[0].message.reasoning_content)
We are given a task: in pros write "ok" and in cons write "not ok"
However, note that the instruction is to write in the "pros" section "ok" and in the "cons" section "not ok".
But the problem is: we are to output the string "ok" for pros and "not ok" for cons.
However, the instruction is a bit ambiguous. Let me re-read:
"in pros write "ok" and in cons write "not ok""
So, we are to output two lines:
First line: "pros: ok"
Second line: "cons: not ok"
But note: the instruction says "in pros" meaning in the section for pros, we write "ok", and similarly for cons.
However, the problem does not specify the format. But typically, we might write:
pros: ok
cons: not ok
But the instruction says "in pros write 'ok'", meaning that the word "ok" is to be written in the pros section.
How about we output:
pros: ok
cons: not ok
However, the problem says "in pros" and "in cons", so we are to write the string "ok" in the pros part and "not ok" in the cons part.
But note: the instruction does not say to write "pros: " and "cons: ", it just says "in pros" and "in cons".
However, to make it clear, we should output two lines: one for pros and one for cons.
Since the problem is very simple, I think the expected output is:
ok
not ok
But wait, the instruction says "in pros" and "in cons", meaning that the first line is for pros and the second for cons.
However, the problem does not specify that we have to label them. But to be safe, let's see:
The instruction: "in pros write 'ok' and in cons write 'not ok'"
So, if we are to write in the pros section (which is the first part) the string "ok", and in the cons section (the second part) the string "not ok", then:
We output:
ok
not ok
But note: the problem says "in pros" and "in cons", meaning that the context is that we are writing in two different sections.
However, the problem does not require the labels "pros" and "cons" to be written. It just says to write "ok" in the pros section and "not ok" in the cons section.
How about we interpret it as:
The pros section is the first line, and we write "ok" there.
The cons section is the second line, and we write "not ok" there.
So, the output should be two lines:
ok
not ok
But to be safe, let's check: the problem says "in pros write 'ok'", meaning that the content of the pros section is "ok", and similarly for cons.
Therefore, the output is:
ok
not ok
However, note that the problem says "in pros" and "in cons", so if we are to write the entire response, we are to write two lines: the first line is the pros (which is "ok") and the second line is the cons (which is "not ok").
So, the answer is:
ok
not ok
But wait, the problem says "in pros" and "in cons", meaning that the user is to write in the pros section (which is the first part of the response) the string "ok", and in the cons section (the second part) the string "not ok".
Therefore, we output:
ok
not ok
However, the problem does not specify that the response must have two lines. But the context of "pros" and "cons" implies two separate parts.
Since the instruction is to write in the pros section "ok" and in the cons section "not ok", and the problem does not specify the format, the minimal and most direct way is to output two lines.
Let's do:
First line: "ok"
Second line: "not ok"
So, the code (if we were to write a program) would output:
print("ok")
print("not ok")
But since we are to write the response, we write:
ok
not ok
However, note: the problem says "in pros" and "in cons", so if we are to write the entire response, we are to write two lines: the first line is the pros (which is "ok") and the second line is the cons (which is "not ok").
Therefore, the final answer is:
ok
not ok
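The reasoning above ends in two plain lines rather than JSON, so it may help to check which field the final text landed in and whether `content` parses at all. A minimal sketch, assuming the message object exposes `content` and `reasoning_content` as in the snippet above. The helper `inspect_message` and the stand-in object are hypothetical and only illustrate the check:

```python
import json
from types import SimpleNamespace


def inspect_message(message) -> dict:
    """Summarize where the model's text ended up and whether content is JSON."""
    # Either field may be missing or None, so fall back to an empty string.
    content = getattr(message, "content", None) or ""
    reasoning = getattr(message, "reasoning_content", None) or ""
    try:
        json.loads(content)
        content_is_json = True
    except json.JSONDecodeError:
        content_is_json = False
    return {
        "content_chars": len(content),
        "reasoning_chars": len(reasoning),
        "content_is_json": content_is_json,
    }


# Hypothetical stand-in for completion.choices[0].message.
fake_message = SimpleNamespace(
    content="",
    reasoning_content="We are given a task: ...",
)
print(inspect_message(fake_message))
```

With a real response, the call would be `inspect_message(completion.choices[0].message)`. If `content` is empty or unparseable while `reasoning_content` is long, the structured part of the reply is probably not reaching `content` in a usable form.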
