Issues with structured output

#1
by cormak - opened

Hi, I am running the model in vllm-openai and trying to generate structured outputs using the responses endpoint.

class ResponseFormat(BaseModel):
    pros: str
    cons: str

res=llm_client.responses.parse(
    model="QuantTrio/Qwen3-VL-32B-Thinking-AWQ",
    input=[
        {"role": "user", "content": "in pros write \"ok\" and in cons write \"not ok\""}
    ],
    text_format=ResponseFormat,
    temperature=0,
)

I consistently get errors saying that the JSON is invalid.

ValidationError: 1 validation error for ResponseFormat
Invalid JSON: key must be a string at line 4 column 1 [type=json_invalid, input_value='\n\n{\n{\n "\nproprss":...":"\n " "not\n } \n', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/json_invalid
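For anyone who wants to reproduce this class of error without a running server, here is a minimal sketch. It assumes pydantic v2, and the `malformed` string is a hypothetical stand-in (not the actual model output) that puts a second `{` where a key is expected:

```python
from pydantic import BaseModel, ValidationError


class ResponseFormat(BaseModel):
    pros: str
    cons: str


# Hypothetical stand-in for the generated text: a second "{" appears
# where the JSON parser expects a string key.
malformed = '\n\n{\n{\n "pros": "ok"'


def try_parse(raw: str):
    """Return (model, None) on success or (None, list_of_error_types) on failure."""
    try:
        return ResponseFormat.model_validate_json(raw), None
    except ValidationError as exc:
        return None, [err["type"] for err in exc.errors()]


parsed, error_types = try_parse(malformed)
print(parsed, error_types)
```

The error type matches the `type=json_invalid` tag in the traceback above, which suggests the failure is in the generated text itself rather than in the schema.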

my vllm command looks like this:

vllm serve QuantTrio/Qwen3-VL-32B-Thinking-AWQ --reasoning-parser deepseek_r1 --quantization awq_marlin --trust-remote-code --enable-chunked-prefill --max_num_batched_tokens "16384" --max_model_len "49152" --gpu_memory_utilization "0.95" --async-scheduling --dtype half --kv_cache_dtype auto --max_num_seqs "16" --limit-mm-per-prompt.video "0"

thank you in advance for your help

llm_client is an OpenAI client connecting to the vLLM container

QuantTrio org

According to the vLLM guide:

from pydantic import BaseModel
from enum import Enum

class CarType(str, Enum):
    sedan = "sedan"
    suv = "SUV"
    truck = "Truck"
    coupe = "Coupe"

class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: CarType

json_schema = CarDescription.model_json_schema()

completion = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "user",
            "content": "Generate a JSON with the brand, model and car_type of the most iconic car from the 90's",
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "car-description",
            "schema": CarDescription.model_json_schema()
        },
    },
)
print(completion.choices[0].message.content)
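The printed content is a JSON string, so it can be validated against the same model on the client side. A minimal sketch, assuming pydantic v2; the `content` value here is a hypothetical stand-in for `completion.choices[0].message.content`, not a real server reply:

```python
from enum import Enum

from pydantic import BaseModel


class CarType(str, Enum):
    sedan = "sedan"
    suv = "SUV"
    truck = "Truck"
    coupe = "Coupe"


class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: CarType


# Hypothetical stand-in for completion.choices[0].message.content
content = '{"brand": "Toyota", "model": "Supra", "car_type": "Coupe"}'

# Parses the JSON and checks it against the schema in one step;
# raises pydantic.ValidationError if the text is not valid JSON
# or does not match the model.
car = CarDescription.model_validate_json(content)
print(car.brand, car.model, car.car_type.value)
```

If this step raises on real output, the server returned text that does not conform to the schema.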

"responses" is the newer OpenAI endpoint that simplifies casting replies to pydantic objects, and recent versions of vLLM support it.
It does not seem to work with the chat completions endpoint either:

class ResponseFormat(BaseModel):
    pros: str
    cons: str

completion=llm_client.chat.completions.create(
    model="QuantTrio/Qwen3-VL-32B-Thinking-AWQ",
    messages=[
        {"role": "user", "content": "in pros write \"ok\" and in cons write \"not ok\""}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "ResponseFormat",
            "schema": ResponseFormat.model_json_schema()
            
        }
    },
    temperature=0,
)

reply:

[image]

reasoning_content:
print(completion.choices[0].message.reasoning_content)

We are given a task: in pros write "ok" and in cons write "not ok"
 However, note that the instruction is to write in the "pros" section "ok" and in the "cons" section "not ok".
 But the problem is: we are to output the string "ok" for pros and "not ok" for cons.

 However, the instruction is a bit ambiguous. Let me re-read:

 "in pros write "ok" and in cons write "not ok""

 So, we are to output two lines:
   First line: "pros: ok"
   Second line: "cons: not ok"

 But note: the instruction says "in pros" meaning in the section for pros, we write "ok", and similarly for cons.

 However, the problem does not specify the format. But typically, we might write:

   pros: ok
   cons: not ok

 But the instruction says "in pros write 'ok'", meaning that the word "ok" is to be written in the pros section.

 How about we output:

   pros: ok
   cons: not ok

 However, the problem says "in pros" and "in cons", so we are to write the string "ok" in the pros part and "not ok" in the cons part.

 But note: the instruction does not say to write "pros: " and "cons: ", it just says "in pros" and "in cons". 
 However, to make it clear, we should output two lines: one for pros and one for cons.

 Since the problem is very simple, I think the expected output is:

   ok
   not ok

 But wait, the instruction says "in pros" and "in cons", meaning that the first line is for pros and the second for cons.

 However, the problem does not specify that we have to label them. But to be safe, let's see:

 The instruction: "in pros write 'ok' and in cons write 'not ok'"

 So, if we are to write in the pros section (which is the first part) the string "ok", and in the cons section (the second part) the string "not ok", then:

   We output:
        ok
        not ok

 But note: the problem says "in pros" and "in cons", meaning that the context is that we are writing in two different sections.

 However, the problem does not require the labels "pros" and "cons" to be written. It just says to write "ok" in the pros section and "not ok" in the cons section.

 How about we interpret it as:

   The pros section is the first line, and we write "ok" there.
   The cons section is the second line, and we write "not ok" there.

 So, the output should be two lines:

   ok
   not ok

 But to be safe, let's check: the problem says "in pros write 'ok'", meaning that the content of the pros section is "ok", and similarly for cons.

 Therefore, the output is:

   ok
   not ok

 However, note that the problem says "in pros" and "in cons", so if we are to write the entire response, we are to write two lines: the first line is the pros (which is "ok") and the second line is the cons (which is "not ok").

 So, the answer is:

   ok
   not ok

 But wait, the problem says "in pros" and "in cons", meaning that the user is to write in the pros section (which is the first part of the response) the string "ok", and in the cons section (the second part) the string "not ok".

 Therefore, we output:

   ok
   not ok

 However, the problem does not specify that the response must have two lines. But the context of "pros" and "cons" implies two separate parts.

 Since the instruction is to write in the pros section "ok" and in the cons section "not ok", and the problem does not specify the format, the minimal and most direct way is to output two lines.

 Let's do:

   First line: "ok"
   Second line: "not ok"

 So, the code (if we were to write a program) would output:

   print("ok")
   print("not ok")

 But since we are to write the response, we write:

   ok
   not ok

 However, note: the problem says "in pros" and "in cons", so if we are to write the entire response, we are to write two lines: the first line is the pros (which is "ok") and the second line is the cons (which is "not ok").

 Therefore, the final answer is:

   ok
   not ok
