Learn how to use JSON mode to get structured outputs from LLMs like DeepSeek V3 & Llama 3.3.
JSON mode is enabled through the `response_format` key of the Chat Completions API. The following models support JSON mode:
openai/gpt-oss-120b
openai/gpt-oss-20b
moonshotai/Kimi-K2-Instruct
zai-org/GLM-4.5-Air-FP8
Qwen/Qwen3-235B-A22B-Thinking-2507
Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8
Qwen/Qwen3-235B-A22B-Instruct-2507-tput
deepseek-ai/DeepSeek-R1
deepseek-ai/DeepSeek-R1-0528-tput
deepseek-ai/DeepSeek-V3
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
Qwen/Qwen2.5-72B-Instruct-Turbo
Qwen/Qwen2.5-VL-72B-Instruct
meta-llama/Llama-4-Scout-17B-16E-Instruct
meta-llama/Llama-3.3-70B-Instruct-Turbo
deepcogito/cogito-v2-preview-llama-70B
deepcogito/cogito-v2-preview-llama-109B-MoE
deepcogito/cogito-v2-preview-llama-405B
deepcogito/cogito-v2-preview-deepseek-671b
deepseek-ai/DeepSeek-R1-Distill-Llama-70B
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
marin-community/marin-8b-instruct
meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
meta-llama/Llama-3.3-70B-Instruct-Turbo-Free
Qwen/Qwen2.5-7B-Instruct-Turbo
Qwen/Qwen2.5-Coder-32B-Instruct
Qwen/QwQ-32B
Qwen/Qwen3-235B-A22B-fp8-tput
arcee-ai/coder-large
meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
meta-llama/Llama-3.2-3B-Instruct-Turbo
meta-llama/Meta-Llama-3-8B-Instruct-Lite
meta-llama/Llama-3-70b-chat-hf
google/gemma-3n-E4B-it
mistralai/Mistral-7B-Instruct-v0.1
mistralai/Mistral-7B-Instruct-v0.2
mistralai/Mistral-7B-Instruct-v0.3
arcee_ai/arcee-spotlight
To use JSON mode, first define the JSON schema you want the model to follow, then pass that schema to the model via the `response_format` key.
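For example, you can build the schema with Pydantic. This is a minimal sketch: the `User` model and its fields are illustrative, and it assumes the `response_format` value takes the form `{"type": "json_object", "schema": ...}`.

```python
from pydantic import BaseModel

# Illustrative schema describing the object we want the model to return.
class User(BaseModel):
    name: str
    address: str

# The generated JSON schema is what gets passed in the `response_format` key.
response_format = {
    "type": "json_object",
    "schema": User.model_json_schema(),
}
```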
Finally, and this is important, we need to instruct the model to respond only in JSON format and include details of the schema we want it to use. This ensures the model actually follows the schema we provide when generating its response; any instructions embedded in the schema itself will not be followed by the LLM.
Important: You must always instruct your model to only respond in JSON format, either in the system prompt or a user message, in addition to passing your schema to the `response_format` key.
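In practice, that instruction can be a short system message alongside the user prompt (the wording below is illustrative):

```python
messages = [
    {
        "role": "system",
        # Tells the model to emit only JSON that matches the schema we pass separately.
        "content": "Only respond in JSON that matches the provided schema.",
    },
    {"role": "user", "content": "Create a user named Alice who lives at 42 Wonderland Avenue."},
]
```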
Let’s see what this looks like:
For this example, we'll use DeepSeek-R1-0528.
Below, we ask the model to solve a math problem step by step, showing its work:
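Here is a sketch of what that request might look like with the Together Python SDK. The `MathSolution` schema, the prompts, and the exact `response_format` shape are assumptions for illustration; check the client reference for the current signature.

```python
from pydantic import BaseModel
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

# Illustrative schema: each reasoning step plus a final answer.
class MathSolution(BaseModel):
    steps: list[str]
    final_answer: str

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528-tput",
    messages=[
        {
            "role": "system",
            "content": "Solve the problem step by step. Only respond in JSON matching the provided schema.",
        },
        {"role": "user", "content": "What is 12 * (7 + 5)?"},
    ],
    response_format={
        "type": "json_object",
        "schema": MathSolution.model_json_schema(),
    },
)

print(response.choices[0].message.content)
# Expected: a JSON string matching MathSolution, e.g.
# {"steps": ["7 + 5 = 12", "12 * 12 = 144"], "final_answer": "144"}
```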