Saturday, April 25, 2026

Qwen-Code: Ollama Ignores num_ctx from settings.json

If you're talking to Ollama through its OpenAI-compatible endpoint and setting num_ctx in your client's config, you may notice Ollama still loads the model with only 32k context. The failure is silent: no errors, just an ignored parameter.
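For illustration, a client config along these lines is the kind of thing that gets dropped (this is a generic shape, not Qwen-Code's exact schema):

{
  "baseUrl": "http://localhost:11434/v1",
  "model": "qwen3.6:27b",
  "num_ctx": 262144
}

The model field maps onto a standard OpenAI parameter and makes it through; num_ctx has no OpenAI equivalent, and that's where the trouble starts.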

Why It Happens

Ollama's /v1 OpenAI-compatible layer only understands standard OpenAI parameters. Native Ollama options like num_ctx are honored only by the native /api/generate and /api/chat endpoints, where they must be nested inside an options: {} object in the request body. OpenAI-compatible clients speak neither that endpoint nor that shape, so the parameter is quietly dropped and Ollama falls back to the model's default context size.
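For comparison, this is what a request Ollama actually honors looks like against the native API (assuming the default port, 11434):

curl http://localhost:11434/api/generate -d '{
  "model": "qwen3.6:27b",
  "prompt": "Hello",
  "options": { "num_ctx": 262144 }
}'

Send that same options object to /v1/chat/completions and it vanishes without a trace.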

The Fix

Bake num_ctx directly into the model via a custom Modelfile. This sidesteps the client entirely.

1. Export the existing Modelfile:

ollama show qwen3.6:27b --modelfile > qwen_custom.Modelfile

2. Add or update the num_ctx parameter in the file:

PARAMETER num_ctx 262144
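After the edit, the top of qwen_custom.Modelfile should look roughly like this (the exported file will also carry the template, license, and any other baked-in parameters; leave those alone):

FROM qwen3.6:27b
PARAMETER num_ctx 262144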

3. Create a new model from it:

ollama create qwen3.6-256k -f qwen_custom.Modelfile
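You can confirm the parameter took before touching the client:

ollama show qwen3.6-256k --modelfile | grep num_ctx

This should print PARAMETER num_ctx 262144.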

4. Update your client config to point to the new model name, then verify:

ollama ps

The context size shown in ollama ps should now reflect 262144 instead of 32768.
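On recent Ollama builds the listing includes a CONTEXT column, so a healthy result looks something like this (ID, size, and expiry will differ on your machine):

NAME                   ID              SIZE     PROCESSOR    CONTEXT    UNTIL
qwen3.6-256k:latest    3f1a2b4c5d6e    24 GB    100% GPU     262144     4 minutes from now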