Getting access to the full 1 million token context window in Codex GPT-5.4

GPT-5.4 supports a million-token context window, but Codex launches it with a 272,000-token window.

Codex doesn't ask GPT-5.4 how many tokens it can handle. It looks up the answer in a model catalog -- a JSON file that ships with Codex listing every available model alongside its capabilities: name, context window, supported features. The default catalog sets GPT-5.4's context_window to 272,000. As far as Codex is concerned, that's the ceiling.

Codex profiles let you create named configurations -- a different model, different sandbox rules, different settings -- and switch between them with --profile name. Each profile lives in its own file at ~/.codex/<name>.config.toml, using the same top-level keys as ~/.codex/config.toml. Anything you set in a profile file overrides the matching value in config.toml for sessions launched under that profile.

That includes model_context_window.

The profile that unlocks 1M

Create ~/.codex/gpt54_1m.config.toml:

model = "gpt-5.4"
model_context_window = 1000000
model_auto_compact_token_limit = 900000

Three lines. model_context_window tells Codex to use a 1M-token window for sessions launched under this profile. model_auto_compact_token_limit triggers conversation compaction at 900K, leaving headroom before the effective ceiling.
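
If you want to try other window sizes, the profile can be generated instead of typed. This is a minimal sketch, not anything Codex ships: write_codex_profile is a hypothetical helper, and the 90% compaction ratio is only my assumption, copied from the 900000 / 1000000 split above.

```shell
# Hypothetical helper: write a Codex profile file for a given window size.
# $1 = profile name, $2 = context window in tokens, $3 = target dir (default ~/.codex)
write_codex_profile() {
  name=$1
  window=$2
  dir=${3:-$HOME/.codex}
  # Assumption: compact at 90% of the window, mirroring 900000 / 1000000.
  compact_limit=$(( window * 90 / 100 ))
  mkdir -p "$dir"
  cat > "$dir/$name.config.toml" <<EOF
model = "gpt-5.4"
model_context_window = $window
model_auto_compact_token_limit = $compact_limit
EOF
}
```

Calling write_codex_profile gpt54_1m 1000000 reproduces the three-line file shown above.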

Launch with:

codex --profile gpt54_1m

Verifying it works

Inside the Codex TUI, /status shows the context window size. For a definitive check, inspect the rollout JSONL file from your session (the find -printf flag and the long options used here assume GNU findutils and coreutils):

latest_rollout=$(
  find ~/.codex/sessions -name 'rollout-*.jsonl' -printf '%T@ %p\n' \
    | sort --numeric-sort \
    | tail --lines=1 \
    | cut --delimiter=' ' --fields=2-
)

jq -r '
  select(.type == "event_msg" and .payload.type == "task_started")
  | .payload.model_context_window
' "$latest_rollout"

This returns 950000. GPT-5.4 applies a 95% effective context window internally, so the runtime ceiling is 950,000 of the configured 1,000,000.
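
The arithmetic is worth spelling out, because it shows how much room is left between the compaction trigger and the runtime ceiling. A small sketch; effective_window is a hypothetical helper name, and the 95% figure is the one quoted above.

```shell
# Hypothetical helper: effective window = configured window * percent / 100.
# $1 = configured window in tokens, $2 = percent (defaults to the 95% quoted above)
effective_window() {
  echo $(( $1 * ${2:-95} / 100 ))
}

configured=1000000     # model_context_window from the profile
compact_limit=900000   # model_auto_compact_token_limit from the profile
effective=$(effective_window "$configured")

echo "effective ceiling: $effective"
echo "headroom after compaction trigger: $(( effective - compact_limit ))"
```

With the profile values, this prints an effective ceiling of 950000 and a headroom of 50000 tokens, so compaction at 900K fires before the runtime ceiling is reached.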

To prove the window is actually usable end-to-end -- not just declared in metadata -- pipe several hundred thousand tokens of input to a single-turn codex exec and check that the rollout records the full input token count:

codex exec --model gpt-5.4 -c 'model_context_window=1000000' \
  --skip-git-repo-check 'Reply with only the literal word RECEIVED.' \
  < big_input_file

On my Plus account, this accepted a single turn with 591,580 input tokens -- well above the 272,000 default ceiling.
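
If you don't have a large file handy, filler text is enough for this kind of test. The sketch below is an assumption-heavy convenience, not the input I used: make_big_input is a hypothetical helper, and the default of 4 characters per token is a rough rule of thumb for English text, not a tokenizer.

```shell
# Hypothetical helper: write roughly $2 tokens of filler text to file $1.
# $3 = assumed characters per token (rough heuristic, defaults to 4)
make_big_input() {
  out=$1
  tokens=$2
  chars_per_token=${3:-4}
  # Repeat one sentence and cut the stream at the estimated byte count.
  yes 'The quick brown fox jumps over the lazy dog.' \
    | head -c "$(( tokens * chars_per_token ))" > "$out"
}
```

For example, make_big_input big_input_file 600000 writes about 2.4 MB. The real token count can differ noticeably from the estimate, so trust what the rollout records rather than the file size.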

Why this works now (and didn't before)

Until Codex 0.134.0, profiles lived inside ~/.codex/config.toml as [profiles.X] tables -- and that table parser silently dropped model_context_window (openai/codex#14456). Setting the key inside [profiles.X] did nothing. The 0.134.0 release moved profiles into their own files at ~/.codex/<name>.config.toml, using top-level keys where the override was never broken. The bug fix was incidental to the file-shape change.
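
If you're carrying an older config forward, it can help to check whether the key is still sitting inside a [profiles.X] table. This is a line-based heuristic, not a TOML parser, and legacy_profile_window_keys is a hypothetical helper name.

```shell
# Hypothetical helper: list model_context_window entries that sit inside
# [profiles.X] tables in the given config file.
# $1 = path to config.toml
legacy_profile_window_keys() {
  awk '
    # On every table header, remember it and note whether it is a profiles table.
    /^[ \t]*\[/ { in_profile = ($0 ~ /^[ \t]*\[profiles\./); section = $0 }
    # Report the key only while inside a [profiles.X] table.
    in_profile && /^[ \t]*model_context_window[ \t]*=/ { print section ": " $0 }
  ' "$1"
}
```

Running legacy_profile_window_keys ~/.codex/config.toml prints nothing when no such entries exist. Any line it does print is a candidate to move into its own ~/.codex/<name>.config.toml file.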

What about GPT-5.5?

GPT-5.5 has the same shape of gap: OpenAI's API docs document a 1,050,000-token context window, but Codex's catalog still ships it at 272,000.

The same simple override does not work for GPT-5.5 on the Plus tier (internally labeled prolite in Codex's rate-limit telemetry). I verified this by stress-testing both models on the same account, back to back:

  • GPT-5.4 with model_context_window = 1000000: a single-turn session accepted 591,580 input tokens.
  • GPT-5.5 with an analogous setup: real working sessions hit "Codex ran out of room in the model's context window" three times in a row, each time when a single turn's input crossed ~270,000 tokens. Codex's /status reported a 998K window throughout; the OpenAI API rejected the requests anyway.

So the override worked at the Codex layer for both models, but only for GPT-5.4 did the API actually honor it on Plus. The override convinces Codex it has more headroom; the server still enforces a lower ceiling for GPT-5.5, at least on this tier.

I haven't tested this on Pro / Enterprise / API access, so I can't say whether the limit relaxes there. If you're on a higher tier and want to find out, the test is: stress-test a GPT-5.5 1M profile with several hundred thousand tokens of single-turn input and read the rollout's last_token_usage.input_tokens. If it succeeds, the gate is plan-tier; if it fails the same way, the cap is universal on Codex.
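
Reading that number out of the rollout can be scripted. The sketch below assumes only that some object in the rollout carries a last_token_usage.input_tokens field. I'm not asserting where it is nested, so the jq filter searches recursively. max_input_tokens is a hypothetical helper name.

```shell
# Hypothetical helper: print the largest last_token_usage.input_tokens value
# found anywhere in a rollout JSONL file, or 0 if none is present.
# $1 = path to a rollout-*.jsonl file
max_input_tokens() {
  jq -s '
    [ .[] | .. | objects
      | select(has("last_token_usage"))
      | .last_token_usage.input_tokens // empty ]
    | max // 0
  ' "$1"
}
```

A result well above 272,000 means the server accepted a turn beyond the default ceiling. A result stuck near 270,000 alongside the "ran out of room" error suggests the same cap I hit on Plus.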