Getting access to the full 1 million token context window in Codex GPT-5.4
GPT-5.4 supports a million-token context window, but Codex launches it with a 272,000-token window.
Codex doesn't ask GPT-5.4 how many tokens it can handle. It looks up the answer in a model catalog -- a JSON file that ships with Codex listing every available model alongside its capabilities: name, context window, supported features. The default catalog sets GPT-5.4's context_window to 272,000. As far as Codex is concerned, that's the ceiling.
Codex profiles let you create named configurations -- a different model, different sandbox rules, different settings -- and switch between them with --profile name. Each profile lives in its own file at ~/.codex/<name>.config.toml, using the same top-level keys as ~/.codex/config.toml. Anything you set in a profile file overrides the matching value in config.toml for sessions launched under that profile.
That includes model_context_window.
The profile that unlocks 1M
Create ~/.codex/gpt54_1m.config.toml:
model = "gpt-5.4"
model_context_window = 1000000
model_auto_compact_token_limit = 900000
Three lines. model_context_window tells Codex to use a 1M-token window for sessions launched under this profile. model_auto_compact_token_limit triggers conversation compaction at 900K, leaving headroom before the effective ceiling.
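A minimal sketch of creating that file from the shell. The function name write_gpt54_1m_profile and its optional directory argument are hypothetical, added only so the target directory can be overridden; the file name and the three keys are the ones given above.

```shell
# Hypothetical helper: writes the three-line profile described above.
# The optional first argument (default: ~/.codex) is an assumption for illustration.
write_gpt54_1m_profile() {
  codex_dir="${1:-$HOME/.codex}"
  mkdir -p "$codex_dir" || return 1
  # Quoted heredoc delimiter: the TOML is written verbatim, with no shell expansion.
  cat > "$codex_dir/gpt54_1m.config.toml" <<'EOF'
model = "gpt-5.4"
model_context_window = 1000000
model_auto_compact_token_limit = 900000
EOF
}
```

Called with no argument, the function writes to ~/.codex and overwrites any existing file of that name, so back up a profile you have already customized.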
Launch with:
codex --profile gpt54_1m
Verifying it works
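Inside the Codex TUI, /status shows the context window size. For a definitive check, inspect the rollout JSONL file from your session (the snippet below relies on GNU options such as find -printf, so on macOS it needs the GNU versions of these tools):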
latest_rollout=$(
find ~/.codex/sessions -name 'rollout-*.jsonl' -printf '%T@ %p\n' \
| sort --numeric-sort \
| tail --lines=1 \
| cut --delimiter=' ' --fields=2-
)
jq -r '
select(.type == "event_msg" and .payload.type == "task_started")
| .payload.model_context_window
' "$latest_rollout"
This prints 950000. GPT-5.4 applies a 95% effective context window internally, so the runtime ceiling is 950,000 of the configured 1,000,000.
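The arithmetic behind those numbers, as a small sketch. The function name effective_window is hypothetical, and the 95% factor is simply the figure reported above for GPT-5.4; it is not assumed to hold for other models.

```shell
# Effective window = configured window * 95 / 100 (integer arithmetic).
# The 95% factor is the figure reported above for GPT-5.4; other models may differ.
effective_window() {
  echo $(( $1 * 95 / 100 ))
}

configured=1000000      # model_context_window from the profile
compact_limit=900000    # model_auto_compact_token_limit from the profile
effective=$(effective_window "$configured")

echo "effective window: $effective"
echo "headroom above auto-compact limit: $(( effective - compact_limit ))"
# prints:
#   effective window: 950000
#   headroom above auto-compact limit: 50000
```

With these values, compaction at 900K starts 50,000 tokens below the effective ceiling.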
To prove the window is actually usable end-to-end -- not just declared in metadata -- pipe several hundred thousand tokens of input to a single-turn codex exec and check that the rollout records the full input token count:
codex exec --model gpt-5.4 -c 'model_context_window=1000000' \
--skip-git-repo-check 'Reply with only the literal word RECEIVED.' \
< big_input_file
On my Plus account, this accepted a single turn with 591,580 input tokens -- well above the 272,000 default ceiling.
Why this works now (and didn't before)
Until Codex 0.134.0, profiles lived inside ~/.codex/config.toml as [profiles.X] tables -- and that table parser silently dropped model_context_window (openai/codex#14456). Setting the key inside [profiles.X] did nothing. The 0.134.0 release moved profiles into their own files at ~/.codex/<name>.config.toml, using top-level keys where the override was never broken. The bug fix was incidental to the file-shape change.
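If you are migrating from the old layout, a quick way to spot leftover [profiles.X] tables is to grep for their headers. This is a rough sketch: the function name find_legacy_profiles is hypothetical, and the pattern only matches table headers that start a line, not inline tables.

```shell
# Hypothetical helper: list legacy [profiles.NAME] table headers in a config file.
# Prints matching lines with line numbers; returns non-zero if none are found.
find_legacy_profiles() {
  config_file="${1:-$HOME/.codex/config.toml}"
  grep -n -E '^[[:space:]]*\[profiles\.[^]]+\]' "$config_file"
}
```

Any model_context_window under a table it reports is a candidate for moving into a per-profile file such as ~/.codex/<name>.config.toml.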
What about GPT-5.5?
GPT-5.5 has the same shape of gap: OpenAI's API docs list a 1,050,000-token context window, but Codex's catalog still ships it at 272,000.
The same simple override does not work for GPT-5.5 on the Plus tier (internally labeled prolite in Codex's rate-limit telemetry). I verified this by stress-testing both models on the same account, back to back:
- GPT-5.4 with model_context_window = 1000000: a single-turn session accepted 591,580 input tokens.
- GPT-5.5 with an analogous setup: real working sessions hit "Codex ran out of room in the model's context window" three times in a row, each time when a single turn's input crossed ~270,000 tokens. Codex's /status reported a 998K window throughout; the OpenAI API rejected the requests anyway.
So the client-side override took effect for both models, but on Plus the server honored it only for GPT-5.4. The override convinces Codex it has more headroom; the server still enforces a lower ceiling for GPT-5.5.
I haven't tested this on Pro / Enterprise / API access, so I can't say whether the limit relaxes there. If you're on a higher tier and want to find out, the test is: stress-test a GPT-5.5 1M profile with several hundred thousand tokens of single-turn input and read the rollout's last_token_usage.input_tokens. If it succeeds, the gate is plan-tier; if it fails the same way, the cap is universal on Codex.
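Reading that number out of a rollout can be sketched with jq. The function name max_input_tokens is hypothetical, and the exact position of last_token_usage inside each JSONL record is not confirmed here, so the filter searches every nested object rather than assuming a fixed path.

```shell
# Hypothetical helper: largest last_token_usage.input_tokens in a rollout file.
# Assumption: the field's exact position in each record is unknown, so the
# filter walks every nested object instead of using a fixed path.
max_input_tokens() {
  jq -r '
    .. | objects
    | select(has("last_token_usage"))
    | .last_token_usage | objects
    | .input_tokens | numbers
  ' "$1" | sort -n | tail -n 1
}
```

For example, max_input_tokens "$latest_rollout" prints the largest single-turn input the session recorded, or nothing if the field never appears. A value far above ~270,000 on a GPT-5.5 session would indicate the request was accepted.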