Unlocking Codex GPT-5.4's Full 1M-Token Context Window

Update (July 2026): as of Codex 0.144.4, this technique no longer works -- the model_context_window override is now ignored, and context ceilings are enforced server-side. See the full update at the end of this post.

GPT-5.4 supports a million-token context window, but Codex launches it at 272,000 tokens.

Codex doesn't ask GPT-5.4 how many tokens it can handle. It looks up the answer in a model catalog -- a JSON file that ships with Codex listing every available model alongside its capabilities: name, context window, supported features. The default catalog sets GPT-5.4's context_window to 272,000. As far as Codex is concerned, that's the ceiling.

Codex profiles let you create named configurations -- a different model, different sandbox rules, different settings -- and switch between them with --profile name. Each profile lives in its own file at ~/.codex/<name>.config.toml, using the same top-level keys as ~/.codex/config.toml. Anything you set in a profile file overrides the matching value in config.toml for sessions launched under that profile.

That includes model_context_window.

The profile that unlocks 1M

Create ~/.codex/gpt54_1m.config.toml:

model = "gpt-5.4"
model_context_window = 1000000
model_auto_compact_token_limit = 900000

Three lines. model_context_window tells Codex to use a 1M-token window for sessions launched under this profile. model_auto_compact_token_limit triggers conversation compaction at 900K, leaving headroom before the effective ceiling.

Launch with:

codex --profile gpt54_1m

Verifying it works

Inside the Codex TUI, /status shows the context window size. For a definitive check, inspect the rollout JSON from your session:

latest_rollout=$(
  find ~/.codex/sessions -name 'rollout-*.jsonl' -printf '%T@ %p\n' \
    | sort --numeric-sort \
    | tail --lines=1 \
    | cut --delimiter=' ' --fields=2-
)

jq -r '
  select(.type == "event_msg" and .payload.type == "task_started")
  | .payload.model_context_window
' "$latest_rollout"

This prints 950000. GPT-5.4 applies a 95% effective context window internally, so the runtime ceiling is 950,000 of the configured 1,000,000.
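The 95% figure is easy to check with integer arithmetic. A minimal sketch: the effective_window helper name is mine, and the 95% factor comes from the observations in this post, not from any Codex documentation.

```shell
# Hypothetical helper: effective window = 95% of the configured window.
# The 95% factor is an observation from this post, not a documented constant.
effective_window() {
  echo $(( $1 * 95 / 100 ))
}

effective_window 1000000   # window set by the 1M profile  -> 950000
effective_window 272000    # catalog default for GPT-5.4   -> 258400
```

If the number in your rollout matches neither value, the session is probably resolving its window from somewhere other than the profile or the default catalog.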

To prove the window is actually usable end-to-end -- not just declared in metadata -- pipe several hundred thousand tokens of input to a single-turn codex exec and check that the rollout records the full input as accepted:

codex exec --model gpt-5.4 -c 'model_context_window=1000000' \
  --skip-git-repo-check 'Reply with only the literal word RECEIVED.' \
  < big_input_file

On my Plus account, this accepted a single turn with 591,580 input tokens -- well above the 272,000 default ceiling.

Why this works now (and didn't before)

Until Codex 0.134.0, profiles lived inside ~/.codex/config.toml as [profiles.X] tables -- and that table parser silently dropped model_context_window (openai/codex#14456). Setting the key inside [profiles.X] did nothing. The 0.134.0 release moved profiles into their own files at ~/.codex/<name>.config.toml, using top-level keys where the override was never broken. The bug fix was incidental to the file-shape change.
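If you're not sure which shape your own setup uses, a quick grep will tell you. This is a sketch under assumptions: has_legacy_profiles is a name I made up, and it only detects table headers of the form [profiles.X]; it doesn't tell you whether your Codex version still reads them.

```shell
# Hypothetical helper: succeeds if the given config file still defines
# profiles as [profiles.<name>] tables (the pre-0.134.0 shape).
has_legacy_profiles() {
  grep -Eq '^[[:space:]]*\[profiles\.[^]]+\]' "$1"
}

config="$HOME/.codex/config.toml"
if [ -f "$config" ] && has_legacy_profiles "$config"; then
  echo "legacy [profiles.X] tables found in $config"
fi
```

Any model_context_window that lives under one of those tables is the one that was silently dropped; moving it to a top-level key in a per-profile file is the shape described above.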

What about GPT-5.5?

GPT-5.5 has the same shape of gap: OpenAI's API docs document a 1,050,000-token context window, but Codex's catalog still ships it at 272,000.

The same simple override does not work for GPT-5.5 on the Plus tier (internally labeled prolite in Codex's rate-limit telemetry). I verified this by stress-testing both models on the same account, back to back:

GPT-5.4 with model_context_window = 1000000: a single-turn session accepted 591,580 input tokens.
GPT-5.5 with an analogous setup: real working sessions hit "Codex ran out of room in the model's context window" three times in a row, each time when a single turn's input crossed ~270,000 tokens. Codex's /status reported a 998K window throughout; the OpenAI API rejected the requests anyway.

So the client-side layer, the override that tells Codex how big the window is, worked for both models, but on Plus the API only honored it for GPT-5.4. The override convinces Codex it has more headroom; the server still enforces the plan-tier ceiling for GPT-5.5.

I haven't tested this on Pro / Enterprise / API access, so I can't say whether the limit relaxes there. If you're on a higher tier and want to find out, the test is: stress-test a GPT-5.5 1M profile with several hundred thousand tokens of single-turn input and read the rollout's last_token_usage.input_tokens. If it succeeds, the gate is plan-tier; if it fails the same way, the cap is universal on Codex.
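To read that number without scrolling through the rollout by hand, something like the following should do. It's a sketch, not the exact command I ran: max_input_tokens is a made-up helper name, and because I'm not asserting where last_token_usage is nested inside each event, the filter searches every object recursively instead of hard-coding a path.

```shell
# Hypothetical helper: print the largest last_token_usage.input_tokens
# value found anywhere in a rollout .jsonl file. Prints "null" if the
# field never appears. The recursive search (..) avoids assuming where
# in each event the field is nested.
max_input_tokens() {
  jq -n '
    [ inputs
      | ..
      | objects
      | select(has("last_token_usage"))
      | .last_token_usage
      | objects
      | .input_tokens
      | numbers
    ] | max
  ' "$1"
}
```

Call it as max_input_tokens "$latest_rollout", reusing the variable from the verification step earlier. A result in the hundreds of thousands means the turn was accepted at that size.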

Update -- July 2026 (Codex 0.144.4): this technique no longer works

Since I wrote this, Codex changed how it resolves context windows, and the override above stopped doing anything. I re-tested the whole thing on the same subscription account against Codex 0.144.4, in an isolated config directory so nothing touched my live setup. Three things changed.

1. The `model_context_window` override is ignored

Whether you set it in a profile file (codex --profile gpt54_1m) or inline (codex exec -c 'model_context_window=1000000'), the key no longer changes anything. A session launched with the 1M profile still reports a 258,400-token window -- 95% of the 272,000 catalog default. The setting parses without error and has no effect.

2. The catalog now drives only the displayed window -- not the enforced one

Codex 0.144.4 reads the window from the model catalog (~/.codex/models_cache.json), and editing an entry's context_window does change the number Codex shows you. But that number is cosmetic. I patched GPT-5.5's catalog entry down to 100,000 -- Codex then displayed a 95,000-token window -- and a 132,606-token input still went through. I patched it up to 1,000,000 -- displayed 950,000 -- and inputs above ~272,000 tokens were still rejected with "ran out of room in the model's context window."

The real limit is enforced server-side, per model and plan tier, and no client-side edit moves it in either direction. (Codex also re-fetches the catalog from the server on a timer, silently overwriting your edit unless you make the file read-only.)
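For anyone who wants to reproduce the catalog experiment, making the file read-only is a one-liner each way. A minimal sketch: the helper names are mine, and I'm assuming the refresh writes to the file in place, which is what a read-only bit protects against.

```shell
# Hypothetical helpers: remove or restore write permission on a file.
make_read_only() {
  chmod a-w "$1"
}

make_writable() {
  chmod u+w "$1"
}

# Assumed usage, with the catalog path from this post:
#   make_read_only ~/.codex/models_cache.json
#   make_writable  ~/.codex/models_cache.json   # undo when finished
```

Remember to undo it afterwards, or Codex will keep running against a stale catalog.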

3. There's now a hard 1 MiB cap on a single input message

Independent of tokens, Codex rejects any single message over 1,048,576 characters with input_too_large, before it ever checks the context window. The original end-to-end test in this post -- piping a 591,580-token file to one codex exec -- can't even run today: that much text is well over 1 MiB.
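You can check a file against that cap before sending it. A sketch, assuming the cap really is counted in characters as the error suggests: exceeds_input_cap and INPUT_CAP are names I made up, and the check looks only at the piped file, not at the prompt text sent alongside it.

```shell
# Cap as observed in this post: 1,048,576 characters per input message.
INPUT_CAP=1048576

# Hypothetical helper: succeeds if the file's character count is over the cap.
# The arithmetic expansion strips any padding some wc implementations add.
exceeds_input_cap() {
  chars=$(( $(wc -m < "$1") ))
  [ "$chars" -gt "$INPUT_CAP" ]
}

# Assumed usage:
#   exceeds_input_cap big_input_file && echo "too large for one message"
```

wc -m counts characters in the current locale; for plain ASCII input that is the same as the byte count.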

What the windows actually are now

Measured by feeding single-turn inputs of increasing size and reading the accepted input_tokens straight from each session's rollout JSON:

GPT-5.5 -- accepts 268,606 tokens; the next step up is rejected. Usable ceiling ~272,000.
GPT-5.6 (it ships as three variants -- sol, terra, luna) -- sol accepted 356,472 tokens and rejected around 392K. Its usable window is clearly larger than 5.5's, but still short of 400K.
GPT-5.4 -- accepted 291,221 tokens (I didn't push it to its ceiling). It's the only model whose catalog entry still carries max_context_window: 1000000.

All three reported the same cosmetic 258,400-token window.
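The stepping procedure behind those numbers can be sketched as a small loop. This is an illustration, not the exact harness I used: largest_accepted, codex_probe, and corpus.txt are all hypothetical names, and treating a non-zero exit from codex exec as "rejected" is an assumption. Reading input_tokens from the rollout afterwards is the more reliable signal.

```shell
# Hypothetical helper: run a probe command at each size, in ascending
# order, and print the largest size that succeeded before the first
# failure (0 if even the first size failed). The probe receives the size
# as its only argument and signals acceptance through its exit status.
largest_accepted() {
  probe=$1
  shift
  best=0
  for size in "$@"; do
    if "$probe" "$size"; then
      best=$size
    else
      break
    fi
  done
  echo "$best"
}

# Example probe (an assumption): send the first N bytes of corpus.txt to a
# single-turn codex exec and rely on its exit status.
codex_probe() {
  head -c "$1" corpus.txt \
    | codex exec --model gpt-5.5 --skip-git-repo-check \
        'Reply with only the literal word RECEIVED.' > /dev/null 2>&1
}

# Assumed usage, sizes in bytes:
#   largest_accepted codex_probe 400000 600000 800000 1000000
```

The sizes here are bytes, not tokens, so the result only brackets the ceiling; the token count for each accepted step still has to come from the rollout.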

The practical upshot inverts the original premise. You can no longer unlock a bigger window by editing config -- the number you can change is the one that doesn't matter. But for the newer models you don't need to: GPT-5.6 already grants more usable context than GPT-5.5 by default. GPT-5.5 is the outlier, capped lowest of the three -- the same regression subscription users flagged in openai/codex#19464, where 5.5 gives less long-context headroom than 5.4 did.

As before, this is one ChatGPT-subscription account. On Pro / Enterprise / API the server-enforced ceilings may well be higher -- I haven't tested them.