Is HTML really "strictly better" than Markdown for Claude Code? I ran the numbers.

Thariq Shihipar of Claude Code team fame posted an article last week, "The Unreasonable Effectiveness of HTML", arguing that Claude should output HTML files by default for basically everything. PR reviews, postmortems, status reports, the lot. It's a good piece, racked up 8.3M views, and most of it I agree with. But in the replies, when somebody pushed back to say markdown's fine for text-heavy stuff, Thariq held the line: "idk I think HTML is strictly better for all of that too." So I ran it all. Three of his use cases, instrumented end to end on Opus 4.7. What follows is what I measured, what surprised me, and where I ended up agreeing and disagreeing. Short version: he's right about half of it, and that half is very right. The other half I don't think holds up, and the data here is why.

TL;DR

Thariq's piece is good. Read it before you read this.

  • Visual tasks (design exploration, prototypes, drag-and-drop widgets): HTML wins outright. Markdown can describe a design; HTML can show one. The 2-4x token cost is the price of admission, not a tradeoff. Stop framing it as one.
  • Text-with-structure tasks (PR reviews, technical explainers): both formats produce a real artifact. HTML adds polish for about 1.4-1.5x the dollar cost. In one of my two text cases (the rate-limiter explainer) the markdown surfaced more gotchas (12 vs 8) at lower cost. Which cuts against "strictly better."
  • Does Claude read HTML better than markdown? (the question Thariq's argument quietly assumes an HTML-favorable answer to): two blinded judges scored Claude 7/7 on factual accuracy from both formats, identically. On explanatory specificity HTML got a ~7% edge, at 33% higher cost per call. Pick your poison.
  • The "1M context handles it" hand-wave: at N=50 artifacts re-ingested K=3 times with cache evictions, HTML's overhead eats roughly the whole 1M window. Not nothing.

Sample is small (N=1 per condition for content quality, three instrumented cases) and the methodology I used actually biases against markdown. Even with the thumb on the scale, "strictly better" isn't what I saw.

What Thariq actually said

He walks through five buckets of tasks where he reckons HTML should be the default. Roughly:

  1. Specs, planning, exploration (e.g. design exploration)
  2. Code review & understanding (e.g. PR review with annotated diff)
  3. Design & prototypes (e.g. an interactive checkout button)
  4. Reports, research, learning (e.g. a rate limiter explainer)
  5. Custom editing interfaces (e.g. drag-and-drop Linear ticket reorderer)

He flags three costs honestly: more tokens, slower generation, noisier diffs. He concludes the advantages outweigh them. Reasonable people can disagree on the conclusion.

What I want to look at is that reply. "Strictly better" is a specific, testable claim, and it's the one the rest of this post examines.

There are actually two questions here

The article treats format choice as a knob: pay a bit extra in tokens, get nicer output, decide if it's worth it. That framing works for half the use cases and quietly stops working for the other half. Here's the split.

Sometimes HTML can do something markdown structurally can't. Render a six-mockup grid of UI designs. Run a slider that updates a preview in real time. Drag a card across a kanban. Markdown isn't a worse version of these; there is no markdown version. Asking "is the extra cost worth it?" is like asking whether the airfare to Mars is good value when the alternative is staying home and looking at a globe.

Other times both formats can do the job, and HTML just costs more. PR reviews. Postmortems. Technical explainers. Implementation plans. Markdown handles all of them. HTML renders them more attractively. The substance lives in either.

The "use HTML for everything" framing treats both questions like the second one. The case for HTML gets stronger, not weaker, when you split them apart.

How I ran this

I picked three of Thariq's five use cases and instrumented them properly. Same prompt, both formats, capture every token. The other two are the categorical ones from the previous section; there's nothing fair to compare them against, so I just demo them at the end.

The three I instrumented:

  1. Design exploration: onboarding screen, six approaches in a grid (his specs/planning bucket)
  2. PR review of a real commit: toks commit c6d70f9, "Respect .gitignore when target has no .git directory"
  3. Rate limiter explainer over the slowapi source: about 12K tokens of Python as substrate

The two I just demo:

  4. Checkout button prototype (interactive sliders, animation, copy-as-prompt)
  5. Linear ticket reorderer (drag-and-drop kanban with copy-as-markdown export)

For each instrumented case I ran three probes: generate the artifact, feed it back into a fresh agent session as context for a downstream task, and apply a semantic edit while capturing the diff. The first probe is the obvious one. The other two matter because that's where the cumulative token costs actually bite. Re-ingestion and editing happen constantly in real Claude Code workflows.
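
For concreteness, the re-ingestion probe is nothing more than splicing the saved artifact into a fresh non-interactive session. A minimal sketch, using the same CLI invocation I describe at the end of the post; the file name and the downstream question are placeholders, not the exact ones from my runs:

```python
# Minimal sketch of the re-ingestion probe: feed a previously generated
# artifact back into a fresh non-interactive Claude session as context for a
# follow-up task. The path and downstream question are illustrative placeholders.
import subprocess
from pathlib import Path

def reingest(artifact_path: str, downstream_task: str, model: str = "claude-opus-4-7") -> str:
    artifact = Path(artifact_path).read_text()
    prompt = (
        "Here is an artifact from an earlier session:\n\n"
        f"{artifact}\n\n"
        f"Using it as context, {downstream_task}"
    )
    # Fresh session: nothing from the generation run is cached here.
    result = subprocess.run(
        ["claude", "--print", "--model", model, "--", prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

print(reingest("pr_review.html", "summarize the two non-blocking findings."))
```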

On the design exploration case I ran two methodologies side by side: Thariq's prompt kept as close to verbatim as possible (only the format word swapped for the markdown run) versus a neutrally-phrased version of the same task. Why? Because his prompt bakes the format into the task: "lay them out as a single HTML file in a grid." If all you do is an "HTML" → "markdown" word-swap, you're asking Claude to make a markdown file in the shape of an HTML file, which is not the same as asking for markdown. The difference between the two methodologies turned out to be larger than I expected: a 3.6x artifact-token ratio under the word-swap ablation, 2.1x under neutral phrasing. Worth being honest about.

What "artifact tokens" means below. I ran toks <file> --for claude on each generated artifact, using Anthropic's own tokenizer to count the file the way Claude would see it if fed back as context. That number is different from usage.output_tokens (what the API reported as the generation length) and from cache_creation_input_tokens (artifact + system prompt + user message, all of it). The first matters for re-ingestion cost. The other two matter for other things.

Where this is honestly a bit thin:

  • My reader-side reactions are N=1. One user (me), one session, I knew which file was which. No blinding, no randomization. Take them as gut reactions, not data.
  • Generation runs are also N=1 per condition. Content-quality findings like "markdown surfaced more gotchas" might be variance, not signal. K=5 per condition would tell us. I didn't run K=5.
  • Three instrumented cases is a sample, not a census.
  • The agent-as-reader test below covers shared-content questions only, i.e. facts present in both artifacts. I didn't test whether HTML's verbose-priming or callout-salience changes output quality (beyond binary accuracy) in ways that matter downstream.
  • I didn't test "colleagues actually read HTML more." That'd need users I don't have.
  • Methodology asymmetry is load-bearing, and I'm flagging it here so the reader knows the thumb is on the scale before the data tables start. I ran two methodologies only on the design exploration case. The PR review and rate-limiter cases used Thariq's verbatim prompt structure, which is HTML-affording ("create an HTML artifact..."). That biases against the markdown artifact compared to a neutrally-phrased prompt. The cost ratios I report for those cases (1.51x for PR review, 1.44x for rate limiter) would likely shrink under a neutral prompt; how much, I didn't measure.

Design exploration: this is where HTML earns its keep

Thariq's prompt (verbatim): "I'm not sure what direction to take the onboarding screen. Generate 6 distinctly different approaches, varying layout, tone, and density, and lay them out as a single HTML file in a grid so I can compare them side by side. Label each with the tradeoff it's making."

Cost data (verbatim methodology):

Format   Artifact tokens   Output tokens   Generation time   Cost
HTML     6,877             10,135          117 s             $0.51
MD       1,896             3,284           49 s              $0.27

Cost data (format-neutral methodology):

Format   Artifact tokens   Output tokens   Generation time   Cost
HTML     6,871             9,161           102 s             $0.48
MD       3,332             4,769           73 s              $0.34

Reading both artifacts side by side, my reaction was unambiguous:

"HTML is massively nicer to read in these cases versus the markdown. It's much better ... it's not speed that makes the entire difference. It's quality and detail of visual output. This question is fundamentally visual and HTML allows for near-perfect visual communication of the ideas vs text-only in markdown."

The HTML rendered six high-fidelity onboarding mockups in iframes inside a responsive grid. The markdown produced a six-section text document that described what each mockup would look like.

The HTML artifact, rendered above. Six approaches, side by side. This is the thing markdown structurally can't do.

Three things worth noting from this case:

  • Markdown leaks HTML, but only when the prompt corners it. When I ran the word-swapped version of Thariq's prompt (which still asks for a "single markdown file in a grid"), the model embedded 10 HTML tags into the "markdown" output (<table>, <tr>, <td>). Of course it did; there's no markdown way to lay out a grid. When I dropped "in a grid" from the prompt, the markdown came out clean. Zero HTML tags. So if you've ever wondered why your markdown agents sometimes spit out HTML, here's one answer: you asked them to do something markdown can't do.
  • Methodology shifts the ratio. Verbatim ablation: 3.63x artifact tokens; format-neutral: 2.06x. Single-number ratios reported without methodology disclosure should be read skeptically.
  • Generation time partially supports Thariq's "2-4x slower" admission. Measured 1.4-2.4x depending on methodology: the verbatim run landed at the bottom of his range, the neutral run came in under it, and I never saw anything close to the 4x top.

The call: HTML wins outright. The token cost isn't a tradeoff, it's just what the capability costs.

PR review: a fair fight, and HTML costs about 50% more for the polish

Methodology note: this case used verbatim ablation only (the format-neutral methodology was only run on the design exploration). The verbatim prompt structure ("create an HTML artifact...") is HTML-affording and biases against markdown. The 1.51x cost ratio reported here would likely shrink under a neutrally-phrased prompt; I didn't measure by how much.

Thariq's prompt (lightly adapted for my substrate): "Help me review this PR by creating an HTML artifact that describes it. I'm not familiar with the gitignore/path-resolution logic so brace on that. Render the actual diff with inline margin annotations, color-code findings by severity..."

Substrate: toks commit c6d70f9, a real bug fix (~30 lines of code change plus a new test).

Cost data:

Format   Artifact tokens   Output tokens   Generation time   Cost
HTML     11,223            19,308          234 s             $0.82
MD       3,810             10,437          154 s             $0.54

Ratios: 2.95x artifact tokens, 1.85x output tokens, 1.52x time, 1.51x cost.

The markdown PR review is, frankly, a real PR review. TL;DR verdict at the top ("Approve with two non-blocking notes"), a severity legend in a markdown table, a structured walkthrough of the gitignore logic, line-by-line diff annotations, a test-coverage assessment. If a teammate sent me this in a Slack DM I'd be perfectly happy.

HTML adds a color-coded verdict bar, rendered diffs with green/red highlighting and line numbers, eight severity-tagged finding cards (Pass / Concern / Nit), and inline severity callouts on the diff. Prettier. More navigable. The substance is the same.

The HTML PR review. Verdict bar, severity-tagged findings, annotated diff. Polished. But the markdown version below has the same substance.

The same review, in markdown. Same content. Roughly 30% cheaper. The cost of "looks nicer" is a real number, and that's it.

I also asked Claude to make a semantic edit to each artifact (add a "Test coverage assessment" section) and looked at the resulting diffs. HTML's was 5,658 bytes / 93 lines; markdown's was 4,618 bytes / 55 lines. HTML's structural overhead (1.2x the bytes of markdown for the same logical edit) is real but modest, because both formats already had a fair bit of structure to begin with.
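
Measuring that is nothing exotic. Here's one way to reproduce the byte/line counts, sketched with Python's difflib; file names are placeholders, and a real diff tool will differ slightly in headers and context lines:

```python
# Measure the size of a semantic edit: unified diff between the original
# artifact and the edited one, counted in bytes and lines. Paths are placeholders.
import difflib
from pathlib import Path

def diff_size(before_path: str, after_path: str) -> tuple[int, int]:
    before = Path(before_path).read_text().splitlines(keepends=True)
    after = Path(after_path).read_text().splitlines(keepends=True)
    diff_lines = list(difflib.unified_diff(
        before, after, fromfile=before_path, tofile=after_path))
    diff_text = "".join(diff_lines)
    return len(diff_text.encode("utf-8")), len(diff_lines)

for fmt in ("html", "md"):
    nbytes, nlines = diff_size(f"pr_review.{fmt}", f"pr_review_edited.{fmt}")
    print(f"{fmt}: {nbytes} bytes over {nlines} lines")
```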

So: markdown does the job. HTML is prettier and costs about 50% more. Worth it for a review template you'll reuse across the team and reference repeatedly; probably not for a one-off you'll glance at and close.

Rate limiter explainer: where the markdown actually won on substance

Methodology note: same as above, verbatim ablation only. The 1.44x cost ratio is from an HTML-affording prompt and would likely be smaller under a neutral one.

Thariq's prompt (lightly adapted): "I don't understand how slowapi's rate limiter actually works. Read the relevant code and produce a single HTML explainer page: a diagram of the rate-limiting flow, the 3-4 key code snippets annotated, and a 'gotchas' section..."

Substrate: slowapi core (extension.py, middleware.py, wrappers.py), about 12K tokens, ~1,200 lines of Python.

Cost data:

Format   Artifact tokens   Output tokens   Generation time   Cost
HTML     8,700             14,713          191 s             $0.79
MD       3,629             7,915           118 s             $0.55

Ratios: 2.40x artifact tokens, 1.86x output tokens, 1.62x time, 1.44x cost.

Both artifacts are real explainers. The bit I wasn't expecting: both used Mermaid for the flow diagram. The markdown one wraps it in a ```mermaid fence; the HTML one loads the Mermaid CDN and embeds the same chart definition. GitHub renders the markdown version as a real diagram. So does VS Code. The "HTML can show diagrams and markdown can't" intuition that does some quiet work in Thariq's argument is mostly gone for technical writing in 2026.
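
To make the "same chart, two wrappers" point concrete, here's the shape of the two embeddings. The chart is a stand-in, not the actual slowapi flow, and the CDN snippet is one common Mermaid pattern rather than the exact markup Claude emitted:

```python
# The same Mermaid definition, packaged for markdown vs HTML.
# The chart below is a stand-in, not the real slowapi flow diagram.
chart = """flowchart TD
    A[Request] --> B{Limit exceeded?}
    B -- no --> C[Handler runs]
    B -- yes --> D[429 response]"""

# Markdown: a fenced block. GitHub and VS Code render this as a real diagram.
md_version = f"```mermaid\n{chart}\n```"

# HTML: load Mermaid from a CDN and drop the same definition in a tagged element.
html_version = f"""<script type="module">
  import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs";
  mermaid.initialize({{ startOnLoad: true }});
</script>
<pre class="mermaid">
{chart}
</pre>"""
```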

The HTML rate-limiter explainer. Mermaid flowchart via CDN, callout boxes for the 8 gotchas, syntax-highlighted code via CSS classes. Looks the part.

The same explainer, in markdown. Twelve gotchas instead of eight. Mermaid flowchart in a fenced block, fenced code blocks letting the viewer handle syntax highlighting, no CSS chrome. Cheaper. And, to my eye, the better artifact.

The HTML version's distinctive features are CSS chrome: a stylesheet for syntax-highlighting tokens (<span class="kw">, <span class="str">), background panels with border-radius for code blocks, color-coded callout boxes for gotchas. The markdown version uses fenced code blocks which most modern viewers syntax-highlight automatically.

I counted the gotchas in both files. HTML had 8. Markdown had 12. The HTML compresses two findings into one "Two latent bugs" entry, so charitably call it 9-10. Even counting charitably, the markdown surfaces distinct gotchas the HTML never reaches: the empty-key_func bypass, the inspect.parameters signature sniffing, the request.state.view_rate_limit contract between checking and header injection. The HTML caught one the markdown missed (the key_style="url" vs "endpoint" distinction), but it isn't a wash. Markdown was the more substantive artifact.

That surprised me. Going in, I'd assumed the verbose-priming effect (HTML's tag-heavy output register leading Claude into more verbose responses) would push HTML toward more comprehensive answers. It didn't here. It pushed the other way.

Think of it like two technical writers given the same source code. One hands you a clean PDF with a cover page, syntax-highlighted snippets, and tasteful colored boxes around the warnings. The other gives you a longer markdown file with no styling, but he caught four more gotchas. Which one do you want before you ship to prod?

The edit probe threw me a small surprise. I asked Claude to add a "Custom storage backends" section to each artifact and looked at the diffs. HTML's diff was 2,471 bytes over 19 lines; markdown's was bigger at 2,987 bytes over the same 19 lines. Same line count, but markdown's paragraph-style prose has longer lines than HTML's tag-broken structure. The "HTML diffs are noisier" claim turns out not to generalize cleanly. When the content is mostly prose, markdown doesn't save you anything on diff size.

Verdict, with the obvious caveat that this is one sample on one substrate: the markdown was the better artifact. More comprehensive (12 gotchas vs effectively 9-10 once you count the HTML's "two latent bugs" as separate items), cheaper, with a real Mermaid diagram in the fenced block. The intuition that HTML wins because it carries richer information doesn't hold up in this case. Whether it'd hold up at K=5 generations I genuinely don't know, and I think that's the right answer to give. One sample is one sample. But the direction of the surprise is striking.

"The 1M context window handles it." Does it though?

Thariq's FAQ says: "With the 1M context window in Opus 4.7, the increased token usage is not really noticeable in the context window."

Fair point for one-shot generation. Not so fair for re-ingestion, which is what Claude Code actually does. Feed yesterday's artifact back in as context for today's follow-up turn, over and over. Let me show what that costs.

Cache_creation tokens (artifact-as-input-context) per re-ingestion across the three cases:

Use case             HTML     MD       Delta
Design exploration   27,309   21,657   +5,652
PR review            33,892   25,532   +8,360
Rate limiter expl.   32,280   26,379   +5,901

Run the math. A project produces N artifacts. Each gets re-fed into the agent K times. The cache evicts between feeds (i.e. calls more than ~5 minutes apart, where you pay the cache_creation cost fresh). The HTML-over-markdown overhead is roughly N × K × 6,000 tokens. At N=50, K=3 that's 900,000 tokens of cumulative cache_creation, roughly a full 1M context window's worth, spent on markup you wouldn't have if you'd used markdown. For tight call chains where the cache stays warm, subsequent reads hit cache_read_input_tokens (much cheaper, $1.50/M vs $15/M for fresh input), and the overhead amortizes to roughly the artifact's first-ingest cost.
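
The arithmetic, written out. The 6,000-token figure is the round number I'm using for the per-re-ingest delta, and the prices are the fresh-input and cache-read rates just mentioned:

```python
# Back-of-envelope for the cumulative re-ingestion overhead of HTML over markdown.
# Overhead per re-ingest is the round number used in the text; prices are per
# million tokens (fresh input vs cache read).
N_ARTIFACTS = 50                # artifacts a project produces
K_REINGESTS = 3                 # times each gets fed back in with a cold cache
OVERHEAD_PER_REINGEST = 6_000   # ~HTML-minus-markdown cache_creation tokens

FRESH_INPUT_PER_M = 15.00       # $/M tokens, cache-miss input
CACHE_READ_PER_M = 1.50         # $/M tokens, cache-hit input

overhead_tokens = N_ARTIFACTS * K_REINGESTS * OVERHEAD_PER_REINGEST
print(f"{overhead_tokens:,} extra cache_creation tokens")               # 900,000
print(f"vs a 1M context window: {overhead_tokens / 1_000_000:.0%}")     # 90%

# Dollar view: every re-ingest a cache miss vs every one a cache hit.
print(f"all cache misses: ${overhead_tokens / 1e6 * FRESH_INPUT_PER_M:.2f}")
print(f"all cache hits:   ${overhead_tokens / 1e6 * CACHE_READ_PER_M:.2f}")
```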

The dollar cost differential is muted by cache pricing for tight chains, so individual calls feel cheap. The context budget is where the cost most reliably shows up, because that's measured in tokens regardless of cache state.

This matters more if you're not on an enterprise plan with effectively unlimited tokens. Several commenters in Thariq's thread raised exactly that point: the "context budget isn't a concern" framing reads differently from inside Anthropic than it does for a developer paying retail. The math behind that pushback is real.

Does Claude read HTML better than markdown? (Spoiler: not really.)

Several commenters on Thariq's piece pushed back on the agent side of his argument. HTML "wastes the model's cognitive space on tags, nesting, closure, styles" (@AlexMares). It has "low signal-to-noise ratio" and risks hallucination (@AkhilDevelops). Markdown is just the right format for agents to read (@AiAGiAI). And Thariq's whole pitch quietly assumes the opposite: that the agent processes HTML at least as well as it processes markdown. Otherwise his "use HTML for everything, Claude reads it back later" workflow doesn't really hold up, does it?

So I tested it. Seven specific factual questions about slowapi's rate limiter, each answerable from both the HTML and the markdown version of the same explainer artifact (shared content, different format). I asked Claude (fresh session, Opus 4.7) to answer all seven questions from each format, then scored the answers against ground truth derived from the slowapi source.
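
The harness was not elaborate. Here's the shape of it; the questions, file names, and blinding labels are placeholders, not the real seven:

```python
# Sketch of the agent-as-reader probe: answer the same factual questions from
# each format of the explainer, then hand both answer sets to a judge with the
# format labels stripped. Questions and paths are illustrative placeholders.
import random
import subprocess
from pathlib import Path

MODEL = "claude-opus-4-7"
QUESTIONS = [
    "What does slowapi do when key_func returns an empty string?",
    "How many storage-backend checks does the middleware make before giving up?",
    # ...five more, each answerable from both artifacts
]

def ask_from(artifact_path: str) -> str:
    doc = Path(artifact_path).read_text()
    prompt = (
        "Answer the numbered questions using only the document below.\n\n"
        + "\n".join(f"{i + 1}. {q}" for i, q in enumerate(QUESTIONS))
        + "\n\n---\n\n" + doc
    )
    return subprocess.run(
        ["claude", "--print", "--model", MODEL, "--", prompt],
        capture_output=True, text=True, check=True,
    ).stdout

answers = {"html": ask_from("explainer.html"), "md": ask_from("explainer.md")}

# Blind the sets before judging: shuffle which format becomes "Set A" vs "Set B".
order = list(answers.items())
random.shuffle(order)
judge_prompt = (
    "Score each answer set against the ground truth for factual accuracy (x/7) "
    "and explanatory specificity (x/7).\n\n"
    + "\n\n".join(f"Set {label}:\n{text}" for label, (_, text) in zip("AB", order))
)
print(judge_prompt)  # fed to a separate judge session alongside the ground-truth key
```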

Here's what came back:

Metric                         HTML source   MD source   Ratio HTML/MD
Correct answers                7/7           7/7         equal
Input cache_creation tokens    30,213        24,312      1.24x
Output tokens generated        1,560         785         1.99x
Cost per call                  $0.246        $0.185      1.33x

Both formats produced equally accurate answers to all seven questions, confirmed by two independent judges scoring blinded (neither judge knew which set came from which format). Both judges gave both sets 7/7 on factual accuracy.

On a separate axis (explanatory specificity, the "clarity and learning value" of the answer) both judges scored HTML marginally higher. Judge 1 gave markdown 6.5/7 and HTML 7.0/7. Judge 2 gave markdown 6.0/7 and HTML 6.5/7. The HTML-sourced answers consistently cited more specific code constructs (e.g. all(not l.override_defaults for l in route_limits), MAX_BACKEND_CHECKS=5, inspect.iscoroutinefunction). The markdown-sourced answers were slightly tighter prose. Both judges independently described the same pattern.

So the strong commenter worry, that HTML "wastes the model's cognitive space" or "lowers signal-to-noise to the point of hallucination," didn't show up. If anything, the mild opposite did: HTML-sourced answers came back a hair more specific. Both blinded judges said so. But the price-quality math still doesn't flatter HTML: you pay 33% more per call for answers that score ~7% better on specificity and identically on accuracy. The verbose-priming story holds up at the qualitative level too. HTML doesn't spend its extra tokens on factual errors; it spends them on more specific code citations.

A few things to keep in mind before reading too much into this:

  • N=1 per condition. Single shot per format. Could be variance.
  • Tested only on shared content in a single artifact. I didn't try cases where the two formats actually differ in what they covered.
  • Didn't test whether answer quality beyond factual accuracy matters for downstream agent reasoning. Two answers can both be correct and still be differently useful for the agent's next turn.

What the test does is rule out the strong version of the agent-side worry. HTML doesn't degrade accuracy on shared content. It does cost more in both directions of the interaction for the same outcome.

The two I didn't bother instrumenting (because there's no markdown alternative)

Checkout button prototype. Thariq's prompt: "...Create a HTML file with several sliders and options for me to try different options or actions. Eventually, give me a copy button..." This is a working interactive widget. Markdown has no equivalent. The cost question is moot; there is no markdown alternative to compare against.

Linear ticket reorderer. Thariq's prompt: "Make me an HTML file with each ticket as a draggable card across New / Next / Later / Cut columns..." Drag-and-drop is an HTML+JavaScript capability. Markdown cannot do this.

Both of these are real use cases. They demonstrate the categorical-win bucket. The token cost is the price of admission, and there is no marginal-cost-vs-benefit question because there is no marginal alternative.

Going through Thariq's claims, one by one

  • Claim: HTML has higher information density than markdown. Verdict: true for visual content; false for text-only content, where markdown handles prose, tables, and code blocks well.
  • Claim: HTML is more readable past 100 lines. Verdict: true for visual artifacts (consistent with my own reading reaction); equivocal for text-with-structure, where markdown's structure works well at length.
  • Claim: HTML is easier to share than markdown. Verdict: partly true (raw browsers don't render .md files), but understated: GitHub renders markdown natively (and has rendered Mermaid as real diagrams since 2022), VS Code's built-in preview renders markdown, and Slack's own "mrkdwn" subset handles a useful chunk. The "you have to attach it to an email" framing in the article ignores that uploading to GitHub/Gist and sharing a link is a one-step workaround. I couldn't test actual reader engagement here.
  • Claim: HTML enables two-way interaction. Verdict: true, categorically. This is the strongest argument in the article (the categorical-win bucket).
  • Claim: the token overhead is "not really noticeable" with the 1M context window. Verdict: false. Per-artifact overhead is 1.4-1.9x in dollars, 2-4x in tokens, and the cumulative overhead approaches the 1M context limit for moderate-sized projects.
  • Claim: generation is "2-4x slower" (a cost he acknowledges). Verdict: partially supported. I measured 1.4-2.4x across the three cases, which only brushes the bottom of his range; I never saw the 4x end.
  • Claim: HTML diffs are noisy (an acknowledged downside). Verdict: confirmed under the verbatim methodology (1.2-2.5x diff bytes); less clear for text-heavy content, where markdown's prose lines have their own size.
  • Claim: HTML is more joyful to work with. Verdict: subjective. Skipped.
  • Claim (the hardline reply): HTML is "strictly better" even for text-focused content. Verdict: not what I saw in the data. Both text cases produced competent markdown that did the job while the HTML cost 1.4-1.5x more, and one of them produced more substance than the HTML. N=1 per condition isn't enough to refute the hardline definitively, but it's plenty to say the strong version of the claim isn't warranted by what I measured.
  • Implied claim: HTML reads better as agent context downstream. Verdict: mixed. On shared-content factual retrieval (the 7-question test on slowapi gotchas), two blinded judges scored both formats 7/7 on accuracy; on explanatory specificity, both gave HTML a ~7% edge (6.5-7.0 vs 6.0-6.5). HTML costs 1.33x more per call (1.24x input, 1.99x output). HTML-sourced answers are marginally more specific (more code-citation detail) at significantly higher cost; the quality-per-dollar trade favors markdown.

So what should you actually do?

HTML if the task is fundamentally visual or interactive: design exploration, prototypes, mockups, dashboards, drag-and-drop widgets, slider-tuned UIs, throwaway editor surfaces. Or you specifically need to send it to someone who won't render markdown themselves. The token cost is what the capability costs. Pay it without thinking.

Markdown if the task is text with structure: PR reviews, postmortems, design docs, implementation plans, status reports, meeting notes, technical explainers. The substance lives in the words. HTML's polish costs about 40-50% more per generation, and in my one rate-limiter sample (take it for what one sample's worth) that polish came at the expense of substance, not on top of it.

Maybe HTML on a text task if the artifact is going to be the centerpiece of a meeting, or you need visual hierarchy to handle dense reference material (tables of tables, comparison matrices), or the reader can't comfortably read raw markdown. Otherwise, save the tokens.

The "HTML for everything" stance treats two different questions as one. Half of it is true and worth paying for. The other half is paying extra for things the markdown was already going to give you. Pick the right tool for the right job.

If you want to try this yourself

You don't need anything fancy. Opus 4.7 (the model Thariq is pitching for) via Claude Code's non-interactive mode does the whole thing: claude --print --model claude-opus-4-7 -- "<the prompt>". The prompts are scattered through this post in blockquotes; use them verbatim for the HTML-affording methodology, or rephrase to drop the format-specific cues if you want the neutral version.
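
If you'd rather script the generation step than type it, here's the shape of it. The prompts are placeholders (substitute the blockquoted ones, verbatim or neutrally rephrased), and it assumes you've asked Claude to print the artifact to stdout and that toks prints a bare count:

```python
# Run one use case in both formats and count what each artifact would cost to
# re-ingest. Prompts and output paths are placeholders.
import subprocess

MODEL = "claude-opus-4-7"
RUNS = {
    "design.html": "<the HTML version of the prompt>",
    "design.md": "<the markdown version of the prompt>",
}

for path, prompt in RUNS.items():
    # Non-interactive generation; the reply is treated as the artifact itself.
    out = subprocess.run(
        ["claude", "--print", "--model", MODEL, "--", prompt],
        capture_output=True, text=True, check=True,
    ).stdout
    with open(path, "w") as f:
        f.write(out)
    # Count the artifact the way Claude would see it as context.
    toks = subprocess.run(
        ["toks", path, "--for", "claude"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    print(f"{path}: {toks} artifact tokens")
```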

For substrate, pick something real. A non-trivial recent commit from a codebase you know for the PR review case. Any reasonably-sized library you don't fully understand for the explainer case (I used slowapi). Anything plausible and visually rich for the design exploration.

For token counts, I used toks, a small Anthropic-tokenizer wrapper that reports an exact token count for whichever provider you're targeting. Without it I would have been squinting at character counts and guessing.

One thing I'd do differently if I were starting over: run K=5 generations per condition. Mine were all K=1, which is why a lot of the findings (especially "markdown was more substantive on the rate-limiter") come with the N=1 caveat. If your finding is going to be load-bearing, run it five times.

What this cost me

The whole thing. Three instrumented cases, two methodologies on one of them, three probes each, the agent-as-reader experiment, plus the two demo cases. Ran me $9.42 across 26 API calls on Opus 4.7. About 28 minutes of API time, less in wall clock once you parallelize. Cheaper than the lunch I had while it ran.


Feature Image Prompt:

Generate an image. The aesthetic should be cyberpunk with colors of neon pink, blue and purple. Do not add any people. Depict two colossal floating monolithic slabs facing each other in a rain-slicked neon city alley, suspended in mid-air. The left slab is constructed of intricate, ornate glowing HTML tag scaffolding -- angle brackets, nested div structures, and CSS class hieroglyphs etched in luminous magenta and electric pink, dense and baroque, with cables and wires spilling from its edges. The right slab is sparse and elegant, made of clean horizontal markdown rules, hashtag headers, and asterisk emphasis carved in cool cyan and deep purple light, minimalist and crystalline. Between them hovers a glowing translucent balance scale made of holographic wireframe, tipping almost imperceptibly, with streams of tokenized data particles flowing from both slabs into the central fulcrum. The background is a vertical canyon of dark chrome skyscrapers with flickering neon signage in Japanese and binary code, volumetric fog rolling at the base, puddles reflecting the pink and blue glow. Wet asphalt, lens flares, chromatic aberration, cinematic depth of field, ultra-detailed, atmospheric haze, Blade Runner meets technical schematic.