To meet its burgeoning compute demand, Anthropic is renting capacity[1] in xAI's filthy data centre cluster, Colossus, which egregiously, illegally and amorally inflicts a clinically significant reduction in air quality on the already health-disadvantaged community of South Memphis.

Generative AI is a wasteful use of compute all round, but Claude Code is notably heavy in token usage relative to competing products. Furthermore, Claude itself uses a full-path architecture, which burns through more GPU cycles per token than the more industry-standard sparse-path[2].

This edition of Boxo Barks will explore why the TESCREAL[3] ideology of Anthropic is the source of its compute problem and also leaves its corporate conscience entirely unbothered by the environmentally unjust solution it has employed.

The Moral Outrage in South Memphis

Now that xAI and Anthropic are in bed together, I can already see the pivot by Claudesuckers on Bluesky—who were more than happy to be critical when Colossus was only home to Grok—towards minimising the harm that is being visited upon the people of South Memphis. "Air pollution is complicated," suddenly. They're now "second-guessing" the "breathless reporting around the Colossus data center".

Cowards. Environmental science doesn't change because the set of parameters pulsing in those GPUs is now one that suits your fancy. The Southern Environmental Law Center doesn't just make up stuff to get mad at.

The site that xAI chose for Colossus 1 used to be an Electrolux factory[4]. In its former life it was a building that did not contribute to local air quality issues. The turbines installed by xAI have turned it into a significant emitter right on the doorstep of a community that was already at its limits.

I wonder what arguments we're going to hear from AI Centrists over the coming weeks to try and justify Anthropic's tenancy at Colossus 1.

Perhaps they'll point to the fact that there are now only 15 turbines, that they have catalytic converters on them, and that they are now running with permits.

But the other turbines haven't disappeared. They've moved just over the state border to Colossus 2 in Southaven, Mississippi, where xAI is repeating the same playbook[5] on an even larger scale. Anthropic's multi-billion dollar rental fees for Colossus 1 are funding that playbook.

And the catalytic converters don't make the 15 turbines at Colossus 1 fine, because they only help reduce the nitrogen oxides, not the particulates. The permits? They're a farce. They're underpinned by fantasy targets that xAI will fail to meet. They will shrug their shoulders a couple of years from now and say "oops", and the Shelby County Health Department will give them a fine in the low millions. They've probably already set aside a chunk of Anthropic's multi-billion dollar rental fees for it.

How can we be sure that these are fantasy targets? Because these turbines weren't designed for this. The whole setup is absurd. They aren't meant to be a primary, always-on power source. There aren't meant to be so many of them in one place. It is obscene.

There was some whataboutery from Grok fanboys a while back about the air monitoring not being in place to say that this is a problem. The inversion of the burden of proof here is unreal. It's impossible for turbines of this design, used in this quantity, in this fashion, to not be a problem.

Flailing AI Centrists might say that there are other data centre projects currently under construction that will be powered by dedicated natural gas. This is, regrettably, true, and it's bad news for the climate. But those projects will have the infrastructure for proper emissions controls to protect the air quality of residential areas. Colossus is its own unique kind of bad.

Not one ppm of any aerial pollutant in any neighbourhood is worth the existence of a chatbot, and if you are one of those people who was willing to condemn this when the chatbot was Grok but is having a little wobble now that it's Claude, go fuck yourself.

Ideologically Inefficient

What's even worse, though, is that if it weren't for Anthropic putting their ideology before efficient product design, you could have your beigecicle without having to reach for the bottom of the compute barrel at all. Anthropic's compute crisis is one of its own making.

In the beginning of the chatbot era, all LLMs were built with a full-path (also now referred to as dense) architecture: at each layer of the transformer, every parameter is used to calculate[6] the n-dimensional direction at which the vector hits the next layer.

As the models were built ever larger, this architecture became computationally wasteful for layers with a very high parameter count: at this scale, many parameters made little contribution to the result, but still had to be included in the calculation. The solution was to slice those layers up into separate islands[7] with a routing function that would determine which island the vector would land on.
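For the curious, the island idea can be sketched in a few lines of Python. This is a toy under loud assumptions, not any vendor's implementation: each island is a single small matrix, the router is one linear scoring step followed by a softmax, and every name and size here (`route`, `router_weights`, `islands`, the four-dimensional vector) is made up for illustration.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn raw router scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(vector, router_weights, islands, top_k=1):
    """Score every island, then run the vector through only the top_k of them."""
    gate = softmax(matvec(router_weights, vector))
    chosen = sorted(range(len(islands)), key=lambda i: gate[i], reverse=True)[:top_k]
    norm = sum(gate[i] for i in chosen)
    output = [0.0] * len(vector)
    for i in chosen:
        island_out = matvec(islands[i], vector)  # the only islands that cost compute
        weight = gate[i] / norm
        output = [o + weight * e for o, e in zip(output, island_out)]
    return output, chosen


def random_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy sizes: a 4-dimensional vector and 3 islands.
random.seed(0)
dim, n_islands = 4, 3
router_weights = random_matrix(n_islands, dim)
islands = [random_matrix(dim, dim) for _ in range(n_islands)]

vector = [0.5, -1.0, 0.25, 2.0]
output, chosen = route(vector, router_weights, islands, top_k=1)
print("islands used:", chosen, "out of", n_islands)
```

The point of the toy is the loop at the end of `route`: the islands that were not chosen are never multiplied against the vector at all.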

This sparse-path approach means that for a model of equivalent utility, the overall size is larger, because some information[8] has to be duplicated across the islands. The routing function is also a rather complex calculation. However, the net computational cost of running the model is much lower, because those factors are more than offset by not having to use every parameter when predicting a token.
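A back-of-envelope cost model makes the trade-off concrete. It is a sketch under stated assumptions: the layer sizes are invented for illustration and are not the specifications of any real model, the router is assumed to be a single linear scoring step, and whether the two layers are really of equivalent utility is taken on faith rather than shown by the code.

```python
def dense_layer(d_model, d_hidden):
    """One dense feed-forward layer: every parameter is used for every token."""
    params = 2 * d_model * d_hidden  # up-projection plus down-projection
    flops = 2 * params               # one multiply and one add per parameter
    return params, flops


def sparse_layer(d_model, d_island, n_islands, top_k):
    """The same layer cut into islands, with an assumed linear router in front."""
    router_params = d_model * n_islands
    island_params = 2 * d_model * d_island
    params = router_params + n_islands * island_params
    # Only the router and the top_k chosen islands run for a given token.
    flops = 2 * (router_params + top_k * island_params)
    return params, flops


# Hypothetical sizes, chosen only to make the trade-off visible.
dense_params, dense_flops = dense_layer(d_model=4096, d_hidden=16384)
sparse_params, sparse_flops = sparse_layer(
    d_model=4096, d_island=4096, n_islands=8, top_k=2
)

print(f"size:    sparse is {sparse_params / dense_params:.2f}x the dense layer")
print(f"compute: sparse is {sparse_flops / dense_flops:.2f}x the dense layer per token")
```

With these made-up numbers the sparse layer carries roughly twice the parameters but spends roughly half the compute on each token. Real ratios depend on the island count, the island size and how many islands the router picks.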

Moving to a sparse-path architecture was the big innovation contest between the large model providers in 2023/24, with Meta finally catching up in 2025 with Llama 4. However, Anthropic sat it out entirely. Without knowing Claude's exact specifications, it is difficult to say exactly how much less compute it would take to run it today if Anthropic had followed suit, but the difference would be significant[9] enough that they would not need to go cap-in-hand to xAI for dirty GPU cycles.

Why would Anthropic pass up on such an obvious efficiency gain? Mechanistic interpretability.[10] The process where they look inside Claude and say that they're mapping its thoughts.

When Chris Olah[11] gets his parameter poking stick out, he wants a nice, clean vector destination to put in his fancy diagrams. But if the destination at the next layer is determined not just by the direction of the vector, but by a routing function as well, that complicates things.

While OpenAI and Google scaled back their mechanistic interpretability work, choosing instead to pursue the innovation that would make their chatbots multiple times more efficient to run (though still a waste of compute in my view, to be clear), Anthropic was expanding its interpretability lab. Mechanistic interpretability is core to their 'safe AI' identity. Olah's interpretability stack only works on dense models, and changing the architecture in a way that would break Olah's stack was unacceptable to them.

For most of this industry, talk of the singularity is just a cosplay for the hype, but the goobers at Anthropic actually believe this shit. They are singularitarians, for realsies. They actually believe that Claude is either going to accelerate AI research to trigger a fast takeoff to superintelligence, or emergently become superintelligence itself. So nothing, not even energy efficiency, can get in the way of Claude being a model that they can 'see' the 'thoughts' of.

If you are an Anthropic customer who is also a singularitarian, you likely agree that their interpretability lab really is just that important. But I imagine that most of Anthropic's customers are not singularitarians, and are either unaware of how dubious the mech interp nonsense is, or they tolerate it because they like using Claude for whatever reason. I do not think they would be impressed if they realised that Anthropic's deference to Olah's parameter poking means that Claude is at least twice as energy-hungry as it would otherwise need to be, if not for the sake of the lungs of South Memphis residents, then for the sake of their own wallets when Anthropic passes the bill from their new landlord xAI on to them.

Of course, I am not naive about why Anthropic's competitors care about compute efficiency; it's not to lower their total compute budget, but to do more with the budget they've already set. But Anthropic has found itself using more compute than it expected to, which would not be happening if they put what is, frankly, good engineering practice for the thing that they are trying to build before their ideological commitment to Chris Olah's carousel of circular reasoning.

But it's not even just Claude itself that is inefficiently engineered. The design of Claude Code is also negatively impacted by their insistence upon treating Claude like a digital mind instead of a text generation tool. When its source code was leaked, critics marvelled at just how much of it consisted of prompting the model instead of client-side logic[12].

In this vapid AI booster podcast episode that I skimmed through so that you don't have to, Claude Code creator Boris Cherny is asked why his product has Claude grep around in your terminal instead of giving it a client-side knowledge graph of the codebase. He says:

If you talked to me before I joined Anthropic and this team, I would have said, yeah, definitely [use a knowledge graph]. But now, actually, I feel everything is the model. Like, that's the thing that wins in the end. And it just, as the model gets better, it subsumes everything else. So, you know, at some point, the model will encode its own knowledge graph. It'll encode its own, like, KV store if you just give it the right tools.

He went on to claim that Claude works better with just tools and no RAG[13], and when asked by what benchmark, he said:

This was just vibes, so internal vibes. There's some internal benchmarks also, but mostly vibes. It just felt better.

Yeah, it doesn't surprise me that the 'vibes' at the company that believes it is growing an emergent superintelligence are to follow a design philosophy of just letting their special boy emergently chunder over the codebase.

The cost? 4.56 times more tokens than Cursor on similar tasks, apparently. Meanwhile, the developer of an open source plugin to add RAG to Claude Code says that it can reduce token use by 40%.

I don't know how well these tests reflect real world usage. I don't use Claude Code. I don't use a competitor of Claude Code. I don't want you to use Claude Code or a competitor of Claude Code either. However, the pattern is clear. Anthropic has, again, chosen the significantly less efficient design philosophy in contrast to the rest of the industry, because its engineers treat Claude like the thing their ideology imagines it to be, rather than the thing that it actually is.

It turns out that market economics does not resolve to the most efficient solution when the corporate player is also a cult. It would be amusing, if the inefficiency were not leading to the use of an energy source that poisons lungs.

They Literally Don't Care

When one hears that a Public Benefit Corporation is concerned with acting very safely and responsibly, and sees that their branding is all earth tones, rounded corners and humanist typefaces, one would be forgiven for assuming that the adjectives safe and responsible might cover things like, say, not perpetuating an environmental health hazard.

However, you won't find any mention of such things in Claude's Constitution. Nor will you find any consideration of the environmental effects of scaling in the Responsible Scaling Policy. You won't find even a passing mention of the environment, or sustainability, or energy efficiency in Anthropic's blog, and the only relevant press release is this one from last year: a token[14] $1 million grant for AI-powered energy innovation research.

You won't find anything even slightly resembling an ESG report in the Anthropic Transparency Hub. The Anthropic Economic Index says that it is for understanding the impacts of AI on the economy, but since their methodology doesn't cover externalities, there's nothing on the environment there either. Nor is any of Anthropic's research about the environmental sustainability of their technology, even though they have a whole research category called 'Societal Impacts'.

Their only energy-relevant corporate policy is the document Build AI in America, which calls for natural gas permitting to be accelerated. Otherwise, they are conspicuously quiet on such issues for a company that really loves the sound of its own words.

This is because when TESCREALists talk about things like safety and responsibility, they don't mean what the rest of us mean by those words. Under longtermism, no GPU cycle can be dirty so long as it is powering the machine god. The superintelligence that they believe they are building will simply fix the environment, cure all disease and may even resurrect the dead in a Dyson sphere as a bonus.

They don't even need to justify to themselves what they are doing in South Memphis, because TESCREAL pushes it out of their frame of view entirely. The harm does not even register. They don't even see it. I hope, however, that you now see them.

In buying this dirty compute and giving xAI a multi-billion dollar cash injection, Anthropic have shown you who they are. Believe them.