I think that should be true, but it doesn't hold up in practice.
I work with a good editor from a respected political outlet. I've tried hard to get current models to match his style: filling the context with previous stories, classic style guides, and endless references to Strunk & White. The LLM always ends up writing something filtered through tropes, so I inevitably have to edit quite heavily before my editor takes another pass.
It feels like LLMs have a layperson's view of writing and editing: they believe it's about tweaking sentence structure or swapping in a synonym, rather than thinking hard about what you want to say and what is worth saying.
I also don't think LLMs' writing capabilities have improved much over the last year or so, whereas coding has come on in leaps and bounds. Given that good writing is a matter of taste that lies beyond the direct expertise of most AI researchers (unlike coding), I doubt they'll improve much in the near future.
Germany has an anonymous support programme for people who feel paedophilic urges but don't wish to offend. I believe they've used that network for research, but I think it's probably quite a limited, and potentially biased, sample.
Carbon offsets are a sham, but you could just require them to pay directly for the energy infrastructure they actually need: if you need 1GW of electricity, develop 1GW of solar.
Sorry, I'm not picking up on the connection - could you expand? Do you think they should also pay for offsets alongside developing energy infrastructure?
I guess what I'm asking is how long it takes, soup to nuts, for the 1GW installation to become carbon neutral or better? I've read anywhere from 7 months to 25 years. Maybe it's dependent on location?
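Rough sketch of the arithmetic I have in mind, with made-up illustrative numbers (real embodied-energy and yield figures vary a lot by panel tech and site):

    # Energy-payback arithmetic for a solar install. Every number here is
    # an assumption for illustration -- embodied energy and annual yield
    # vary widely, which may be why published estimates range so much.
    embodied_kwh_per_kw = 2000  # assumed energy to manufacture + install 1 kW
    annual_yield_kwh_per_kw = {"sunny site": 2000, "cloudy site": 800}  # assumed output

    for site, yield_kwh in annual_yield_kwh_per_kw.items():
        print(f"{site}: payback ~{embodied_kwh_per_kw / yield_kwh:.1f} years")
    # sunny site: ~1.0 years, cloudy site: ~2.5 years -- location alone
    # moves the answer severalfold before touching the other assumptions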
Oh sure, I see what you mean - thanks for clarifying. On top of your point, it's true that CO2 has a prolonged impact on global temperature even after it's been 'removed' from the atmosphere, so even once solar pays back the original carbon investment, its impact lingers for a while.
I guess at a certain point you're getting at a more fundamental question about the value of AI (plus technology and everything else) - what level of environmental tradeoff is acceptable? One thing I slightly lament about the discourse is that the tradeoff is widely discussed in the case of AI, but not for the other stuff we do. I suspect most people aren't aware that the water use associated with eating a burger dwarfs a year of ChatGPT use, that a long-haul flight wipes out the emissions savings of a couple of years' veganism, or that renewables have their own impacts, like the demolition of Chile for copper.
Transformers do have a fixed input/output size though - that's what a context window is. It's just that, via scaling and algorithmic improvements, the length of usable context windows has increased to the point that they're much less of a bottleneck.
I think your points around parallelisation and the flexibility of quadratic attention are spot-on though.
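For anyone who hasn't seen it spelled out, here's a minimal sketch of why attention is quadratic in sequence length (a toy single-head version with assumed shapes; real models add heads, masking, etc.):

    import numpy as np

    # Minimal single-head attention. The (seq_len, seq_len) score matrix
    # is the quadratic term: double the context and the memory/compute
    # for the scores quadruples.
    def attention(q, k, v):
        scores = q @ k.T / np.sqrt(q.shape[-1])         # (seq_len, seq_len)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
        return weights @ v

    seq_len, d = 1024, 64
    q = k = v = np.random.randn(seq_len, d)
    out = attention(q, k, v)  # fine at 1024; the (n, n) scores dominate as n grows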
This opens up an interesting new avenue for corporate FOMO. What if you don't partner with Anthropic, miss out on access to their shiny new cybersec model, and then fall prey to a vuln that the model would have caught?
Did that happen to a lot of companies during the Log4Shell fiasco? I'm sure some companies had their permissions misconfigured in such a way that a malicious actor who could execute code on their servers could also drop their database and delete their backups.
Great piece. And a good excuse to read up on the use of the diaeresis in English (e.g. coördination, reëlection) to mark that the second of two adjacent vowels is pronounced separately - I hadn't seen the New Yorker's usage before.
I see Munroe's work as filling the same role in society as Socrates did in his time: not only commentary on current events, government, society, etc., but also expressing his viewpoints in a fashion accessible to everyone. Socrates paved the way to bring philosophy to the masses; Munroe uses a popular medium and comedy to the same effect.
The Gaussian Processes underpinning this work are hardly a product of the 'AI Hype Machine' - they've been around for decades, have strong statistical underpinnings, and are being widely explored for experimental design across many disciplines. Reflexive, poorly informed backlash against any variety of machine learning is no more productive than blindly hyping up LLMs.
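For the curious, GP regression is a few lines in scikit-learn - a toy sketch with made-up data and an assumed RBF kernel:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    X = np.array([[1.0], [3.0], [5.0], [6.0]])  # toy experimental inputs
    y = np.sin(X).ravel()                       # toy measured responses

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, y)
    mean, std = gp.predict(np.array([[4.0]]), return_std=True)
    # the predictive std is what experimental-design loops use to pick
    # the next most informative trial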
Meta Platforms, Inc. featuring this technology with a title announcing “AI for American-produced cement and concrete” is, on the other hand, 1000% a product of the AI Hype Machine.
Sure, it's clearly marketing. I think a private company pursuing marketing via open research with open source code (including datasets) is a good trade. A hypey blogpost + research is better than no blogpost and no research.
A sidenote along these lines - I've recently done an MSc and found that the default approach to lectures is now to present slide decks. One of the profs, however, delivers a more traditional lecture, writing everything on a blackboard. I've found the second style far more effective, largely because writing caps the rate at which information can be conveyed. Because slides have no such bottleneck, they're often misused and overladen with information that is skipped over too quickly.
Do you have any evidence that inference revenue is growing faster than training costs? RLVR is significantly less compute-efficient than token-prediction pretraining - especially as labs try to train models to complete agentic tasks that take tens of minutes per rollout.
It's definitely true that they've increased their revenue rapidly. But at the same time, the 'scaling laws' the labs were first built around require exponentially scaling cost (10x the flops for each fixed reduction in training loss).
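As a toy illustration of what that power law implies (constants are made up, not fit to any real model family):

    # Toy power-law scaling curve: loss = a * compute**(-b).
    a, b = 10.0, 0.05

    def loss(flops):
        return a * flops ** (-b)

    for flops in (1e21, 1e22, 1e23):
        print(f"{flops:.0e} FLOPs -> loss {loss(flops):.3f}")
    # prints ~0.891, ~0.794, ~0.708: every 10x in compute multiplies loss
    # by the same factor (10**-b), so each comparable gain costs an order
    # of magnitude more flops than the last.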
If anything, a better look at the economics is a reason to look forward to one of them IPO-ing. I suspect the labs probably could cut R&D and turn a profit, but that might only work for one generation, until they get superseded by the competition.
There is no doubt that competition is what is driving unprofitability. So when people say AI can't be monetized, I laugh. Right now, foundational AI is unprofitable because of competition, not because they can't make money.
But this is exactly the problem - we have to take it on faith that inference is profitable because nobody actually knows. It’s hard to even define what that would mean, and while I am suspicious of claims that frontier lab CEOs are just out-and-out liars or bad people, defining and calculating the real cost of inference would be time- and labor-intensive in its own right and there is no strong incentive to do it other than “tech reporters are curious.” Until the IPO, we just won’t know.
A lot of people know. A lot of insiders have been saying tokens are profitable. Is there a conspiracy theory for everyone to lie? Including OpenAI, Anthropic CEOs, employees, Cursor management, inference providers of Chinese models?
Profitable on what basis? They generate more revenue than the cost of electricity? Does that factor in the cost to service the massive, multi-layer cake of debt that was necessary to even begin to serve inference in the first place - not from a training perspective but from a hardware and facilities perspective?
I’m not talking about training costs. I’m talking about startup costs. You have to pay for GPUs (or to rent data centers). You have to pay for the electricity that runs those data centers, and in a lot of cases these frontier labs are building the data centers on credit, so you need to pay for the construction, the materials, etc. If it was as simple as “running the GPUs costs less than we charge for it,” I might be inclined to agree. But the GPUs don’t just appear by magic.
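A back-of-envelope version of the distinction, with entirely made-up numbers:

    # Illustrative per-token costing: electricity vs. amortized hardware.
    # Every number here is an assumption for illustration, not lab data.
    gpu_cost_usd = 30_000   # assumed purchase price per GPU
    lifetime_years = 4      # assumed useful life before obsolescence
    power_kw = 1.0          # assumed draw incl. cooling overhead
    usd_per_kwh = 0.08      # assumed industrial electricity rate
    tokens_per_sec = 1_000  # assumed sustained serving throughput

    lifetime_secs = lifetime_years * 365 * 24 * 3600
    hardware_per_token = gpu_cost_usd / (tokens_per_sec * lifetime_secs)
    power_per_token = power_kw * usd_per_kwh / 3600 / tokens_per_sec

    print(f"hardware: ${hardware_per_token:.2e}/token")  # ~2.4e-07
    print(f"power:    ${power_per_token:.2e}/token")     # ~2.2e-08
    # With these assumptions, amortized hardware is roughly 10x the
    # electricity line, so "revenue covers electricity" and
    # "profitable" can diverge quite a bit.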
Right now, demand for GPUs far exceeds supply. Every cloud company is saying they're leaving money on the table because they don't have enough compute to serve the demand.
It seems like you're arguing that the bubble is going to collapse soon, like the author? How can it collapse when the demand is so much bigger than supply? Do you think the demand is fake? Or that AI will stop making progress from here on out?
The demand is real. The tech is real. The economics are completely unsustainable. Switching costs and barriers to entry are too low, operating costs are too high. And if the tech improves, it actually makes it even easier for competitors to swoop in and take market share. Not long ago, an agent that was 80% as good as SOTA was not usable. A year from now, an agent that is 80% as good as SOTA will be better than the best agent is today. We have it on good authority that today’s agents are very good, very useful. Why bother paying full price?
This is deeply ironic in a way. Because the whole premise of AI labor replacement is that AI does not need to be better than human labor, it just needs to be cheaper with acceptable performance. But the same is true one step down: discount AI doesn’t need to be better than bleeding-edge AI, it just needs to be cheaper with acceptable performance.