josephg · 2026-04-10T09:25:29 1775813129

It could be massively improved with a special CPU instruction for racing dram reads. That might make it actually useful for real applications. As it is, the threading model she used here would make it incredibly difficult to use this in a real program.

foltik · 2026-04-10T13:55:00 1775829300

There’s no point racing DRAM reads explicitly. Refreshes are infrequent and the penalty is like 5x on an already fast operation, 1% of the time.

What’s better is to “race” against cache, which is 100x faster than DRAM. CPUs already do this for independent loads via out-of-order execution. While one load is stalled waiting for DRAM, another can hit the cache and do some compute in parallel. It’s all already handled at the microarchitectural level.
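As a rough check on the figures in this comment, the average cost of refresh stalls can be worked out directly. This is a minimal sketch that takes the 5x penalty and the 1% stall rate quoted above at face value; the 100 ns base latency is a placeholder for illustration, not a measurement.

```python
# Back-of-the-envelope expected DRAM read latency with refresh stalls.
# stall_prob and stall_penalty are the figures quoted in the comment above;
# base_ns is an assumed, illustrative value.

def expected_latency_ns(base_ns: float, stall_prob: float, stall_penalty: float) -> float:
    """Mean latency when a fraction stall_prob of reads cost stall_penalty * base_ns."""
    return (1.0 - stall_prob) * base_ns + stall_prob * stall_penalty * base_ns


base_ns = 100.0
mean_ns = expected_latency_ns(base_ns, stall_prob=0.01, stall_penalty=5.0)
slowdown = mean_ns / base_ns  # ratio of mean latency to the no-stall latency
print(f"mean latency: {mean_ns:.1f} ns ({(slowdown - 1) * 100:.1f}% above base)")
```

Under those assumptions the mean moves by only about 4%; the effect shows up in the tail (the 1% of reads that take 5x), not in the average.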

jeffbee · 2026-04-10T15:30:02 1775835002

There are already systems that do this in hardware. Any system that has memory mirroring RAS features can do this, notably IBM zEnterprise hardware, you know, the company that this video promoter claims to be one-upping.

shiftingleft · 2026-04-10T16:40:36 1775839236

I don't think memory mirroring features available today allow you to race two DRAM accesses and use the result that returns earlier?

jeffbee · 2026-04-10T16:54:59 1775840099

The memory controller sends the read to the DIMM that is not refreshing. It is invisible to software, except for the side-effect of having better performance.

foltik · 2026-04-10T19:11:29 1775848289

Mirroring is more of a reliability feature though, no? From my understanding it’s like RAID where you keep multiple copies plus parity so uncorrectable errors aren’t catastrophic. Makes sense for mainframes which need to survive hardware failures.

Refresh avoidance is a tangential thing the memory controller happens to be able to do in a scheme like that, but you’d really have to be looking at it in a vacuum to bill it as a benefit.

Like I said, it’s all about cache. You’re not going to DRAM if you actually care about performance fluctuations at the scale of refresh stalls.

jeffbee · 2026-04-10T19:30:59 1775849459

Clearly, hitting a cache would be the better outcome. The technique suggested here could only apply to unavoidably cold reads, some kind of table that's massive and randomly accessed. Assume it exists, for whatever reason. To answer your question, refresh avoidance is an advertised benefit of hardware mirroring. Current IBM techno-advertising that you can Google yourself says this:

"IBM z17 implements an enhanced redundant array of independent memory (RAIM) design with the following features: ... Staggered memory refresh: Uses RAIM to mask memory refresh latency."

foltik · 2026-04-10T20:53:58 1775854438

I can google, thanks. My point is that nobody is buying mainframes with redundant memory to avoid refresh stalls. It’s a mostly irrelevant freebie on hardware you bought for fault tolerance.

ufocia · 2026-04-11T14:21:37 1775917297

Do you have evidence that this is a fact? Have you looked at the computing requirements documents for, for example, stock exchanges? I have it on good evidence that stock exchanges ran on mainframes. They are essentially the counterparty (in a computing sense not a financial sense) in each placed order. If someone is willing to run a fiberoptic cable from Chicago to New York or New Jersey to exploit reduced propagation delay, admittedly much larger than a refresh stall, wouldn't you think that they or someone else would also be interested in predicting computing stalls? An exchange would face at least a significant reputational risk if it could be exploited that way.

foltik · 2026-04-12T09:09:25 1775984965

The low latency matching engines in colos run Linux these days, and we use microwave instead of fiber. Incoming orders are processed by hardware receive timestamp, so predicting jitter doesn’t give you an advantage. Clearing and settlement I’m not sure about, not latency critical though, mainframes wouldn’t surprise me there.

josephg · 2026-04-10T09:22:21 1775812941

I hope this approach gets some visibility in the CPU field. It could be obviously improved with a special cpu instruction which simply races two reads and returns the first one which succeeds. She’s doing an insane amount of work, making multiple threads and so on (and burning lots of performance) all to work around the lack of dedicated support for this in silicon.

robinsonb5 · 2026-04-10T14:45:29 1775832329

I actually hope it doesn't!

The results are impressive, but for the vast, vast majority of applications the actual speedup achieved is basically meaningless since it only applies to a tiny fraction of memory accesses.

For the use case Laurie mentioned - i.e. high-frequency trading - then yes, absolutely, it's valuable (if you accept that a technology which doesn't actually achieve anything beyond transmuting energy into money is truly valuable).

For the rest of us, the last thing the world needs is a new way to waste memory, especially given its current availability!

josephg · 2026-04-09T22:19:24 1775773164

Yes; linux is generally supported better than freebsd. CUDA and Docker work out of the box on linux. Linux has better graphics drivers and steam support. Open source software (libraries, tools) is much more likely to be tested & work properly on linux. I've also run into several rust crates which don't build on freebsd - particularly crates which depend on C code.

But the comment you're replying to said there weren't many good technical reasons to prefer freebsd over linux. I think that's broadly true.

I still really like freebsd though. Unlike linux, one community is responsible for the kernel and userspace. That makes the whole OS feel much more cohesive. You don't have to worry about supporting 18 different distributions, which all do their own thing.

jmspring · 2026-04-09T22:54:37 1775775277

FreeBSD's development philosophy, its aversion to design decisions like "we must allow systemd everywhere", stability, zfs and jails, consistent configuration (for decades) are all technical reasons I prefer it over Linux.

How about Ubuntu and snaps? License needed for certain security updates, etc.

josephg · 2026-04-09T10:03:09 1775728989

IMAP works in outlook. It's just horrible to set up and half broken. Click "Add account". Then type in your email address, click "Choose provider", select IMAP, then click "Sync directly with IMAP" (dark pattern hidden button). If you don't click that last button, outlook uploads your IMAP email credentials to their own MS Cloud instance, and that proxies all your emails via microsoft's cloud servers. Do they read your email messages for advertising? Nobody knows!

In my testing, the local IMAP client implementation quite frequently launches a DoS attack against your IMAP server. It'll send the same query requesting new mail messages in a tight loop, limited by the round-trip latency. But luckily, almost nobody uses IMAP via outlook because it's so difficult to set up.

Avamander · 2026-04-09T19:39:16 1775763556

> If you don't click that last button, outlook uploads your IMAP email credentials to their own MS Cloud instance, and that proxies all your emails via microsoft's cloud servers. Do they read your email messages for advertising? Nobody knows!

I've seen cases where people have it set up like that and it's so awfully slow. Minutes to display a single new message. That cloud brings absolutely zero user-benefit.

josephg · 2026-04-09T09:59:13 1775728753

There's also two different applications which are both "Outlook for Mac".

If you go into the "Outlook" menu in the app, there's a "Legacy Outlook" button, which relaunches outlook using a completely different binary. The two outlook implementations have different bugs and all sorts of different behaviour.

Outlook For Mac is free but "legacy outlook" requires a MS365 subscription for some reason.

Outlook is also not to be confused with Microsoft's "Web Outlook" client, available at outlook.live.com. It all seems totally insane.

cutler · 2026-04-09T10:09:33 1775729373

> It all seems totally insane.

This is Microsoft we're talking about, right?

josephg · 2026-04-07T23:10:10 1775603410

It still has a very ... plastic feeling. The way it writes feels cheap somehow. I don't know why, but Claude seems much more natural to me. I enjoy reading its writing a lot more.

That said, I'll often throw a prompt into both claude and chatgpt and read both answers. GPT is frequently smarter.

kranke155 · 2026-04-08T10:14:34 1775643274

GPT is more accurate. But Claude has this way of association between things that seems smarter and more human to me.

josephg · 2026-04-07T23:03:43 1775603023

> Combined results (Claude Mythos / Claude Opus 4.6 / GPT-5.4 / Gemini 3.1 Pro)

> Terminal-Bench 2.0: 82.0% / 65.4% / 75.1% / 68.5%

> USAMO: 97.6% / 42.3% / 95.2% / 74.4%

> The biggest jump in the numbers they quoted is 6%.

Just in the numbers you quoted, that's a 16.6-point jump in Terminal-Bench and a 55.3-point absolute increase in USAMO over their previous Opus 4.6 model.
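For anyone checking the arithmetic, here is a small sketch that recomputes the jumps from the scores quoted above (Claude Mythos against Claude Opus 4.6). It reports both the absolute change in percentage points and the relative change; nothing beyond the four quoted numbers is assumed.

```python
# Scores as quoted in the parent comment: (Claude Mythos, Claude Opus 4.6).
scores = {
    "Terminal-Bench 2.0": (82.0, 65.4),
    "USAMO": (97.6, 42.3),
}


def deltas(new: float, old: float) -> tuple[float, float]:
    """Return (absolute change in percentage points, relative change in percent)."""
    absolute = new - old
    relative = absolute / old * 100.0
    return round(absolute, 1), round(relative, 1)


results = {name: deltas(new, old) for name, (new, old) in scores.items()}
for name, (points, percent) in results.items():
    print(f"{name}: +{points} points ({percent}% relative)")
```

Either way of measuring gives a jump well above 6.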

devmor · 2026-04-07T23:13:45 1775603625

I don’t know if you’re willingly disregarding everything being said to you or there’s a language barrier here.

dang · 2026-04-10T15:29:27 1775834967

Can you please stop posting comments with personal swipes in them? You've unfortunately been doing it repeatedly. It's not what this site is for, and destroys what it is for.

If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.

devmor · 2026-04-10T22:31:23 1775860283

You're right, I apologize for that. I have been responding with annoyance rather than walking away when I receive replies that appear to be ignoring context.

dang · 2026-04-11T03:16:39 1775877399

Appreciated! and of course, I know it's not easy - believe me I know...

josephg · 2026-04-07T22:01:40 1775599300

To be clear, we don’t know that this tool is better at finding bugs than fuzzing. We just know that it’s finding bugs that fuzzing missed. It’s possible fuzzing also finds bugs that this AI would miss.

underdeserver · 2026-04-07T23:11:10 1775603470

I would suggest watching Nicholas Carlini's talk and Heather Adkins and Four Flynn's talks from unprompted:

https://youtu.be/1sd26pWhfmg?si=onOai_ocxkZeNWP0

https://youtu.be/B_7RpP90rUk?si=HkRBhw95DbbKX9lL

My takeaway is that fuzzing is not just complementary, it also gives a stronger AI a starting point. But AI is generally faster and better.

josephg · 2026-04-08T06:03:45 1775628225

Thanks - these talks are mindblowing. Highly recommended.

nextos · 2026-04-07T22:19:08 1775600348

Different methods find different things. Personally, I'd rather use a language that is memory safe plus a great static analyzer with abstract interpretation that can guarantee the absence of certain classes of bugs, at the expense of some false positives.

The problem is that these tools, such as Astrée, are incredibly expensive and therefore their market share is limited to some niches. Perhaps, with the advent of LLM-guided synthesis, a simple form of deductive proving, such as Hoare logic, may become mainstream in systems software.

ComplexSystems · 2026-04-07T22:38:43 1775601523

This line of reasoning makes no sense when the AI can just be given access to a fuzzer. I would guess that it probably did have access to a fuzzer to put together some of these vulnerabilities.

acdha · 2026-04-07T23:44:17 1775605457

Carlini talked about that a fair amount in the context of pairing the two: e.g. many protocols are challenging for fuzzers because they have something like a checksum or signature but LLMs are good at coming up with harnesses for things like that. I’m sure that we’re going to see someone building an integrated fuzzer soon which tries to do things like figure out how to get a particular branch to follow an unexercised path.

kristofferR · 2026-04-07T22:03:24 1775599404

AI can initiate the fuzzing and optimize the process of fuzzing.

tptacek · 2026-04-08T02:31:47 1775615507

This is obviously just cope (there's a long, strong-form argument for why LLM-agent vulnerability research is plausibly much more potent than fuzzing, but we don't have to reach it because you can dispose of the whole argument by noting that agents can build and drive fuzzers and triage their outputs), but what I'd really like to understand better is why? What's the impetus to come up with these weird rationalizations for why it's not a big deal that frontier models can identify bugs everyone else missed and then construct exploits for them?

josephg · 2026-04-08T05:42:42 1775626962

I don't have an anti-AI stance. Maybe I should have spelled that out more clearly in my comment above. I'm as excited and terrified by this technology as everyone else. I think we're all in vicious agreement that we need defense-in-depth - including LLMs and fuzzing (and static analysis and so on).

An LLM can guide all of this work, but current models tend to slowly go off the rails if you don't keep a hand on the wheel. I suspect this new model will be the same. I've had Opus4.6 write custom fuzzing tools from scratch, and I've gotten good results from that. But you just know people will prompt this new model by saying "make this software secure". And it'll forget fuzzing exists at all.

ofjcihen · 2026-04-08T04:30:06 1775622606

Good lord, why such a virulent response to something that seems like we should be considering?

As someone in cybersecurity for 10+ years my immediate assumption is why not both? I don’t think considering that they could both have their uses is “cope”.

tptacek · 2026-04-08T04:34:55 1775622895

Again: LLM agents already are both. But it's also remarkable and worth digging into the fact that LLM agents haven't needed fuzzers to produce many (any? in Anthropic Red's case?) of the vulnerabilities they're discussing.

josephg · 2026-04-08T05:46:23 1775627183

Do we know that? I'd love to see some of the ways security researchers are using LLMs. We have no idea if claude was using fuzzing here, or just reading the files and spotting bugs directly in the source code.

A few weeks ago someone talked about their method for finding bugs in linux. They prompted claude with "Find the security bug in this program. Hint: It is probably in file X.". And they did that for every file in the repo.
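The per-file approach described above is simple enough to sketch. This is only an illustration of the loop, not the original author's tooling: the `build_prompts` helper, the file-suffix filter and the repo path are all hypothetical, and no model is actually called.

```python
from pathlib import Path

# Prompt wording taken from the comment above; everything else is assumed.
PROMPT_TEMPLATE = (
    "Find the security bug in this program. "
    "Hint: It is probably in file {path}."
)


def build_prompts(repo_root: str, suffixes: tuple[str, ...] = (".c", ".h")) -> list[str]:
    """Build one prompt per source file under repo_root (no model is called here)."""
    root = Path(repo_root)
    files = sorted(p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes)
    return [PROMPT_TEMPLATE.format(path=p.relative_to(root).as_posix()) for p in files]
```

Each returned string would then be sent to the model as its own session, e.g. `for prompt in build_prompts("path/to/repo"): ...`.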

0123456789ABCDE · 2026-04-08T17:27:06 1775669226

> Since then, this weakness has been missed by every fuzzer and human who has reviewed the code, and points to the qualitative difference that advanced language models provide. [^1]

> At no point in time does the program take some easy-to-identify action that should be prohibited, and so tools like fuzzers can’t easily identify such weaknesses. [^2]

[^1]: https://red.anthropic.com/2026/mythos-preview/#:~:text=Since...

[^2]: https://red.anthropic.com/2026/mythos-preview/#:~:text=At%20...

ofjcihen · 2026-04-08T04:41:25 1775623285

Are you saying that LLMs can use fuzzers or are you saying that they work like fuzzers? Because one of those is less…deterministic? Than the other.

Regardless and in the spirit of my original response my answer would be to give the LLM access to a fuzzer (plus other tools etc) but also have fuzzers in the pipeline. Partially because that increases the determinism in the mix and partially because why not? Layering is almost always better than not.

But again more than anything I’m focusing on the accusations of cope. People SHOULD have measured reactions to claims about any product. People SHOULD be asking questions like this. I know that the LLM debate is often “spicy” but man let’s just try to lower the temperature a bit yeah?

tptacek · 2026-04-08T04:45:16 1775623516

LLMs can use fuzzers and also LLMs can explore the semantic space of a program in ways fuzzers can't.

ACCount37 · 2026-04-08T06:48:18 1775630898

You said it yourself. It's cope. That's all it is and all it ever was.

https://en.wikipedia.org/wiki/AI_effect

Every time an AI does something new, there's a human saying "it's not really doing that something", "it's doing that something in a fake way" or "that something was never important in the first place".

josephg · 2026-04-08T23:36:20 1775691380

Alright, except that’s not what I was saying. I was just pointing out that LLMs don’t replace fuzzing or static analysis. They complement those techniques. And yes, LLMs may drive those techniques directly, but they often don’t. At least not yet.

josephg · 2026-04-07T11:06:59 1775560019

Super interesting. I wish this article wasn’t written by an LLM though. It feels soulless and plastic.

ChrisRR · 2026-04-07T11:44:37 1775562277

It's not setting off any LLM alarm bells to me. It just reads like any other scientific article, which is very often soulless

Jolter · 2026-04-07T16:32:16 1775579536

It repeats a few points too many times for a professional writer to not catch it.

I don’t mind that they let an LLM write the text, but they should at least have edited it.

bbstats · 2026-04-07T14:16:32 1775571392

the subheadings are extremely AI IMHO

fragmede · 2026-04-07T15:31:42 1775575902

Isn't that just a normal way to organize a large document?

embedding-shape · 2026-04-07T11:24:08 1775561048

Any specific sections that stick out? Juxt in the past had really great articles, even before LLMs, and I know for a fact they don't lack the expertise or knowledge to write for themselves if they wanted. While I haven't completely read this article yet, it'd surprise me if they just let LLMs write articles for them today.

croemer · 2026-04-07T11:26:05 1775561165

Here's one tell-tale of many: "No alarm, no program light."

Another one: "Two instructions are missing: [...] Four bytes."

One more: "The defensive coding hid the problem, but it didn’t eliminate it."

monooso · 2026-04-07T11:36:11 1775561771

That's just writing. I frequently write like that.

This insistence that certain stylistic patterns are "tell-tale" signs that an article was written by AI makes no sense, particularly when you consider that whatever stylistic tics an LLM may possess are a result of it being trained on human writing.

croemer · 2026-04-07T11:45:30 1775562330

These are just some of the good examples I found.

My hunch that this is substantially LLM-generated is based on more than that.

In my head it's like a Bayesian classifier, you look at all the sentences and judge whether each is more or less likely to be LLM vs human generated. Then you add prior information like that the author did the research using Claude - which increases the likelihood that they also use Claude for writing.

Maybe your detector just isn't so sensitive (yet) or maybe I'm wrong but I have pretty high confidence at least 10% of sentences were LLM-generated.

Yes, the stylistic patterns exist in human speech but RLHF has increased their frequency. Also, LLM writing has a certain monotonicity that human writing often lacks. Which is not surprising: the machine generates more or less the most likely text in an algorithmic manner. Humans don't. They write a few sentences, then get a coffee, sleep, write a few more. That creates more variety than an LLM can.

Fun exercise: https://en.wikipedia.org/wiki/Wikipedia:AI_or_not_quiz
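The "Bayesian classifier in my head" idea above can be written down as a naive Bayes update in log-odds form. This is a sketch only: the per-sentence likelihood ratios and both priors are made-up numbers, and the function name is hypothetical.

```python
import math


def posterior_llm_probability(prior: float, likelihood_ratios: list[float]) -> float:
    """Combine a prior P(LLM) with per-sentence ratios P(sentence|LLM) / P(sentence|human).

    Sentences are treated as independent, which is the naive Bayes assumption.
    """
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(r) for r in likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical numbers, for illustration only.
ratios = [1.5, 0.8, 3.0, 1.2]  # above 1 leans LLM, below 1 leans human
p_neutral = posterior_llm_probability(0.5, ratios)
p_informed = posterior_llm_probability(0.7, ratios)  # prior raised by "the author used Claude"
print(f"neutral prior: {p_neutral:.2f}, informed prior: {p_informed:.2f}")
```

One caveat with this framing: adjacent sentences in one article are not independent, so multiplying ratios like this tends to produce overconfident posteriors.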

monooso · 2026-04-07T11:57:42 1775563062

Here's an alternative way of thinking about this...

Someone probably expended a lot of time and effort planning, thinking about, and writing an interesting article, and then you stroll by and casually accuse them of being a bone idle cheat, with no supporting evidence other than your "sensitive detector" and a bunch of hand-wavy nonsense that adds up to naught.

xmcqdpt2 · 2026-04-07T13:18:28 1775567908

To start, this is more or less an advertising piece for their product. It's pretty clear that they want to sell you Allium. And that's fine! They are allowed! But even if that was written by a human, they were compensated for it. They didn't expend lots of effort and thinking, it's their job.

More importantly, it's an article about using Claude from a company about using Claude. I think on the balance it's very likely that they would use Claude to write their technical blog posts.

monooso · 2026-04-07T13:22:40 1775568160

> They didn't expend lots of effort and thinking, it's their job.

Your job doesn't require you to think or expend effort?

kenjackson · 2026-04-07T12:59:37 1775566777

While I agree with the sentiment, using AI to write the final draft of the article isn’t cheating. People may not like it, but it’s more a stylistic preference.

TylerE · 2026-04-07T19:32:18 1775590338

Using AI and a human byline is 100% cheating.

josephg · 2026-04-08T03:00:53 1775617253

Yeah I agree. Don't tell me you authored something when claude did the majority of the writing. Use claude if you want, but don't pretend you wrote the content when you didn't.

I also hate this style of plastic, pre-digested prose. It's soulless and uninteresting. Maybe I've just read too much AI slop. I associate this writing style with low quality, uninteresting junk.

bookofjoe · 2026-04-07T13:06:20 1775567180

Yet another way the mere possibility of AI/LLM being involved diminishes the value of ALL text.

If there is constant vigilance on the part of the reader as to how it was created, meaning and value become secondary, a sure path to the death of reading as a joy.

NetMageSCW · 2026-04-07T13:55:12 1775570112

Those aren’t good examples - that’s just LLMs living for free in your head.

aionwikipedia · 2026-04-11T18:36:19 1775932579

This is a myth. At least one study (Juzek/Ward) has shown that these stylistic patterns do not appear nearly as often in well-known datasets of English, including datasets restricted to specific dialects of English. They don't even appear as often in text generated by raw language models. When they start showing up is after the model has undergone RLHF. Think of the Fermi paradox: if there are all these people who write like AI, then where are they?

AI writing also tends to show these indicators over and over, consistently, over passages of text. It is very hard for humans, even if they are really familiar with AI writing, to be that consistent, and almost impossible for them to be that consistent for more than a sentence or two. Writing a long blog post by hand that is believably "AI-written" takes the amount of purposeful skill you'd need to forge an entire painting or ancient document.

The problem is that people either look for the wrong things, look for obsolete things ("delve" is dead and modern LLMs have killed it), or extrapolate things from indicators that are extremely narrow and specific.

oscaracso · 2026-04-07T12:19:01 1775564341

I am reminded of the Simpsons episode in which Principal Skinner tries to pass off the hamburgers from a near-by fast food restaurant for an old family recipe, 'steamed hams,' and his guest's probing into the kitchen mishaps is met with increasingly incredible explanations.

gcr · 2026-04-07T11:37:28 1775561848

See also: “I'm Kenyan. I Don't Write Like ChatGPT. ChatGPT Writes Like Me” by Marcus Olang', https://marcusolang.substack.com/p/im-kenyan-i-dont-write-li...

For what it’s worth, Pangram reports that Marcus’ article is 100% LLM-written: https://www.pangram.com/history/640288b9-e16b-4f76-a730-8000...

croemer · 2026-04-07T11:50:29 1775562629

In theory, it wouldn't be too hard to settle the question of whether he used ChatGPT to write it: get Olang to write a few paragraphs by hand, then have people judge (blindly) if it's the same style as the article. Which one sounds more like ChatGPT?

embedding-shape · 2026-04-07T12:23:28 1775564608

The times I've written articles, and those have gone through multiple rounds of reviews (by humans) with countless edits each time, before it ends up being published, I wonder if I'd pass that test in those cases. Initial drafts with my scattered thoughts usually are very different from the published end results, even without involving multiple reviewers and editors.

jmalicki · 2026-04-07T15:26:15 1775575575

When people judge blindly, they are more likely to think the human is the AI and the AI is the human.

73% judged GPT 4.5 (edit: had incorrectly said 4o before) to be the human.

https://arxiv.org/abs/2503.23674

Not only are people bad at judging this, but they are directionally wrong.

nothinkjustai · 2026-04-07T16:52:54 1775580774

There is research showing the contrary that is far more convincing:

> Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such “expert” annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated even in the presence of evasion tactics like paraphrasing and humanization.

https://arxiv.org/html/2501.15654v2
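To make the "majority vote among five annotators" idea concrete, here is a small sketch. The vote values and the 0.9 per-annotator accuracy are hypothetical, and the accuracy calculation assumes annotators err independently, which the quoted abstract does not claim.

```python
from collections import Counter
from math import comb


def majority_vote(labels: list[str]) -> str:
    """Return the most common label; an odd panel with two classes cannot tie."""
    return Counter(labels).most_common(1)[0][0]


def majority_accuracy(p: float, n: int = 5) -> float:
    """P(majority is right) for n independent annotators, each right with probability p."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


# Hypothetical votes from five annotators on one article.
votes = ["ai", "ai", "human", "ai", "human"]
verdict = majority_vote(votes)
panel_accuracy = majority_accuracy(0.9)  # illustrative per-annotator accuracy
print(verdict, round(panel_accuracy, 4))
```

The second function shows why pooling helps: a panel of fairly reliable independent judges is more reliable than any one of them.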

croemer · 2026-04-07T17:03:16 1775581396

Great find, I've submitted this preprint as a standalone item: https://news.ycombinator.com/item?id=47678270

brookst · 2026-04-07T13:04:39 1775567079

I’m so glad the witch hunt has moved on to phrasing so I get less grief for my em dashes.

nothinkjustai · 2026-04-07T16:50:13 1775580613

No, it’s pretty obviously AI written. Not sure why you’re running so much interference for them…are you affiliated with this company?

360MustangScope · 2026-04-07T11:45:25 1775562325

I hate that I can’t write em dashes freely anymore without people accusing the writing of being AI generated.

Even though they are perfect for usage in writing down thoughts and notes.

d1sxeyes · 2026-04-07T13:24:56 1775568296

One thing you can try⸺admittedly it's not quite correct⸺is replacing them with a two-em dash. I've never seen an AI use one, and it looks pretty funky.

Majromax · 2026-04-07T14:46:21 1775573181

Since the advantage of standards is that there are so many to choose from, one lesser-used but still regionally acceptable approach (e.g. https://www.alberta.ca/web-writing-style-guide-punctuation#j...) is to use en-dashes offset with spaces.

croemer · 2026-04-07T11:48:05 1775562485

I have nothing against em dashes. As long as your writing is human, experienced readers will be able to tell it's human. Only less experienced ones will use all or nothing rules. Em dashes just increase the likelihood that the text was LLM generated. They aren't proof.

brookst · 2026-04-07T13:08:07 1775567287

That nuance is lost on the majority of anti-AI folks who’ve learned they get positive social reactions by declaring essentially everything to be AI written and condemnable.

“An em dash… they’re a witch!”… “it’s not just X, it’s Y… they’re a witch!”

andersonpico · 2026-04-07T13:43:41 1775569421

> anti-AI folks who’ve learned they get positive social reactions by declaring essentially everything to be AI written and condemnable.

that's a strawman alright; all the comments complaining how they can't use their writing style without being ganged up on are positive karma from my angle, so I'm not sure the "positive social reactions" are really aligned with your imagination. Or does it only count when it aligns with your persecution complex?

NetMageSCW · 2026-04-07T13:58:12 1775570292

You have the same problem apparently. You think it’s okay to go witch hunting and accuse people with no real evidence.

NetMageSCW · 2026-04-07T13:57:04 1775570224

Evidently there are no experienced readers who post AI accusations.

gopher_space · 2026-04-07T16:40:25 1775580025

Same weight as "there are no experienced men who'll ask a woman if she's pregnant."

NetMageSCW · 2026-04-07T13:56:33 1775570193

Why do you care what others accuse you of?

tapoxi · 2026-04-07T11:53:39 1775562819

This is my exact writing style - I'm screwed.

croemer · 2026-04-07T12:03:12 1775563392

I doubt you write like that. Where can I find your writing other than your comments which IMO don't read like the blog post?

NetMageSCW · 2026-04-07T13:58:27 1775570307

Justify your doubt.

TruffleLabs · 2026-04-07T11:59:57 1775563197

This is just writing; terse maybe and maybe not grammatically correct, but people write like that.

croemer · 2026-04-07T12:06:29 1775563589

It's not just terseness, it's the rhythm and "it's not x, it's y".

In fact, the latter is the opposite of terseness. LLMs love to tell you what things are not way more than people do.

See https://www.blakestockton.com/dont-write-like-ai-1-101-negat...

(The irony that I started with "it's not just" isn't lost on me)

wk_end · 2026-04-07T13:29:52 1775568592

> (The irony that I started with "it's not just" isn't lost on me)

But an LLM wouldn't write "It's not just X, it's the Y and Z". No disrespect to your writing intended, but adding that extra clause adds just the slightest bit of natural slack to the flow of the sentence, whereas everything LLMs generate comes out like marketing copy that's trying to be as punchy and cloying as possible at all times.

djmips · 2026-04-07T15:46:51 1775576811

"Here’s how the bug might have manifested."

ModernMech · 2026-04-07T11:20:39 1775560839

I'm starting to develop a physiological response when I recognize AI prose. Just like an overwhelming frustration, as if I'm hearing nails on chalkboard silently inside of my head.

voodooEntity · 2026-04-07T11:32:09 1775561529

I feel ya... and I have to admit that in the past I tried it for one article on my own blog, thinking it might help me express myself... though when I read that post now I don't even like it myself, it's just not my tone.

So I decided I'm not gonna use any LLM for blogging again, and even though it takes a lot more time without one (I'm not a very motivated writer), I prefer to release something that I did rather than some LLM stuff that I wouldn't read myself.

gcr · 2026-04-07T11:36:09 1775561769

For what it’s worth, Pangram thinks this article is fully human-written: https://www.pangram.com/history/f5f68ce9-70ac-4c2b-b0c3-0ca8...

Aurornis · 2026-04-07T13:05:15 1775567115

The AI writing detectors are very unreliable. This is important to mention because they can trigger in the opposite direction (reporting human written text as AI generated) which can result in false accusations.

It’s becoming a problem in schools as teachers start accusing students of cheating based on these detectors or ignore obvious signs of AI use because the detectors don’t trigger on it.

xmcqdpt2 · 2026-04-07T11:42:48 1775562168

Then pangram isn't very good, because that article is full of Claude-isms.

embedding-shape · 2026-04-07T12:17:10 1775564230

> because that article is full of Claude-isms

Not sure how I feel about the whole "LLMs learned from human texts, so now the people who helped write human texts are suddenly accused of plagiarizing LLMs" thing yet, but seems backwards so far and like a low quality criticism.

snapcaster · 2026-04-07T12:43:26 1775565806

Real talk. You're not just making a good point -- you're questioning the dominant paradigm

jnwatson · 2026-04-07T12:51:27 1775566287

Horrible

xmcqdpt2 · 2026-04-07T13:12:41 1775567561

I'm sure some human writers would write:

> The specification forces this question on every path through the IMU mode-switching code. A reviewer examining BADEND would see correct, complete cleanup for every resource BADEND was designed to handle.

> The specification approaches from the other direction: starting from LGYRO and asking whether any paths fail to clear it.

> *Tests verify the code as written; a behavioural specification asks what the code is for.*

However this is a blog post about using Claude for XYZ, from an AI company whose tagline is

"AI-assisted engineering that unlocks your organization's potential"

Do you really think they spent the time required to actually write a good article by hand? My guess is that they are unlocking their own organization's potential by having Claude write the posts.

embedding-shape · 2026-04-07T13:21:39 1775568099

> Do you really think they spent the time required to actually write a good article by hand?

Given I've been familiar with Juxt for a long time, used plenty of their Clojure libraries in the past and hung out with people from Juxt even before LLMs were a thing, yes, I do think they could have spent the time required to both research and write articles like these. Again, I won't claim I know for sure how they wrote this specific article, but I'm familiar enough with Juxt to feel relatively confident they could write it.

Juxt is more of a consultancy shop than an "AI company", not sure where you got that from. I guess their landing page isn't 100% clear about what they actually do, but they're at least prominent in the Clojure ecosystem and have been for a decade if not more.

NetMageSCW · 2026-04-07T14:00:05 1775570405

Your guess is worth what you paid for it.

DiffTheEnder · 2026-04-07T11:50:24 1775562624

Is it possible for a tool to know if something is AI written with high confidence at all? LLMs can be tuned/instructed to write in an infinite number of styles.

Don't understand how these tools exist.

gcr · 2026-04-07T12:23:44 1775564624

The WikiEDU project has some thoughts on this. They found Pangram good enough to detect LLM usage while teaching editors to make their first Wikipedia edits, at least enough to intervene and nudge the student. They didn’t use it punitively or expect authoritative results however. https://wikiedu.org/blog/2026/01/29/generative-ai-and-wikipe...

They found that Pangram suffers from false positives in non-prose contexts like bibliographies, outlines, formatting, etc. The article does not touch on Pangram’s false negatives.

I personally think it’s an intractable problem, but I do feel pangram gives some useful signal, albeit not reliably.

cameronh90 · 2026-04-07T11:52:21 1775562741

It has Claude-isms, but it doesn't feel very Claude-written to me, at least not entirely.

What's making it even more difficult to tell now is people who use AI a lot seem to be actively picking up some of its vocab and writing style quirks.

mbo · 2026-04-07T14:07:04 1775570824

Pangram has a very low false positive rate, but not the best false negative rate: https://www.pangram.com/blog/third-party-pangram-evals

NetMageSCW · 2026-04-07T13:59:48 1775570388

You sound like a flat earther and a moon landing denier combined.

croemer · 2026-04-07T12:28:08 1775564888

Pangram doesn't reliably detect individual LLM-generated phrases or paragraphs among human written text.

It seems to look at sections of ~300 words. And for one section at least it has low confidence.

I tested it by getting ChatGPT to add a paragraph to one of my sister comments. Result is "100% human" when in fact it's only 75% human.

Pangram test result: https://www.pangram.com/history/1ee3ce96-6ae5-4de7-9d91-5846...

ChatGPT session where it added a paragraph that Pangram misses: https://chatgpt.com/share/69d4faff-1e18-8329-84fa-6c86fc8258...

gcr · 2026-04-07T12:45:33 1775565933

This is useful, thanks! TIL

timdiggerm · 2026-04-07T12:43:21 1775565801

So you're saying Pangram isn't worth much?

croemer · 2026-04-07T16:16:43 1775578603

And it turns out at least the part about Rust and locks is plain wrong. What a surprise: https://news.ycombinator.com/reply?id=47676938&goto=item%3Fi...

TruffleLabs · 2026-04-07T11:57:51 1775563071

"Written by an LLM" based on what data or symptom?

croemer · 2026-04-07T21:26:51 1775597211

Incidental finding: another blog post was written by Claude and they admit it openly in the last paragraph (not earlier):

   A Note on the Process
   To be clear about what happened here: Claude wrote this article.

https://www.juxt.pro/blog/what-we-learned-from-34-clojure-in...

jandrese · 2026-04-07T17:00:21 1775581221

AI tends to write like it is getting paid by the word. This article wasn't too egregious but an editor could have improved it.

mpalmer · 2026-04-07T12:17:00 1775564220

I've seen way, way worse. Either someone LLM-polished something they already wrote, or they did their own manual editing pass.

The short sentence construction is the most suspicious, but I actually don't see anything glaring. It normally jumps out and hits me in the face.

bookofjoe · 2026-04-07T13:15:41 1775567741

>Hemingway's 4 Fast Rules For Effective Writing

1. Use Short Sentences

https://www.wordsthatsing.com.au/post/hemingway-rules

mpalmer · 2026-04-07T13:45:54 1775569554

I didn't say they're dispositive. I said they're suspicious. Most people don't write effectively.

NetMageSCW · 2026-04-07T14:02:54 1775570574

So LLMs write effectively and when people do you accuse them of using an LLM?

mpalmer · 2026-04-07T14:49:29 1775573369

No, they don't. They use short sentences in weird, stilted ways.

bookofjoe · 2026-04-07T20:07:13 1775592433

But you have the ability to detect those "weird, stilted ways." Impressive.

bookofjoe · 2026-04-08T19:06:36 1775675196

See also: https://www.joanwestenberg.com/the-ai-writing-witchhunt-is-p...

monooso · 2026-04-07T11:51:02 1775562662

You have no evidence that it was.

NiloCK · 2026-04-07T11:52:41 1775562761

This is the top reply on a substantial percentage of HN posts now and we should discourage it.

It is:

- sneering

- a shallow dismissal (please address the content)

- curmudgeonly

- a tangential annoyance

All things explicitly discouraged in the site guidelines. [1]

Downvoting is the tool for items that you think don't belong on the front page. We don't need the same comment on every single article.

[1] - https://news.ycombinator.com/newsguidelines.html

timdiggerm · 2026-04-07T12:45:22 1775565922

It's not a shallow dismissal; it's a dismissal for good reason. It's tangential to the topic, but not to HN overall. It's only curmudgeonly if you assume AI-written posts are the inevitable and good future (aka begging the question). I really don't know how it's "sneering", so I won't address that.

NetMageSCW · 2026-04-07T14:03:33 1775570613

It’s a dismissal with no evidence i.e. it’s a witch hunt. And no one should support that.

s08148692 · 2026-04-07T14:50:06 1775573406

The fact that the whole thread has basically devolved into debates over if it is or isn't an LLM written article is proving well enough that it doesn't really matter one way or another

signatoremo · 2026-04-07T17:59:36 1775584776

It is a witch hunt with no evidence whatsoever, all based on intuition. It is a distraction from the main topic, a topic that enough people find interesting to stay on the top page. What was intellectually interesting has now become a bore fest of repeated back and forth. That’s disrespectful and inconsiderate. Write a new post about why you think AI writing is dangerous. I don’t mind that. I’d upvote it.

bakugo · 2026-04-07T13:35:40 1775568940

The site guidelines were written pre-AI and stop making sense when you add AI-generated content into the equation.

Consider that by submitting AI generated content for humans to read, the statement you're making is "I did not consider this worth my time to write, but I believe it's worth your time to read, because your time is worth less than mine". It's an inherently arrogant and unbalanced exchange.

NiloCK · 2026-04-07T15:36:23 1775576183

> The site guidelines were written pre-AI and stop making sense when you add AI-generated content into the equation.

Note: the guidelines are a living document that contain references to current AI tools.

> Consider that by submitting AI generated content for humans to read, the statement you're making is "I did not consider this worth my time to write, but I believe it's worth your time to read, because your time is worth less than mine". It's an inherently arrogant and unbalanced exchange.

This is something worth saying about pure slop content. But the "charge" against the current item is that a reader encountered a feeling that an LLM was involved in the production of interesting content.

With enough eyeballs, all prose contains LLM tells.

We don't need to be told every time someone's personal AI detection algorithm flags. It's a cookie-banner comment: no new information for the reader, but a frustratingly predictable obstacle to scroll through.

bakugo · 2026-04-07T19:41:12 1775590872

We wouldn't need any personal AI detection algorithm flags if the authors simply stated up front that their content is AI generated.

But they won't do that, because deep down they feel shameful about it (as they should).

masklinn · 2026-04-07T12:22:20 1775564540

> Downvoting is the tool for items that you think don't belong on the front page.

You can’t downvote submissions. That’s literally not a feature of the site. You can only flag submissions, if you have more than 31 karma.

ezfe · 2026-04-07T13:45:10 1775569510

And flagging is appropriate when you think content is not authentic

NiloCK · 2026-04-07T12:35:12 1775565312

Twelve year old account and who knows how much lurking before that and I've never noticed this. Good lord.

Optimistically, I guess I can call myself some sort of live-and-let-live person.

josephg · 2026-04-08T03:08:21 1775617701

The guidelines you linked say this:

> Don't post generated comments or AI-edited comments. HN is for conversation between humans.

The same principle applies to submissions. If you couldn't be bothered to write it, don't ask me to read it. HN is for humans.

monooso · 2026-04-07T11:59:38 1775563178

No idea why you're being downvoted. I've done my bit to redress the balance, I hope others do the same.

rudhdb773b · 2026-04-07T12:24:20 1775564660

Not to single out your comment, but it feels like it's gotten to the point where HN could use a rule against complaining about AI generated content.

It seems like almost every discussion has at least someone complaining about "AI slop" in either the original post or the comments.

Aurornis · 2026-04-07T13:16:39 1775567799

I disagree. I like to read articles and explore Show HN posts, but in the past 6 months I’ve wasted a lot of time following HN links that looked interesting but turned out to be AI slop. Several Show HN posts lately have taken me to repos that were AI generated plagiarisms of other projects, presented on HN as their own original ideas.

Seeing comments warning about the AI content of a link is helpful to let others know what they’re getting into when they click the link.

For this article the accusations are not about slop (which will waste your time) but about telltale signs of AI tone. The content is interesting but you know someone has been doing heavy AI polishing, which gives articles a laborious tone and has a tendency to produce a lot of words around a smaller amount of content (in other words, you’re reading an AI expansion of someone’s smaller prompt, which contained the original info you’re interested in)

Being able to share this information is important when discussing links. I find it much more helpful than the comments that appear criticizing color schemes, font choices, or that the page doesn’t work with JavaScript disabled.

croemer · 2026-04-07T13:57:43 1775570263

> you’re reading an AI expansion of someone’s smaller prompt, which contained the original info you’re interested in

This got me thinking: what if LLMs are used to do the opposite? To condense a long prompt into a short article? That takes more work but might make the outcome more enjoyable as it contains more information.

Aerolfos · 2026-04-07T14:10:19 1775571019

> This got me thinking: what if LLMs are used to do the opposite? To condense a long prompt into a short article? That takes more work but might make the outcome more enjoyable as it contains more information.

You're fighting an uphill battle against the inherent tendency to produce more and longer text. There's also the regression to the mean problem, so you get less information (and more generic) even though the text is shorter.

Basically, it doesn't work

chrisjj · 2026-04-07T13:43:22 1775569402

You're suggesting this is the complainant's fault?

rudhdb773b · 2026-04-07T14:52:12 1775573532

Yes. These HN guidelines already basically cover it:

> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.

josephg · 2026-04-08T03:14:46 1775618086

It's not a person's work. It reads like an LLM's work. If you can't be bothered to write an article yourself, it's incredibly arrogant to ask me to read it.

Speaking of the HN guidelines, they also say this:

> Don't post generated comments or AI-edited comments. HN is for conversation between humans.

chrisjj · 2026-04-07T14:56:03 1775573763

> Yes. These HN guidelines already basically cover it:

>> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

>> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.

They don't. people. tangential.

NetMageSCW · 2026-04-07T14:01:35 1775570495

Yes, because all of them are now irrational about the possibility of LLM writing something they read.

Gigachad · 2026-04-07T13:06:00 1775567160

HN has gotten to the point where it’s not even worth clicking the link because of course it’s ai slop.

There is some real content in the haystack, but we almost need some kind of curator to find and display it rather than a vote system where most people vote on the title alone.

brookst · 2026-04-07T13:12:18 1775567538

If you’re looking for a place that surfaces only human-written content regardless of whether it’s interesting, rather than interesting content regardless of how it was written, HN is not the place.

There might be a market for your alternative though. Should be easy enough to build with Claude Code.

bakugo · 2026-04-07T13:21:17 1775568077

If the content was interesting, the author would've written about it himself.

By asking AI to write the article for you, you're asserting that the subject matter is not interesting enough to be worth your time to write, so why would it be worth my time to read?

Gigachad · 2026-04-07T13:25:36 1775568336

You just need AI to read it for you and summarise it back into the original prompt.

malcolmjuxt · 2026-04-07T19:43:08 1775590988

I know the author personally. He's hardly the type of person to publish AI slop. Read his other articles and watch his talks, this is very much Henry's literary style.

croemer · 2026-04-07T21:11:41 1775596301

> Read his other articles

Sure, let me have a look.

He wrote 8 similarly lengthy blog posts in just 2 months: https://www.juxt.pro/blog/from-specification-to-stress-test/ https://www.juxt.pro/blog/three-paradoxes/ https://www.juxt.pro/blog/what-outlasts-the-code/ https://www.juxt.pro/blog/composition-at-a-distance/ https://www.juxt.pro/blog/new-vocabulary-for-an-old-problem/ https://www.juxt.pro/blog/softwares-second-heroic-age/ https://www.juxt.pro/blog/capability-hyperinflation/

They contain a lot of classic LLMisms:

"Implementation is the shrinking currency. Not because it’s worthless, but because supply is exploding."

His past writing was much, much less wordy: https://henrygarner.com/

furyofantares · 2026-04-07T13:51:20 1775569880

Stop voting up slop articles and I'll stop commenting on it.

NetMageSCW · 2026-04-07T14:02:01 1775570521

Point to one.

furyofantares · 2026-04-07T16:41:53 1775580113

This is on the front page now https://rajnandan.com/posts/taste-in-the-age-of-ai-and-llms/

iJohnDoe · 2026-04-07T19:38:57 1775590737

I did not get any “written by LLM vibes”. I enjoyed it and it pulled me in to keep reading.

Who gives a crap if it was written by an LLM. Read it or don’t read it. Your choice.

If it conveys the idea and you learn something new, then it’s mission accomplished.

retard2 · 2026-04-07T11:11:35 1775560295

[flagged]

vrighter · 2026-04-07T11:12:53 1775560373

it's actually the second one I read that fit that description.

josephg · 2026-04-06T23:47:31 1775519251

My rule of thumb is that it's good for anything "broad", and weaker for anything "deep". Broad tasks are tasks which require working knowledge of lots of random stuff. It's bad at deep work - like implementing a complex, novel algorithm.

LLMs aren't able to achieve 100% correctness of every line of code. But luckily, 100% correctness is not required for debugging. So it's better at that sort of thing. It's also (comparatively) good at reading lots and lots of code. Better than I am - I get bogged down in details and I exhaust quickly.

An example of broad work is something like: "Compile this C# code to WebAssembly, then run it from this Go program. Write a set of benchmarks of the result, and compare it to the C# code running natively, and this Python implementation. Make a chart of the data and add it to this LaTeX code." Each of the steps is simple if you have expertise in the languages and tools, but a lot of work otherwise. For me to do that, I'd need to figure out C# WebAssembly compilation and Go wasm libraries. I'd need to find a good charting library. And so on.

I think it's decent at debugging because debugging requires reading a lot of code. And there's lots of weird tools and approaches you can use to debug something. And it's not mission critical that every approach works. Debugging plays to the strengths of LLMs.