
What are people's thoughts on how this could affect static analysis tools? I know they are very different beasts, but they often achieve the same goal. Static analysis tools can be slow, and they report lots of false positives.

I wonder if these models will get good + cheap enough so that people rarely reach for static analysis.


LLMs are much better at using tools than replacing tools. The tools are generally a lot faster than trying to achieve the same result with an LLM.

Using LLM coding tools to stay on top of static analysis tool output works very well, and adding guard rails that enforce zero issues is probably a good idea, just like adding CI checks to make sure everything is clean.

As for false positives, it depends on the tool. I tend to avoid tools that generate mostly noise. Most of these tools allow you to disable rules if they produce a lot of noise. Or you can just tell the LLM to fix all the issues. When it's cheaper to fix things than to argue with the rule, just fix it. That used to be really expensive when you had to do that manually. Now it isn't.

I recently did this to an Ansible code base that I needed to refresh after not touching it for a few years. It had hundreds of ansible-lint issues, mostly deprecation warnings and some other non-fatal warnings. Ten minutes later I had zero. Most of them probably weren't very serious, but it's a form of technical debt. If you have to fix hundreds of warnings manually, you are probably not going to do it. But if you can wave a magic wand and it all goes away, why not? I adjusted the guard rails so it now always runs ansible-lint and fixes any issues. It only takes a few seconds extra.
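The guard rail can be as simple as a lint gate in a CI step. Here's a minimal sketch; the `lint_gate` helper is something I'm making up for illustration, and the only real assumption is that ansible-lint (like most linters) exits with a nonzero status when it finds problems:

```shell
#!/bin/sh
# Minimal CI lint gate: run a linter and fail the job if it reports issues.
# The linter's own nonzero exit status is the entire gate.
lint_gate() {
  lint_cmd="$1"   # e.g. "ansible-lint ." in a real pipeline
  if $lint_cmd; then
    echo "lint: clean"
  else
    echo "lint: issues found, failing the build" >&2
    return 1
  fi
}
```

In a CI job you'd call `lint_gate "ansible-lint ."` and let the nonzero exit fail the step; an LLM agent can be pointed at the same command and told to keep fixing until it passes.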


I've been thinking about this. Static analysis tools can also be much faster and most are fully deterministic, so including them in CI can catch bugs or latent bugs before they have a chance to land.

I maintain a static analysis tool used in Firefox's CI. False positives have to be fixed or annotated as non-problems before you can land a patch in our tree. That means permitting zero positives (false or true), which is a strict threshold. This is a conscious tradeoff; it requires weakening the analysis and accepting some false negatives (missed bugs) in order to keep the signal-to-noise ratio high enough that people don't just ignore it and annotate everything away, or stop running it. Nearly all static analysis tools have to do this balancing act.

AI, as commonly used, is given more leeway. It's kind of fundamental that it must be allowed to hallucinate false positives; that's the source of much of its power. Which means you need layers of verification and validation on top of it. It can be slow, you'll never be able to say "it catches 100% of the errors of this particular form: ...", and yet it catches so much stuff.

Data point: my analysis didn't cover one case that I erroneously thought was unlikely to produce true positives (real bugs), and that was more complex to implement than seemed worth the trouble. Opus or Mythos, I'm not sure which, started reporting vulnerabilities stemming from that case, so I scrambled and extended the analysis to cover the gap. It took me long enough to implement that by the time I had a full scan of the source tree, Claude had already found every important problem the scan reported. The static analysis found several others, and I still honestly don't know whether any of them could ever be triggered in practice.

I still think there's value in the static analysis. Some of those occurrences of the problematic pattern might be reachable now through paths too tricky for the AI to construct. Some of them might turn into real problems when other code changes. It seems worth having fixes for all of them now for both possibilities, and also for the lesser reason of not wanting the AI to waste time trying to exploit them. At the same time, clearly the cost/benefit balance has shifted.

They could also team up: if I relax my standards and allow my analysis to write an additional warnings report of suspected problems, with the clear expectation that they might be false alarms, then I could feed that list to an AI to validate them. Essentially, feed slop to the slop machine and have it nondeterministically filter out the diamonds in the rough.
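Concretely, that handoff could be a small pipeline: the relaxed analysis dumps suspected problems one per line, and each one gets handed to an LLM CLI for triage. A sketch, where the warnings format and the prompt are invented, and the LLM command is a parameter (e.g. `claude -p`, Claude Code's non-interactive print mode):

```shell
#!/bin/sh
# Sketch: feed each suspected problem from a relaxed static analysis run
# to an LLM for validation. The warnings file has one finding per line;
# llm_cmd is whatever CLI answers a prompt (e.g. "claude -p").
triage_warnings() {
  warnings_file="$1"
  llm_cmd="$2"
  while IFS= read -r finding; do
    $llm_cmd "Is this static-analysis warning a real bug or a false alarm? $finding"
  done < "$warnings_file"
}
```

The analyzer stays deterministic and exhaustive; the LLM only does the fuzzy filtering at the end, so a hallucinated "false alarm" verdict costs you a missed warning, not a broken build.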

Food for thought...


Not everyone is on Mac. In fact, most people use Windows. So Safari and Ladybird are out of the question, that's two gone.

Sounds like the author could have used just about any laptop in the world and it would serve him well.

So, what's the point of the article?


Scan QR code -- you don't have our "captcha app" installed, so you're automatically redirected to the Play Store -- download malware because of Google Play's horrible screening -- profit

I must not be the first one to think of this, right?

Right???


Does it hurt Google if that happens? No, not really, unless it happens a lot and one of the victims happens to be a US senator or something. The value of the control this gives them, if adopted widely, is immeasurable, not to mention the ad-targeting value of identifying more people across devices.

Hey at least in September they're going to stop you from installing F-Droid. For your safety, citizen!

Yeah, idiots would fall for it.

Both Google and Apple need a much higher bar of certification for anything that's allowed to prompt an install. Either you're already big (and can easily afford to pay for some human time to verify), or you're a manufacturer selling something that has an associated app (which, again, implies you're reasonably big and can afford to pay for verification).

You're neither? Get lost. If somebody types in the name of the app, fine, but the user has to find it themselves.


People already complain about the level of control Apple has over apps and you want there to be much more control? That’s never going to happen.

> Backlogs are cleared faster than new items are added

Totally depends on what kind of product and codebase.

Last time I checked, the number of open issues in the Claude Code repo had increased.

And I have seen tons of tickets that have been open for years. Not because they're technically hard or anything; an intern could do them. Those tickets aren't closed because nobody wants to deal with what comes after.


> Last time I checked, the number of open issues in Claude Code repo has increased.

The Claude Code repo's bug reports are a mishmash of complaints about prompt output, backend responses, documentation updates, browser extensions, etc.

Still, during the last week the repository reports ~2k closed issues vs ~1.3k new issues.

https://github.com/anthropics/claude-code/pulse


I do need to point out that not all meetings are equal, and the "hypocrisy" you are seeing may come from different groups of people.

I am against built-in VPNs for the same reason I am against this. There is nothing novel or cutting-edge about them. Any browser could have done it back in 2003, but they didn't, for a reason.

Of course, it's not like any of this matters in the end.


Seems a very competitive market, unlike...

Makes me wonder -- does it make any economic sense for a theater to have screenings before 2pm on a Tuesday? I get that some people can afford the leisure, but I'm almost certain the theater loses money on those showings.

For the first few weeks of a film's release, all of the ticket sales go to the studio. Pretty much the only revenue for the theatre is popcorn, candy, and soda.

Since most theatres have gone full digital, the "projector" won't show the film if there have been no tickets sold. That eliminated the game of buying one ticket and then sneaking in to see a few more movies.


Why do you have projector in scare quotes? And what makes you think they don't screen the movie? Exhibitors have a contractual obligation to show movies a certain number of times a week, and the media players that run them show receipts to the studios ... it would be surprising if they didn't actually do that simply because nobody bought a ticket.

Probably just signaling that it's not a film projector, though in actuality it is still a projector. One might assume digital movies are shown on a giant display, but that's cost-prohibitive at that size; it's still a blank screen with light projected onto it.

How much extra does it cost to show a movie that nobody shows up to? It seems like the majority of the cost is rent and salaries. If employees are already working the showings that are expected to draw people, and the building is already being paid for, then they may as well use all their screens just in case, since the marginal cost is low.

Especially since someone may walk up willing to buy into the movie even though it started ten minutes ago. So I think I've answered my own question above: they just show everything no matter what the sales look like.

For theaters with reserved seats, if you don't buy early, you end up with the worst seats.

That said, for "general admission" theaters, if you want to get good seats, you'd have to show up early and waste time watching all those trailers.

