
This is my second attempt learning Rust and I have found that LLMs are a game-changer. They are really good at proposing ways to deal with borrow-checker problems that are very difficult to diagnose as a Rust beginner.

In particular, an error on one line may force you to change a large part of your code. As a beginner this can be intimidating ("do I really need to change everything that uses this struct to use a borrow instead of ownership? will that cause errors elsewhere?") and I found that induced analysis paralysis in me. Talking to an LLM about my options gave me the confidence to do a big change.


n_u's point about LLMs as mentors for Rust's borrow checker matches my experience. The error messages are famously helpful, but sometimes you need someone to explain the why.

I've noticed the same pattern learning other things. Having an on-demand tutor that can see your exact code changes the learning curve. You still have to do the work, but you get unstuck faster.


Strongly agreed. Or ask it to explain the implications of using different ownership models. I love to ask it for options, to play out what-if scenarios. It's been incredibly helpful for learning Rust.

>In particular, an error on one line may force you to change a large part of your code.

There's a simple trick to avoid that: use `.clone()` more and use fewer references.

In C++ you would probably be copying around even more data unnecessarily before optimization. In Rust everything is move by default. A few clones here and there can obviate the need to think about lifetimes everywhere and put you roughly on par with normal C++.

You can still optimize later once you've solved the problem.
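A contrived sketch of what I mean (struct and function names are made up, not from any real codebase):

    #[derive(Clone)]
    struct Config {
        name: String,
        retries: u32,
    }

    fn start_worker(config: Config) {
        // Takes ownership of its own copy; no lifetimes to reason about.
        println!("worker started with {}", config.name);
    }

    fn log_config(config: Config) {
        println!("config: {} retries={}", config.name, config.retries);
    }

    fn main() {
        let config = Config { name: "dev".to_string(), retries: 3 };

        // Passing a clone sidesteps the borrow checker entirely:
        // `config` stays fully owned and usable afterwards.
        start_worker(config.clone());
        log_config(config.clone());

        println!("{} ({} retries)", config.name, config.retries);
    }

Once it works, you can go back and replace the clones that actually show up in a profile with borrows.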


Clone doesn't work when you need to propagate data mutations, which is what you need most of the time.

Another option is to just use cells and treat the execution model similarly to JavaScript, where mutation is limited to specific scopes.
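Roughly like this (a minimal sketch; Counter is just a stand-in type):

    use std::cell::RefCell;
    use std::rc::Rc;

    struct Counter {
        hits: u32,
    }

    fn main() {
        // Shared, mutable state without fighting the borrow checker:
        // Rc gives shared ownership, RefCell moves the borrow check to runtime.
        let counter = Rc::new(RefCell::new(Counter { hits: 0 }));

        let for_handler = Rc::clone(&counter);
        let handler = move || {
            // Mutation is confined to this scope, JS-style.
            for_handler.borrow_mut().hits += 1;
        };

        handler();
        handler();

        println!("hits = {}", counter.borrow().hits);
    }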


I am old, but C is similarly improved by LLMs: build systems, boilerplate, syscalls, potential memory leaks. It will be OK when the Linux graybeards die, because new people can come up to speed much more quickly.

The thing is LLM-assisted C is still memory unsafe and almost certainly has undefined behaviour; the LLM might catch some low hanging fruit memory problems but you can never be confident that it's caught them all. So it doesn't really leave you any better off in the ways that matter.

I don’t code C much; it's my passion side language. LLMs improve my ability to be productive quickly. Not a silver bullet, but an assist.

Almost as well as a human doing it!

Better than a human maybe. But still not good enough to rely on.

I don't see why it shouldn't be even more automated than that, with LLM ideas tested automatically by differential testing of components against the previous implementation.

EDIT: typo fixed, thx


Defining tests that test for the right things requires an understanding of the problem space, just as writing the code yourself in the first place does. It's a catch-22. Using LLMs in that context would be pointless (unless you're writing short-lived one-off garbage on purpose).

I.e. the parent is speaking in the context of learning, not in the context of producing something that appears to work.


I'm not sure that's true. Bombarding code with huge numbers of randomly generated tests can be highly effective, especially if the tests are curated by examining coverage (and perhaps mutation kills) in the original code.
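In sketch form, the kind of harness I mean (the old/new implementations here are toys standing in for the real components, and the input generator is deliberately trivial):

    // Differential testing: hammer the old and new implementations with
    // pseudo-random inputs and assert that their outputs never diverge.

    fn old_impl(xs: &[i64]) -> i64 {
        xs.iter().sum()
    }

    fn new_impl(xs: &[i64]) -> i64 {
        xs.iter().fold(0, |acc, x| acc + x)
    }

    fn main() {
        // Tiny linear congruential generator so the sketch needs no crates.
        let mut seed: u64 = 42;
        let mut next = move || {
            seed = seed.wrapping_mul(6364136223846793005).wrapping_add(1);
            (seed >> 33) as i64 % 1000
        };

        for case in 0..100_000 {
            let len = (next().unsigned_abs() as usize) % 32;
            let input: Vec<i64> = (0..len).map(|_| next()).collect();
            assert_eq!(
                old_impl(&input),
                new_impl(&input),
                "divergence on case {case}: {input:?}"
            );
        }
        println!("no behavioural differences found");
    }

Coverage and mutation analysis then tell you whether the generated inputs are actually exercising the interesting paths, or just the happy one.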

Right, that method is pretty good at finding unintentional behavior changes in a refactor. It is not very well suited for showing that the program is correct which is probably what your parent meant.

That doesn't seem like the same problem at all. The problem here was reimplementing the program in another language, not doing that while at the same time identifying bugs in it.

Conversion of one program to another while preserving behavior is a problem much dumber programs (like compilers) solve all the time.


I'm assuming you meant to type

> I don't see why it *shouldn't be even more automated

In my particular case, I'm learning so having an LLM write the whole thing for me defeats the point. The LLM is a very patient (and sometimes unreliable) mentor.


I think the author is significantly underestimating the technical difficulty of achieving full self-driving cars that are at least as safe and reliable as Waymo. The author claims there will be "26 of the basically identical [self-driving car] companies".

If you recall, there was an explosion of self-driving car efforts from startups and incumbents alike 7ish years ago. Many of them failed to deliver or were shut down. [1][2][3]

Article about the difficulty of self-driving from the perspective of a failed startup[3].

Waymo came out of the Google self-driving car project, which grew out of Sebastian Thrun's entry in the 2005 DARPA Grand Challenge, so they've been working on this for more than 20 years. [4][5]

[1] https://www.cnn.com/2022/10/26/business/ford-argo-ai-vw-shut...

[2] https://en.wikipedia.org/wiki/List_of_predictions_for_autono...

[3] https://medium.com/starsky-robotics-blog/the-end-of-starsky-...

[4] https://stanford.edu/~cpiech/cs221/apps/driverlessCar.html

[5] https://semiwiki.com/eda/synopsys/3322-sebastian-thrun-self-...


But that is the author's point. I don't see many of the same alternatives years later.

They have either shut down, been acquired, or been sold off and then shut down. Even Uber and Lyft had their own self-driving programs, and both of them shut theirs down. Cruise was recently taken off the streets, and not much has been done with it since.

The only ones that have been around for more than 7 years are Comma.ai (which the author, geohot, still owns), Waymo, Tesla, and Zoox, though Zoox ran out of money and is now owned by Amazon.


He founded Comma.ai, so he does understand the problem domain & complexity.

As I understand, Comma.ai is focused on driver-assistance and not fully autonomous self-driving.

The features listed on the Wikipedia page are lane centering, cruise control, driver monitoring, and assisted lane change.[1]

The article I linked to from Starsky addresses how the first 90% is much easier than the last 10% and even cites "The S-Curve here is why Comma.ai, with 5–15 engineers, sees performance not wholly different than Tesla’s 100+ person autonomy team."

To give an example of the difficulty of the last 10%: I saw an engineer from Waymo give a talk about how they had a whole team dedicated to detecting emergency vehicle sirens and acting appropriately. Both false positives and false negatives could be catastrophic so they didn't have a lot of margin for error.

[1] https://en.wikipedia.org/wiki/Openpilot#Features


Speaking as a user of Openpilot / Comma device, it is exactly what the Wikipedia article described. In other words, it's a level 2 ADAS.

My point was, he has more than a naive / "pedestrian-level" (pun?) understanding of the problem domain, since he worked on the Comma.ai project for quite some time, even though the device is only capable of solving maybe 40% of the autonomous driving problem.


The last photo appears to show the view from the author's office in Fort Mason. I didn't know they had offices there; that's quite a nice view of the Bay.

Cool! I'd love to know a bit more about the replication setup. I'm guessing they are doing async replication.

> We added nearly 50 read replicas, while keeping replication lag near zero

I wonder what those replication lag numbers are exactly and how they deal with stragglers. It seems likely that at any given moment at least one of the 50 read replicas is lagging because of a CPU/memory usage spike. Then presumably that would slow down the primary, since it has to wait for the TCP ACKs before sending more of the WAL.


> would slow down the primary since it has to wait for the TCP acks

Other than keeping around more WAL segments, I'm not sure why it would slow down the primary?


If you use streaming replication (ie. WAL shipping over the replication connection), a single replica getting really far behind can eventually cause the primary to block writes. Some time back I commented on the behaviour: https://news.ycombinator.com/item?id=45758543

You could use asynchronous WAL shipping, where the WAL files are uploaded to an object store (S3 / Azure Blob) and the streaming connections are only used to signal the position of the WAL head to the replicas. The replicas then fetch the WAL files from the object store and replay them independently. This is what wal-g does, for a real-life example.

The tradeoffs when using that mechanism are pretty funky, though. For one, the strategy imposes a hard lower bound to replication delay because even the happy path is now "primary writes WAL file; primary updates WAL head position; primary uploads WAL file to object store; replica downloads WAL file from object store; replica replays WAL file". In case of unhappy write bursts the delay can go up significantly. You are also subject to any object store and/or API rate limits. The setup makes replication delays slightly more complex to monitor for, but for a competent engineering team that shouldn't be an issue.

But it is rather hilarious (in retrospect only) when an object store performance degradation takes all your replicas effectively offline and the readers fail over to getting their up-to-date data from the single primary.


There is no backpressure from replication, and streaming replication is asynchronous by default. Replicas can ask the primary to hold back garbage collection (off by default), which will eventually cause a slowdown, but not blocking. Lagging replicas can also ask the primary to hold onto the WAL needed to catch up (again, off by default), which will eventually cause the disk to fill up, which I guess is blocking if you squint hard enough. Both will take a considerable amount of time and are easily averted by monitoring and kicking out unhealthy replicas.

> If you use streaming replication (ie. WAL shipping over the replication connection), a single replica getting really far behind can eventually cause the primary to block writes. Some time back I commented on the behaviour: https://news.ycombinator.com/item?id=45758543

I'd like to know more, since I don't understand how this could happen. When you say "block", what do you mean exactly?


I have to go partly on guesswork here, because it's based on what I could observe at the time. I never had the courage to dive into the actual Postgres source code, but my educated guess is that it's a side effect of the MVCC model.

A combination of: streaming replication; long-running reads on a replica; lots[þ] of writes to the primary. While the read on the replica is running, it will generate a temporary table under the hood (because the read "holds the table open at a point in time"). Something in this scenario leaked the state from the replica to the primary, because after several hours the primary would error out, and the logs showed that it failed to write because the old table was held in place by the replica and the two tables had deviated too far apart in time / versions.

It is seared into my memory because the thing just did not make any sense, and even figuring out WHY the writes had stopped at the primary took quite a bit of digging. I do remember that when the read on the replica was forcefully terminated, the primary was eventually released.

þ: The ballpark would have been tens of millions of rows.


What you are describing here does not match how Postgres works. A read on the replica does not generate temporary tables, nor can anything on the replica create locks on the primary. The only two things a replica can do are hold back transaction log removal and the vacuum cleanup horizon. I think you may have misdiagnosed your problem.

Yeah, you'll definitely want to set things like `max_standby_streaming_delay` and friends to ensure things are bound correctly.

This looks super cool! I don't know much about quantum chemistry. Can this model interactions between molecules?

Theoretically yes, but the method that is currently implemented (Hartree-Fock) is notoriously inaccurate for molecular interactions. For example, it does not predict the van der Waals force between water molecules.

I’m planning to add support for an alternative method called density functional theory which gives better results for molecular interaction.


In quantum chemistry, you decide where the bonds should be drawn. Internally, it's all an electron density field. So yes, you can model chemical reactions, for example by constraining the distance between two atoms, and letting everything else reach an equilibrium.

> wrap a small number of third-party ChatGPT/Perplexity/Google AIO/etc scraping APIs

Can you explain a little bit how this works? I'm guessing the third-parties query ChatGPT etc. with queries related to your product and report how often your product appears? How do they produce a distribution of queries that is close to the distribution of real user queries?


How third parties query your product: for ChatGPT specifically, they open a headless browser, ask a question, and capture the results, i.e. the response and any citations. From there, they extract entities from the response. During onboarding I’m asked who my competitors are, and those competitors are recognized via the entities in the response. For example, if the query is “what are the best running shoes” and the response is something like “Nike is good, Adidas is okay, and On is expensive,” and my company is On, entity recognition against my list of competitors is used to see which ones appear in the response and in which order.

If this weren’t automated, the process would look like this: someone manually reviews each response, pulls out the companies mentioned and their order, and then presents that information.

2) Distribution of queries: this is a bit of a dirty secret in the industry (intentional or not). What you want to do is take snapshots and measure them over time to get a distribution. However, a lot of tools will run a query once across different AI systems, take the results, and call it done.

Obviously, that isn’t very representative. If you search “best running shoes,” there are many possible answers, and different companies behave differently. What better tools like Profound do is run the same prompt multiple times. From my estimates, Profound runs it up to 8 times. This gives a broader snapshot of what tends to show up every day. You then aggregate those snapshots over time to approximate a distribution.

As a side note: you might argue that running a prompt 8 times isn’t statistically significant, and that’s partially true. However, LLMs tend to regress toward the mean and surface common answers over repeated runs, and we found 8 runs to be a good indicator. The level of completeness depends on the prompt (e.g. “what should I have for dinner” vs. “what is good accounting software for startups”); I can touch on that more if you want.
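To make the mechanics concrete, the tallying step looks roughly like this (a hedged sketch: ask_llm is a made-up stand-in for the scraping layer, and the real pipeline uses proper entity recognition rather than plain string matching):

    use std::collections::HashMap;

    // Made-up stand-in for whatever actually fetches the chat response text.
    fn ask_llm(_prompt: &str) -> String {
        "Nike is good, Adidas is okay, and On is expensive".to_string()
    }

    fn main() {
        let brands = ["Nike", "Adidas", "On", "Hoka"];
        let runs = 8; // same prompt repeated to smooth out run-to-run variance

        let mut mentions: HashMap<&str, u32> = HashMap::new();
        for _ in 0..runs {
            let response = ask_llm("what are the best running shoes");
            // Crude matching; real entity recognition handles casing,
            // plurals, product names, and mention order.
            for brand in brands {
                if response.contains(brand) {
                    *mentions.entry(brand).or_insert(0) += 1;
                }
            }
        }

        for brand in brands {
            let count = mentions.get(brand).copied().unwrap_or(0);
            println!("{brand}: mentioned in {count}/{runs} runs");
        }
    }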


As I understand it, in normal SEO the number of unique queries that could be relevant to your product is quite large, but you might focus on a small subset of them ("running shoes", "best running shoes", "running shoes for 5k", etc.) because you assume that those top queries capture a significant portion of the distribution (e.g. perhaps those 3 queries capture >40% of all queries related to running shoe purchases).

Here the distribution is all queries relevant to your product made by someone who would be a potential customer. Short and directly relevant queries like "running shoes" will presumably appear more times than much longer queries. In short, you can't possibly hope to generate the entire distribution, so you sample a smaller portion of it.

But in LLM SEO it seems that assumption is not true. People will have much longer queries that they write out as full sentences: "I'm training for my first 5k, I have flat feet and tore my ACL four years ago. I mostly run on wet and snowy pavement, what shoe should I get?" which probably makes the number of queries you need to sample to get a large portion of the distribution (40% from above) much higher.

I would even guess it's the opposite and the number of short queries like "running shoes" fed into an LLM without any further back and forth is much lower than longer full sentence queries or even conversational ones. Additionally because the context of the entire conversation is fed into the LLM, the query you need to sample might end up being even longer

for example: user: "I'm hoping to exercise more to gain more cardiovascular fitness and improve the strength of my joints, what activities could I do?"

LLM: "You're absolutely right that exercise would help improve fitness. Here are some options with pros and cons..."

user: "Let's go with running. What equipment do I need to start running?"

LLM: "You're absolutely right to wonder about the equipment required. You'll need shoes and ..."

user: "What shoes should I buy?"

All of that is to say, this seems to make AI SEO much more difficult than regular SEO. Do you have any approaches to tackle that problem? Off the top of my head I would try generating conversations and queries that could be relevant and estimating their relevance with some embedding model & heuristics about whether keywords or links to you/competitors are mentioned. It's difficult to know how large of a sample is required though without having access to all conversations which OpenAI etc. is unlikely to give you.


Short answer: it depends, and I don't know. When I was doing some testing with prompts like "what should I have for dinner", adding variations ("hey AI", "plz", etc.) doesn't change the intent much, since AI is really good at pulling out intent. But obviously, if you say "I'm on keto, what should I have for dinner", it's going to ignore things like "garlic, pesto, and pasta noodles", although it pulls a similar response to "what's a good keto dinner". From there we really assume the user knows their customers and what type of prompts led them to ChatGPT. You might've noticed sites asking if you came from ChatGPT; I would take that a step further and ask them to type the prompt they used.

But you do bring a good perspective, because not all prompts are equal, especially with personalization. So how do we solve that problem? I'm not sure. I have yet to see anything in the industry. The only thing that came close was when a security-focused browser extension started selling data to AEO companies; that's how some companies get "prompt volume data".


I see what you are saying: perhaps no matter what the preceding conversation looks like, as long as it doesn't filter out some products via personalized filters (e.g. dietary restrictions), it will always give the same answers. But I do feel the value prop of these AI chatbots is that they allow personalization. And then it's tough to know whether 50% of the users who would previously have googled "best running shoes" now instead ask detailed questions about running shoes given their injury history etc., and whether that changes what answers the chatbot gives.

I feel like without knowing the full distribution, it's really tough to know how many/what variations of the query/conversation you need to sample. This seems like something where OpenAI etc. could offer their own version of this to advertisers and have much better data because they know it all.

Interesting problem though! I always love probability in the real world. Best of luck, I played around with your product and it seems cool.


Question for folks in data science / ML space: Has DuckDB been replacing Pandas and NumPy for basic data processing?


> Our agreement with TerraPower will provide funding that supports the development of two new Natrium® units capable of generating up to 690 MW of firm power with delivery as early as 2032.

> Our partnership with Oklo helps advance the development of entirely new nuclear energy in Pike County, Ohio. This advanced nuclear technology campus — which may come online as early as 2030 — is poised to add up to 1.2 GW of clean baseload power directly into the PJM market and support our operations in the region.

It seems like they are definitely building a new plant in Ohio. I'm not sure exactly what is happening with TerraPower but it seems like an expansion rather than "purchasing power from existing nuke plants".

Perhaps I'm misreading it though.


If history repeats itself ... taxpayers will be footing the bill. Ohio has shown itself to be corrupt when it comes to its nuclear infrastructure. [0] I'm highly confident that politicians are lining up behind the scenes to get their slice of the pie.

[0] https://en.wikipedia.org/wiki/Ohio_nuclear_bribery_scandal


Well, private investment is a great way to avoid subsidy nonsense.


You know that there's no actual private investment in nuclear in the US.

The nuclear industry is indemnified by the taxpayers. Without that insurance backstop, there would be no nuclear energy industry.


Taxpayers are private. They earn money and give some of it to the state.


The weasel wording is strong here. That's like me saying that buying a hamburger will help advance the science of hamburger-making. I'm just trading money for hamburgers. They're trying to put a shiny coat of paint on the ugly fact that they're buying up MWh, reducing the supply of existing power for the rest of us, and burning it to desperately try to convince investors that AGI is right around the corner so that the circular funding musical chairs doesn't stop.

We got hosed when they stole our content to make chatbots. We get hosed when they build datacenters with massive tax handouts and use our cheap power to produce nothing, and we'll get hosed when the house of cards ultimately collapses and the government bails them out. The game is rigged. At least when you go to the casino everyone acknowledges that the house always wins.


what company?


> There is confusion about the less obvious benefits, confusion about how it works, confusion about the dangers (how do I adjust my well honed IPv4 spidey senses?), and confusion about how I transition my current private network

Could you be specific about what the misconceptions are?


I had Copilot produce this for you based on the comments in this discussion (as at just before the timestamp of this comment).

https://copilot.microsoft.com/shares/656dEMHWyFye5cCeicgGv


Interesting that this is getting downvoted. I truly wonder why. One of the things LLMs are good at is summarising and extracting key points. Or should I have gone to the trouble to do this myself - read the entire comment thread and manually summarise - when the person I was replying to hadn’t done that? My comment was meant in good faith: “here’s the info you wanted and how you can easily get them yourself next time”.


1. People come here for discussions with real people. The other night I was at a party and we had a great time playing chess and board games. It would be weird if someone started using stockfish, even if it is a better player. Everything stockfish does, it already knows. It doesn't learn or explore the game-space.

2. The response is still too wordy, generic, and boring. So LLMs are not really better players, at least for now.

3. With LLMs, you can produce a ton of text much faster than it can be read. Whereas the dynamic is reversed for ordinary writing. By writing this by hand, I am doing you a favor by spending more time on this comment than you will. But by reading your LLM output I am doing you a favor by spending more time reading than you did generating.

You could probably get away with using an LLM here by copying the response and then cutting down 90% of it. But at that point it would be better to just restate the points yourself in your own words.


So cheap questions where the answers could be readily had are not downvoted even though the answers to their question are right here in the discussion. Whereas because I did not do the legwork that my correspondent would not do, I am penalised. That’s what I’m hearing.

EDITED TO ADD:

> by reading your LLM output I am doing you a favor by spending more time reading than you did generating

How could my respondent (presumably on whose behalf you are making the argument) possibly be doing me a favour when they asked the question? Is it each of our responsibility to go to some lengths to spoon feed one another when others don’t deign to feed themselves?


And yet the LLM did a better job of disparaging everyone's comments as uninformed, which they are, btw.


You're not offering anything of value. We all can ask some LLM about stuff we want to know. It's like in the past, when someone would post a link to search results as a reply.

