This was a great post, one of the best I've seen on this topic on HN.
But why is the cost never discussed or disclosed in these conversations? I feel like I'm going crazy, there is so much written extolling the virtues of these tools but with no mention of what it costs to run them now. It will surely only get more expensive from here!
> But why is the cost never discussed or disclosed in these conversations?
And not just the monetary cost of accessing the tools, but the amount of time it takes to actually get good results out. I strongly suspect that even though it feels more productive, in many cases things just take longer than they would if done manually.
I think there are really good uses for LLMs, but I also think that people are likely using them in ways that feel useful, but end up being more costly than not.
Indeed, most of us are probably limited by what our companies let us use. And not everyone can afford to use AI tooling on their own time without thinking about the cost, assuming you want to build something your company can't claim as its own IP.
The current realistic lower bound for actual work is the $100/€90/month Claude Max ("5x") plan. It allows roughly enough usage for a typical working month (4.25 x 40-50h). "Single-threaded", interactive usage with normal human breaks, sort of.
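For a sense of scale, here is that arithmetic spelled out. A minimal sketch using only the figures above ($100/month, 4.25 weeks, 40-50 hours a week); the function name is made up for illustration.

```python
def plan_cost_per_hour(monthly_price, weeks_per_month, hours_per_week):
    """Effective hourly cost of a flat-rate plan over one working month."""
    return monthly_price / (weeks_per_month * hours_per_week)

for hours_per_week in (40, 50):
    hours_per_month = 4.25 * hours_per_week
    rate = plan_cost_per_hour(100, 4.25, hours_per_week)
    print(f"{hours_per_week}h/week -> {hours_per_month:.1f}h/month, ${rate:.2f}/hour")
```

So the flat plan works out to roughly half a dollar per working hour, as long as you stay inside the quota windows.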
There are two usage quota windows to be aware of: 5h and 7d. I use https://github.com/richhickson/claudecodeusage (Mac) to keep track of the status. It shows green/yellow/red and a percentage in the menu bar.
The first time I worked the way the article suggests, I used up my monthly allowance in a day.
Apparently, out of the 3-5k people with access to our AI tools, fewer than a handful of us are REALLY using them. Most are asking questions in the chatbot style.
Anyway, I had to ask my manager, the AI architect, and the Tooling Manager for approval to increase my quota.
I asked everyone in the chain how many dollars my allocation was equivalent to and how much the increase was, and no one could tell me.
Honestly, the costs are so minimal and vary wildly relative to the cost of a developer that it's frankly not worth the discussion...yet. The reality is that costs are going to keep swinging widely until there is a commonly agreed-upon way to use these tools.
> Honestly, the costs are so minimal and vary wildly relative to the cost of a developer that it's frankly not worth the discussion...yet
Is it? Sure, the chatbot style maxes at $200/month. I consider that ... not unreasonable ... for a professional tool. It doesn't make me happy, but it's not horrific.
The article, however, explicitly pans the chatbot style and extols the API style being accessed constantly by agents, and that has no upper bound. Roughly $10 per million tokens, roughly $10 per 1K web searches, and so on.
This doesn't sound "minimal" to me. This sounds like every single "task" I kick off is $10. And it can kick those tasks and costs off very quickly in an automated fashion. It doesn't take many of those tasks before I'm paying more than an actual full developer.
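To put rough numbers on it, here is a back-of-the-envelope sketch. Only the ~$10 per million tokens rate comes from the figures above; the tokens-per-task, tasks-per-day, and working-days values are made-up assumptions, so plug in your own.

```python
PRICE_PER_MILLION_TOKENS = 10.0  # USD, the rough rate quoted above

def task_cost(tokens_per_task, price_per_million=PRICE_PER_MILLION_TOKENS):
    """Cost in USD of one agent task that consumes the given number of tokens."""
    return tokens_per_task / 1_000_000 * price_per_million

def monthly_cost(tokens_per_task, tasks_per_day, working_days=21):
    """Cost in USD of running the same kind of task every working day of a month."""
    return task_cost(tokens_per_task) * tasks_per_day * working_days

# Hypothetical workload: 1M tokens per task, 40 automated tasks a day.
print(task_cost(1_000_000))         # 10.0
print(monthly_cost(1_000_000, 40))  # 8400.0
```

Under those assumed numbers the bill is $8,400 a month, which is the scale at which the comparison to a developer's salary starts to matter.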
It's worse than that: BF6's anticheat is kernel-level and requires the Windows-only version of Secure Boot to be enabled, at least on my motherboard. There is no way I'm going to faff about with my BIOS when rebooting just to play this game.
I don't know how EFI boot works, but I am running a dual-boot gaming PC and I have both Microsoft's and my own personal Secure Boot keys loaded (for Linux and GRUB).
I boot my own signed bootloader (GRUB), from which I can also boot Windows. Windows shows it is in Secure Boot mode and it works fine with BF6 for me.
But I have a feeling this allows users to run some bootkit/rootkit and bypass any of those kernel level anti-cheats. Maybe I'm wrong and EFI handover to Windows clears all the memory, but I somehow doubt it.
I play around with a Guix install in a VM, and with less than half of my system resources, a `guix pull` with a `guix system reconfigure ...` takes about 10 minutes. That said, if a pull happens to include a kernel update it can take much longer. I think pinning large packages (like the kernel) that may not need incidental updating is key to keeping pull times lower.
10 minutes is still an order of magnitude slower than a nixos-rebuild if you're hitting the cache, and there's no issue with forgetting to rebuild the guix profile. If you include nonguix (as is required for the 99% of users who need nonfree blobs) you get the 30-50 minutes as described in the article. I believe Nix-the-language being lazily evaluated is a big part of this; you can even inadvertently evaluate nixpkgs multiple times and not notice.
>(as is required for the 99% of users who need nonfree blobs)
If you forgo the built-in WiFi, the ThinkPad T440p (Haswell) works fine without any blobs (speaking from experience). I think all newer gens need iGPU blobs, sadly, but I wanted to point out a viable middle ground between modern Nvidia gamerware setups and Librebooted X200s that can barely browse the web.
Brand loyalty might matter when the cost of a good is relatively low and the availability high. I can basically choose between Coke or Pepsi anywhere, and they cost about the same, so why not go with my favorite?
For airlines, availability with a preferred carrier is not guaranteed, and prices can vary wildly. Do I have so much brand loyalty that I will pay perhaps 2x the cost? Like most people, I wouldn't.
In terms of availability and cost, LLM providers are much closer to Coke than to an airline.
Not entirely true, there are other PKE and DSA algorithms that were/are a part of the competition that used problems not related to lattices. However, the lattice-based options were often among the fastest and smallest.
I know you're kidding, but for the benefit of the class: isogeny schemes were pulled when their best candidate design turned out to be breakable with a Python script, owing to obscure non-cryptographic mathematical research from the 1990s.
The traditional, elegant method of a more civilized age:
Last on the program were Len Adleman and his computer, which had accepted a challenge on the first night of the conference. The hour passed; various techniques for attacking knapsack systems with different characteristics were heard; and the Apple II sat on the table waiting to reveal the results of its labors. At last Adleman rose to speak mumbling something self-deprecatingly about “the theory first, the public humiliation later” and beginning to explain his work. All the while the figure of Carl Nicolai moved silently in the background setting up the computer and copying a sequence of numbers from its screen onto a transparency. At last another transparency was drawn from a sealed envelope and the results placed side by side on the projector. They were identical. The public humiliation was not Adleman’s, it was knapsack’s.
W. Diffie, The first ten years of public-key cryptography, Proceedings of the IEEE, vol. 76, no. 5, pp. 560-577, May 1988
AFAIK, only SIDH-like schemes that expose auxiliary points are broken, so other schemes like CSIDH may still have a chance?
https://issikebrokenyet.github.io/
I was at a conference with some of these folks recently and they stated some glimmer of hope remains for repairing isogeny-based crypto. I guess we'll see.
It'll likely be a lot of work to get going, but it might contain some valuable hints that I had to search for through mailing lists and reading the QEMU source.
I remember needing a semi-custom kernel (maybe) and (I think) the rust version of virtiofsd.
I've implemented a toy version of a 3+ party MPC protocol for graduate school, specifically private set intersection. Would you mind sharing what kind of MPC protocols you design and, if you can, for what types of applications? I don't often see this discussed on HN and my curiosity is piqued!
Two-party set intersection and variants (intersection-sum, etc.), federated learning (secure aggregation) and its variants, and several things that are not yet public. I also did some work on anonymous trust tokens, which is kind of like a generalization of privacy pass that is meant to replace cookies for conveying e.g. whitelist/blacklist information. For the most part my work involves companies doing some kind of statistical analysis of joint data sets while maintaining some privacy constraint. Some of the work involves analyzing ads effectiveness, some involves public health, some involves machine learning, and there is a long tail of obscure applications that were deployed as a one-off. Resource constraints are the biggest technical challenge, but a bigger problem I and the rest of the people I work with face is lack of awareness or poor understanding of MPC (people often assume it is just a variant of DP, or that it is a blockchain something or other, or that it is totally impractical, etc.).
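For anyone who hasn't seen two-party set intersection before, here is a toy sketch of one classic approach (commutative blinding in the Diffie-Hellman style). This is an illustration of the general idea, not any protocol described above: the group, the hash-to-group mapping, and every function name are assumptions chosen for brevity, both parties run inside one function, and the parameters are nowhere near secure.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # a Mersenne prime; far too small and structured for real use

def hash_to_group(item: str) -> int:
    """Map an item to a nonzero element of Z_P^* (toy mapping)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def keygen() -> int:
    """Pick a secret exponent coprime to P-1 so that blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2  # uniform in [2, P-2]
        if math.gcd(k, P - 1) == 1:
            return k

def blind(items, key):
    """Hash each item into the group and raise it to the secret exponent."""
    return [pow(hash_to_group(x), key, P) for x in items]

def reblind(values, key):
    """Apply a second secret exponent to already-blinded values."""
    return [pow(v, key, P) for v in values]

def psi(alice_items, bob_items):
    """Simulate both parties locally; returns what Alice learns (the intersection)."""
    alice_items = list(alice_items)
    a, b = keygen(), keygen()

    # Alice -> Bob: H(x)^a for each of her items.
    alice_blinded = blind(alice_items, a)

    # Bob -> Alice: H(x)^(ab) in the same order, plus his own H(y)^b, shuffled.
    alice_double = reblind(alice_blinded, b)
    bob_blinded = blind(bob_items, b)
    secrets.SystemRandom().shuffle(bob_blinded)

    # Alice: compute H(y)^(ba) and match against her double-blinded values.
    bob_double = set(reblind(bob_blinded, a))
    return {x for x, v in zip(alice_items, alice_double) if v in bob_double}

print(psi(["ann", "bob", "cat"], ["cat", "dan", "bob"]))
```

Because exponentiation commutes, an item held by both sides ends up as the same double-blinded value, while items held by only one side stay hidden behind the other party's exponent.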
This is super exciting for me, I am very interested in MPC/PSI but I haven't been introduced to much about it outside of academia. A ton of potential applications obviously but limited by computational power, as I understand it. Would you mind sharing what company(ies) you work with/for? If you can't or don't want to disclose publicly you can email me: kyoji1@gmail.com or jowens17@fau.edu. I would love to hear more!
Anything worthwhile in fully homomorphic encryption yet? I keep seeing the tools get faster but security is still relatively unknown compared to modern symmetric/asymmetric ciphers. There are also several interesting papers on anonymous/garbled circuit evaluation that I'm assuming will lead to even better untrusted third-party computation services. What I'm waiting for is FHE/circuits/something that can selectively decrypt some of their own outputs.
FHE security is reasonably well understood but not as well understood as EC or RSA/DH security. For the most part today's FHE systems are all based on the (R)LWE problem and the hardness of that problem is not in doubt for the right parameter choices (though choosing the right parameters is a careful balancing act).
It is unlikely (in my opinion) that "true" FHE applications will be deployed any time soon, but "leveled" FHE applications are already being deployed for a small number of levels (e.g. 2). Beyond quartic functions the performance is probably going to be too much of a problem for most applications. Homomorphic encryption in general is commonly used as a building block in larger MPC systems and you will probably see more widespread use of leveled FHE as such a building block too.
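The link between "2 levels" and "quartic" is that each level roughly corresponds to one round of ciphertext-by-ciphertext multiplication, and each round can at most double the polynomial degree. Here is a plain-Python sketch that only does the bookkeeping (no encryption involved; the `Tracked` class is made up for illustration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tracked:
    """A plain value plus bookkeeping: multiplicative depth and polynomial degree."""
    value: int
    depth: int = 0   # rounds of multiplication used so far
    degree: int = 1  # degree in the fresh inputs

    def __mul__(self, other):
        # A multiplication consumes one level and adds the degrees.
        return Tracked(self.value * other.value,
                       max(self.depth, other.depth) + 1,
                       self.degree + other.degree)

    def __add__(self, other):
        # Additions are treated as free: no extra level, degree is the max.
        return Tracked(self.value + other.value,
                       max(self.depth, other.depth),
                       max(self.degree, other.degree))

x = Tracked(3)
x2 = x * x             # depth 1, degree 2
x4 = x2 * x2           # depth 2, degree 4 (quartic)
naive = x * x * x * x  # ((x*x)*x)*x: same value, but depth 3

print(x4.depth, x4.degree, naive.depth, naive.degree)
```

The same quartic costs 2 levels when squared twice and 3 when multiplied left to right, which is why the circuit layout matters as much as the function being computed.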
As for selectively decrypting outputs, that sounds like functional encryption and it is still an active area of research (see also obfuscation, which was a hot topic a few years ago). I doubt you will see practical applications for a very long time.