Yes, you can be the guy at a social media company who says "perhaps earning a few extra billion in revenue this way is bad for children," but the executives are just going to replace you.
To be fair, YouTube isn't really about the comments and the discussion. The comments are mostly there to give feedback to the creators and to serve as a signal to the YouTube algorithm.
> I can imagine a future in which some or even most software is developed by witches, who construct elaborate summoning environments, repeat special incantations (“ALWAYS run the tests!”), and invoke LLM daemons who write software on their behalf.
This sort of prompting is only necessary now because LLMs are janky and new. I might have written this in 2025, but now LLMs are capable of saying "wait, that approach clearly isn't working, let's try something else," running the code again, and revising their results.
There's still a little jankiness, but I'm confident LLMs will just keep getting better at metacognitive tasks.
UPDATE: At this very moment, I'm using a coding agent at work and reading its output. It's saying things like:
> Ah! The command in README.md has specific flags! I ran: <internal command>. Without these flags! I missed that. I should have checked README.md again or remembered it better. The user just viewed it, maybe to remind me or themselves. But let's first see what the background task reported. Maybe it failed because I missed the flags, or passed because the user got access and defaults worked.
I'm concerned that developing better metacognition is really just throwing more of a finite resource at the problem. We don't have unlimited compute or unlimited (V)RAM, so there must be a wall here. If it could be demonstrated that this improved metacognition was coming without a matching increase in resource utilization, I would find these improvements much more convincing... but as things stand, we're very much not there.
(There may be an argument here re: the move from dense to MoE models, but all the research I'm aware of suggests that MoE models are not dramatically more efficient than dense ones - i.e., active parameter count is not the overriding factor, and total parameter count is still extremely important, though the relationship does seem to roughly follow a power law.)
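For concreteness, the active-vs-total gap in a real MoE model is large but not the whole story. Mixtral 8x7B, for example, routes each token through 2 of 8 experts per layer, activating roughly 13B of its ~47B total parameters; back-of-the-envelope:

```python
# Rough arithmetic using Mixtral 8x7B's published figures (approximate):
# 2 of 8 experts are active per token per layer.
total_params = 47e9   # ~47B total parameters
active_params = 13e9  # ~13B parameters activated per token

# prints "active fraction: 28%"
print(f"active fraction: {active_params / total_params:.0%}")
```

So per-token compute drops to roughly a quarter, but all ~47B parameters still have to sit in (V)RAM, which is exactly the "total parameter count still matters" point.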
I enjoyed how it complained about how unclear and ambiguous Markdown’s syntax for bold and italic text supposedly is, then showed an HTML translation with <strong> and <em>.
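To be fair, the complaint isn't entirely baseless: emphasis delimiters in Markdown really do need careful disambiguation. A toy sketch (hypothetical and deliberately naive, nothing like a real CommonMark parser) shows how a regex translation goes wrong on nested markers:

```python
import re

def emphasis_to_html(text):
    # Toy converter: handle ** before * so "**bold**" isn't eaten
    # as two stray "*" markers. Not CommonMark-compliant.
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text

# The easy case works:
# prints "<strong>bold</strong> and <em>italic</em>"
print(emphasis_to_html("**bold** and *italic*"))

# But "***both***" is genuinely ambiguous, and the toy converter
# emits mis-nested HTML: prints "<strong><em>both</strong></em>"
# (CommonMark resolves it as <em><strong>both</strong></em>).
print(emphasis_to_html("***both***"))
```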
People are still trying to figure out how to use AI. Right now the meme is it's used by juniors to churn out slop, but I think people will start to recognize it's far more powerful in the hands of competent senior devs.
It actually surprised me that you can use AI to write even better code: tell it to write a test to catch the suspected bug, then tell it to fix the bug, then have it write documentation. Maybe also split out related functionality into a new file while you're at it.
I might have skipped all that pre-AI, but now it takes 15 minutes. And the bonus is that more understandable code lets the AI fix even more bugs. So it could actually become a virtuous cycle: use AI to clean up debt, which makes more of the code understandable, which lets AI do more.
In fact, right now, we're selling technical debt cleanup projects that I've been begging for for years as "we have to do this so the codebase will be more understandable by AI."
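That test-first loop, sketched in miniature (a hypothetical example; the function, names, and bug are invented for illustration): suppose a pagination helper silently dropped the last partial page. Step one is having the agent write a test that reproduces the suspected bug; step two is fixing the function until it passes:

```python
def page_count(total_items, page_size):
    """Number of pages needed to show `total_items` items."""
    # Fixed version: round up instead of truncating, so a partial
    # final page is counted. (The buggy version was total // size.)
    return -(-total_items // page_size)  # ceiling division

def test_partial_last_page_is_counted():
    assert page_count(10, 3) == 4  # 3 full pages + 1 partial page
    assert page_count(9, 3) == 3   # exact multiple: no extra page
    assert page_count(0, 3) == 0   # nothing to show

test_partial_last_page_is_counted()
```

The test stays in the suite afterward, which is part of why the cycle compounds: the next AI session has that much more verified ground to stand on.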
Having worked on many long-lived projects for 5+ years at big firms, I think there's an aspect of project management that is a dark art, and it will conflict with the hopes & dreams pinned on AI.
Developer productivity is notoriously difficult to measure. Even improvements in feature velocity, cadence, or volume are rarely noticed & acknowledged by users for long. They will always complain about speed and somehow notice slowdowns (and invent them in their heads as well).
I once joined a team that was in crisis: they hadn't been able to ship for 6 months due to outages. We stabilized production, put in tests, introduced a better SDLC, and started shipping every 1-2 weeks. I swear to you it was not more than a few months before stakeholders were whinging about velocity again. You JUST had zero, give me a break.
If you get a 3x one-off boost by adopting AI and then that's the new normal, you'll be shocked how little they pat you on the back for it. Particularly if some of that 3x is spent on tickets to "make the code easier for AI to understand", testing, and low-priority tickets in the backlog no one had bothered doing previously (I've seen a lot of these anecdotes). And god help you if your velocity slips after that 3x boost, they will notice the hell out of that.