Not talking about this tool, but in general, incorrect LLM-generated documentation can have some value. The developer knows they should write some docs, but they're staring at a blank screen, not sure what to write, so they don't. Then the developer runs an LLM, gets a screenful of LLM-generated docs, notices it's full of mistakes, and starts correcting them. Suddenly: a screenful of half-decent docs.
For this to actually work, you need to keep the quantity of generated docs a trickle rather than a flood; too many, and the developer's eyes glaze over and they miss things, or they just can't be bothered. But a small trickle of errors to correct could actually be a decent motivator to build up better documentation over time.
There isn't a single AI out there that won't lie to your face, reinterpret your prompt, or just decide to ignore your prompt.
When they try to write a doc based on code, there is nothing you can do to prevent them from making up a load of nonsense and pretending it is thoroughly validated.
Do we have any reason to believe alignment will be solved any time soon?
Why should this be an issue? We are producing more and more correct training data, and at some point the quality will be sufficient. To me, it's not clear what speaks against this.
We don't expect 100% reliability from humans. Humans will slack off, steal, defraud, harass each other, sell your source code to a foreign intelligence service, or turn your business behind your back into a front for international drug cartels. Some of that is very low probability, but it's never zero probability. So is it really a problem if we can't reduce the probability to literally zero for AIs either?
You want the AI aligned with writing accurate documentation, not aligned with a goal that's near but wrong, e.g. writing accurate-sounding documentation.