There isn't a single AI out there that won't lie to your face, reinterpret your prompt, or just decide to ignore it.
When they try to write a doc based on code, there is nothing you can do to prevent them from making up a load of nonsense and pretending it has been thoroughly validated.
Do we have any reason to believe alignment will be solved any time soon?
Why should this be an issue? We are producing more and more correct training data, and at some point the quality will be sufficient. It's not clear to me what argues against this.
We don't expect 100% reliability from humans. Humans will slack off, steal, defraud, harass each other, sell your source code to a foreign intelligence service, or turn your business into a front for international drug cartels behind your back. Some of that is very low probability, but never zero probability. So is it really a problem if we can't reduce the probability to literally zero for AIs either?
You want the AI aligned with writing accurate documentation, not aligned with a goal that's close but wrong, e.g. writing accurate-sounding documentation.