We don’t expect 100% reliability from humans-humans will slack off, steal, defra...

Xss3 · 2025-11-16T16:58:00 1763312280

Humans have incentives to not do those things. Family. Jail. Money. Food. Bonuses. Etc.

If we could align an AI with incentives in the same way we can a person then youd have a point.

So far alignment research is hitting dead ends no matter what fake incentives we try to feed an AI.

aswegs8 · 2025-11-19T11:29:43 1763551783

Can you remind me of the link between alignment and writing accurate documentation? Honestly don't understand how they are linked.

Xss3 · 2025-11-21T12:21:43 1763727703

You want the ai aligned with writing accurate documentation, not aligned with a goal thats near but wrong, e.g. writing accurate sounding documentation.