crooked-v on Oct 19, 2023, on: The Geometry of Truth: Do LLM's Know True and Fals...

I'd be interested to see the same kinds of results before/after RLHF training for censorship/"harmlessness", which I've seen pointed to in a few places as degrading the quality of responses in general.