
I'd be interested to see the same kinds of results before and after RLHF training for censorship/"harmlessness", which I've seen cited in a few places as degrading response quality in general.

