Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Sure, you can simplify these observations into just codegen. But the real observation is not that these models are more susceptible to fail when generating code, but that they are more susceptible to jailbreak-type attacks that most people have come to expect to be handled by post training.

Having access to open models is great, and even if their capabilities are somewhat lower than the closed-source SoTA models, and we should be aware of the differences in behavior.



> more susceptible to jailbreak-type attacks that most people have come to expect to be handled by post training

the keyword here is "more". The big models might not be quite as susceptible to them, but they are still susceptible. If you expect these attacks to be fully handled, then maybe you should change your expectations.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: