
Using “strange tokens” is fairly useful for holding the instructions apart from the learned corpus without confusing the model. This is, for instance, how you inject a LoRA most effectively. Using common language tokens and semantics makes it harder for the model to separate control instructions from the relevant semantic context.
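To make that concrete, here is a rough sketch of the idea, not anything a particular model or vendor documents: wrap the control instructions in rare delimiter strings that are unlikely to show up in ordinary text. The delimiter strings and the `build_prompt` helper are made up for illustration, and whether a given model treats them as distinct tokens depends on its tokenizer and training.

```python
# Hypothetical sentinel strings: any rare sequence unlikely to occur in
# normal text would serve the same purpose.
INSTR_OPEN = "<|zq_instr|>"
INSTR_CLOSE = "<|/zq_instr|>"


def build_prompt(instructions: str, context: str) -> str:
    """Wrap control instructions in rare delimiters, apart from the context."""
    # Refuse context that already contains a delimiter, since that would
    # blur the boundary the delimiters are meant to mark.
    for marker in (INSTR_OPEN, INSTR_CLOSE):
        if marker in context:
            raise ValueError(f"context already contains delimiter {marker!r}")
    return f"{INSTR_OPEN}{instructions}{INSTR_CLOSE}\n{context}"


if __name__ == "__main__":
    print(build_prompt("Summarize in one sentence.", "The quick brown fox ..."))
```

The point is only that the instruction span is marked by something the surrounding prose would never contain, rather than by ordinary words like "Instruction:".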


Sure, this post was about ChatGPT.



