Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

LLMs don't have any distinction between instructions & data. There's no "NX" bit. So if you use a local LLM to process attacker-controlled data, it can contain malicious instructions. This is what Simon Willson's "prompt injection" means: attackers can inject a prompt via the data input. If the LLM can run commands (i.e. if it's an "agent") then prompt injection implies command execution.


>LLMs don't have any distinction between instructions & data

And this is why prompt injection really isn't a solvable problem on the LLM side. You can't do the equivalent of (grep -i "DROP TABLE" form_input). What you can do is not just blindly execute LLM generated code.


NX bit doesn’t work for LLMs. Data and instruction tokens are mixed up in higher layers and NX bit is lost.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: