What I mean is basically looking at the last (few) messages in the context, translating that to a RAG query, query your embeddings database + BM25 lookup if desired, and if you find something relevant inject that right before the last message in the context.
It’s pretty common in a lot of agents, but I don’t see a way to do that with Claude Code.
I'm not familiar with Claude's architecture, but I'd be surprised if it doesn't index your codebase for semantic search with the explore feature it has. How else would they find context? They already have a semantic search tool -- which is rag.
Cursor and Windsurf both use semantic search via embeddings.
You can get semantic search in Claude Code using this unofficial plugin: https://github.com/zilliztech/claude-context - it's built by and uses a managed vector database called Zilliz Cloud.
It’s surprisingly fast to generate embeddings. I don’t think it’s a UX issue as much as it’s that Anthropic themselves don’t offer any embeddings API (they only have one internally, but publicly recommend Cohere).
They do use RAGs a lot for their desktop app, their projects implementation make a lot of use of it.
It’s pretty common in a lot of agents, but I don’t see a way to do that with Claude Code.