
One of the topics he briefly touches on is indexing for vector-similarity-based retrieval (essentially search).

Has anyone on HN tried replacing systems like Elasticsearch with an LLM-based index? Curious.

One of the systems at my startup is an Elasticsearch-based search over a large corpus of structured data that contains larger text fields.
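
For anyone unfamiliar with what vector-similarity retrieval means in practice, here is a minimal sketch: each document is represented by an embedding vector, and a query is answered by ranking documents by cosine similarity to the query vector. The toy two-dimensional vectors and the names `cosine_similarity` and `top_k` are made up for illustration; real embeddings would come from a model and have hundreds of dimensions, and a production system would use an approximate nearest-neighbor index rather than this brute-force scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Brute-force search: score every document, return the k best (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy embeddings, not output from any real model.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.1], docs, k=2)
print([doc_id for doc_id, _ in results])
```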



Yes, lots of folks are doing this, and currently it looks like combining results from an LLM-based index and a standard text retrieval index (e.g. one using BM25) may beat either alone. Note that you can add and search LLM-derived vectors from within Elasticsearch: look up 'dense vector search' in the docs. Check out the Haystack conference next week for lots of discussion of current practices: https://haystackconf.com/2023/
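
One simple way to combine the two result lists is reciprocal rank fusion (RRF). To be clear, this is just one common fusion technique, not necessarily what any particular system uses. The sketch below assumes you already have two ranked lists of document IDs, one from BM25 and one from the vector index; the IDs are hypothetical, and `k=60` is a commonly used default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical result lists, best match first.
bm25_hits = ["doc1", "doc2", "doc3"]
vector_hits = ["doc3", "doc1", "doc4"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

RRF only looks at ranks, so it sidesteps the problem that BM25 scores and cosine similarities are on different scales.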


Thanks!

I've looked at Haystack a little bit and was wondering how involved it would be to set up for a research spike.


The LLM embedding approach has given me much better search results than Elasticsearch.

Cost is definitely an issue.



