
In my experience, the cube extension is unusable for >10M x 128D vectors without PCA. I'm using Faiss now with ~500M vectors, and it works great!


How many dimensions are you using with Faiss at 100M+ vectors? I'm currently looking for a solution to handle 1024 dimensions for ~100M items.


On one index I'm using OPQ16_64,IVF262144_HNSW32,PQ16 with 128 dimensions initially.
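For readers unfamiliar with Faiss factory strings, here is a rough reading of that one plus some back-of-the-envelope storage arithmetic. As I understand the format, `OPQ16_64` is an OPQ rotation that also reduces the input to 64 dimensions, `IVF262144_HNSW32` is an inverted file with 262,144 lists whose centroids are searched through an HNSW graph, and `PQ16` stores each vector as 16 sub-quantizer codes. The helper names below are hypothetical (not part of Faiss), and the sketch assumes float32 input and the default 8 bits per PQ code. It counts code bytes only, not vector IDs or other index overhead.

```python
# Hypothetical helpers (not Faiss APIs): rough storage arithmetic for
# "OPQ16_64,IVF262144_HNSW32,PQ16" on 128-D float32 input.

def raw_bytes(dim: int, bytes_per_float: int = 4) -> int:
    """Bytes needed for one uncompressed float vector."""
    return dim * bytes_per_float

def pq_code_bytes(num_subquantizers: int, bits_per_code: int = 8) -> int:
    """Bytes stored per vector by a product quantizer (assumes 8-bit codes by default)."""
    return num_subquantizers * bits_per_code // 8

def coarse_centroid_bytes(num_lists: int, dim: int, bytes_per_float: int = 4) -> int:
    """Bytes for the IVF coarse centroids, stored as floats in the reduced space."""
    return num_lists * dim * bytes_per_float

raw = raw_bytes(128)                            # uncompressed 128-D vector
code = pq_code_bytes(16)                        # PQ16 code
ratio = raw // code                             # compression factor for the codes alone
centroids = coarse_centroid_bytes(262144, 64)   # 64-D because OPQ16_64 reduces to 64 dims

# Assumption for illustration only: a collection of 500M vectors.
total_codes = 500_000_000 * code

print(f"raw: {raw} B, code: {code} B, ratio: {ratio}x")
print(f"centroids: {centroids / 2**20:.0f} MiB, codes for 500M vectors: {total_codes / 1e9:.0f} GB")
```

The point of the arithmetic is that the codes shrink by a large constant factor relative to the raw vectors, which is what makes collections of this size fit in memory.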

1024 dimensions is a lot! Could you elaborate on what application requires that many? If it's a DNN layer output, your data is likely sparse or highly redundant, so dimensionality reduction shouldn't hurt your recall much if tuned properly.
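A minimal sketch of the kind of dimensionality reduction meant here, using plain NumPy PCA rather than any particular library's implementation. The data is synthetic and stands in for real embeddings: I assume 1024-D vectors that actually live near a 64-D subspace, which real DNN outputs may or may not do. The function names `fit_pca` and `apply_pca` and the target of 128 dimensions are illustrative choices, not something from this thread.

```python
import numpy as np

def fit_pca(x: np.ndarray, out_dim: int):
    """Fit PCA via SVD; return the mean, projection matrix, and explained-variance fraction."""
    mean = x.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    explained = float((s[:out_dim] ** 2).sum() / (s ** 2).sum())
    return mean, vt[:out_dim].T, explained

def apply_pca(x: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Project vectors onto the fitted principal directions."""
    return (x - mean) @ components

rng = np.random.default_rng(0)
# Synthetic stand-in for embeddings: 64 latent dims mixed into 1024, plus small noise.
latent = rng.normal(size=(2000, 64))
mixing = rng.normal(size=(64, 1024))
x = (latent @ mixing + 0.01 * rng.normal(size=(2000, 1024))).astype(np.float32)

mean, components, explained = fit_pca(x, 128)
reduced = apply_pca(x, mean, components)
print(reduced.shape, f"explained variance: {explained:.4f}")
```

On real data, the explained-variance fraction (and, more importantly, recall measured against exact search on a held-out sample) is what tells you how far you can reduce. Faiss also ships its own PCA transform, so in practice the reduction can live inside the index instead of a separate preprocessing step.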


It's actually a DNN layer output. I haven't considered dimensionality reduction yet. Thanks for pointing me there, I'll look into it. That's probably the better way to go.

Thanks a lot for your reply!



