Author here - ya "binary vectors" means quantizing to one bit per dimension. Nor... | Hacker News

Hacker Newsnew | past | comments | ask | show | jobs | submit

		alexgarcia-xyz on May 3, 2024 \| parent \| context \| favorite \| on: I’m writing a new vector search SQLite Extension Author here - ya "binary vectors" means quantizing to one bit per dimension. Normally it would be 4 * dimensions bytes of space per vector (where 4=sizeof(float)). Some embedding models, like nomic v1.5[0] and mixedbread's new model[1] are specifically trained to retain quality after binary quantization. Not all models do tho, so results may vary. I think in general for really large vectors, like OpenAI's large embeddings model with 3072 dimensions, it kindof works, even if they didn't specifically train for it. [0] https://twitter.com/nomic_ai/status/1769837800793243687 [1] https://www.mixedbread.ai/blog/binary-mrl

barakm on May 3, 2024 [–]

Thank you! As you keep posting your progress, and I hope you do, adding these references would probably help warding off crusty fuddy-duddys like me (or at least give them more to research either way) ;)

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact