Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

There's a searching concept known as edit (levenshtein) distance that is close but not exactly what you're talking about. To rephrase your question: is there a datastructure that can pre-compute the levenshtein distance for a set or results?

Yes, and it's crazy complicated. See Levenshtein Automata[1]. In reality, I would let the professionals handle this by either using Open/ElasticSearch or Apache Lucene.

[1]http://blog.notdot.net/2010/07/Damn-Cool-Algorithms-Levensht...



An in-between option that is relatively simple and a rather fast (though not as fast as the automata) optimization to levenstein distance and based on Tries is: http://stevehanov.ca/blog/?id=114.


>I would let the professionals handle this...

If this site doesn't contain "the professionals" then the phrase has zero meaning.


Fair enough. "Crazy complicated" and "the professionals" are both intended as subjective terms. To ME this stuff might as well be magic.


I'd flip that and decide that there is no algorithm that exists and is documented that you can't implement yourself, for amusement and educational value or, given time and the will, more than that. Just making that decision has value for you even if you don't actually do it terribly often.

Same as you can design a cpu in HDL and flash it to an FPGA. (The "Nand to Tetris" course seems popular) Same as you can write an OS. Same as you can write a compiler and/or interpreter.

Believing these things (and they're actually true!) gets us all away from learned helplessness. There's enough of that when it comes to actual silicone...

Nobody understands it all. _You_ understand as much of it as you choose, in the direction that takes you, as deep as you want to go. It's just work, a lot of it, but no more than that.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: