Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The only thing suffix trees have in common with B-tries that I can see is that they're both non-binary trees.

A B-tree stores whole keys in the nodes, not fragments of values. And a B-tree tries to self-balance, so there's a lot of leeway for what keys get pushed up to the root node. B-trees split and merge nodes based on the number of items in them, trying to achieve a certain fill factor.

A suffix tree, on the other hand, is storing fragments of the values, and I believe (but have not bothered to prove) that there is one and only one minimal and valid way to organize the tree for a given set of data. Suffix trees add children when the data mandate that they do in order to remain correct, not just in order to make room.



I should phrase it differently; what does a suffix tree get you that a standard sorting tree doesn't get you. A standard sorting tree would get you all the neighbors in lexical order. What more do you need?


The B-tree just lets you quickly find out if a given string is in a set. The suffix tree lets you do things like finding the longest common substring very quickly, or looking for repeating motifs.


A B-tree (or any sorting/index tree) allows one to find nearby elements in where sorting order one choose. If one takes a B-tree of strings sorted in lexical order, one find the neighbors of a given string X. If you start with a string X=YZ, where Y is smallest substring in our base string-to-search and Z is the part that doesn't occur, then the (left and right) neighbors of X will be other strings containing Y, if such strings exist. Search of this sort may take O(log(l)) time admitted where a suffix tree might take O(1) time (for all I know) but still, a lot of specialized search trees are not that different from B-tree+special-sort-order, as far as I can tell. For example, B-tree + Z-ordering gives you a spatial sorting tree.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: