Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Sorting of integers and floats 23% faster in PostgreSQL 9.2 (long explanation) (pgeoghegan.blogspot.com)
69 points by selenamarie on Aug 2, 2012 | hide | past | favorite | 8 comments


Peter is awesome and is also primarily responsible for a bunch of power consumption improvements: http://pgeoghegan.blogspot.com/2012/01/power-consumption-in-...


PostgreSQL should really be using a radix sort for these data types, not a quick sort (however "optimized").


Definitely not quick sort -- getting it to be an O(n log n) worst case is very hard, and usually enough to disqualify it as "quick" (if such a title is ever warranted, given the n^2 worst case).

Any reason why specifically radix sort, rather than (say) TimSort (super-optimized merge-sort), which has better memory locality than radix sort, and can accommodate non-lexicographic orders (useful for e.g. non-normalized unicode) unlike radix sort?


The data types being discussed here are 32-bit integers and floating point numbers; radix sort can be made downright epic for these types.

http://stereopsis.com/radix.html

Also, the reason I mention quick sort is that that is the algorithm the EXPLAIN output of PostgreSQL claims it uses: I agree that for more complex types or complex comparison functions radix sort will not work, and agree as well that some other sorting algorithm may be preferred to what PostgreSQL currently does.


C++'s template-based sorting mechanisms allow for the same optimization without manually special-casing anything. (The compiler "integrates everything", to quote the author, out of the box.) C++ port of Postgresql, anyone?


Only if you actually can special case it automatically during compilation. It could still require a few tricks if your comparison comes from a user-defined query. At a minimum it would have to be something like: figure out which column type are you using, figure out what basic type does it map to, use that type in the template.


PostgreSQL is extensible enough that runtime specialisation would be best, I think. A JIT compiler, or a compiled type-safe multi-stage[1] programming language like MetaOCaml, would give the best performance.

[1] http://www.cs.rice.edu/~taha/MSP/


or what about a java port? it's as feasible as a c++ port.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: