Yeah, it's polynomial, but it's still really slow and not practical. In a string...

ramchip · on March 10, 2009

My algorithms teacher does research on something similar. IIRC he has a program specifically for detecting copy/pasted code. You may want to have a look at his papers on Clone Detection Tools, "A Novel Approach to Optimize Clone Refactoring Activity", etc.

http://www.polymtl.ca/recherche/rc/en/professeurs/details.ph...

silentbicycle · on March 10, 2009

Awesome, thanks!

lacker · on March 10, 2009

Comparing every something to every other something is an O(N^2) operation. But in this case we already have O(N^2) somethings, so the final algorithm is actually O((N^2)^2) = O(N^4).

If you are just checking for equality, comparing each of N items to another group of N items is O(N) expected time, not O(N^2). Use a hash table.
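A minimal sketch of the hash-table idea, assuming the items are hashable (strings here). The name common_items and the sample lines are made up for illustration; Python's built-in set is the hash table.

```python
def common_items(group_a, group_b):
    """Return the items of group_b that also occur in group_a, in group_b's order."""
    seen = set(group_a)  # one pass over group_a, O(N) expected
    # Each membership test is O(1) expected, so this pass is O(N) as well.
    return [item for item in group_b if item in seen]


# Hypothetical sample data: lines from two source files.
lines_a = ["x = 1", "y = 2", "print(x)"]
lines_b = ["y = 2", "z = 3", "print(x)"]
print(common_items(lines_a, lines_b))  # ['y = 2', 'print(x)']
```

The bounds are expected-case: they rely on the hash function spreading the items well, and each hash still costs time proportional to the item's length.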

silentbicycle · on March 11, 2009

Exactly. Interned strings / symbols work out the same, too.
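Rough sketch of the interning version, using Python's sys.intern as a stand-in for a symbol table. The helper names intern_all and same_symbol are made up for illustration. The hashing cost is paid once when a string is interned; after that, an equality check is just an identity check.

```python
import sys


def intern_all(tokens):
    """Map every token to its canonical interned copy (one hash lookup each)."""
    return [sys.intern(t) for t in tokens]


def same_symbol(a, b):
    """For interned strings, equality reduces to an identity comparison."""
    return a is b


first = intern_all(["foo", "bar"])
# Build "foo" at runtime so it starts out as a distinct string object.
second = intern_all(["".join(["f", "oo"])])
print(same_symbol(first[0], second[0]))  # True
print(same_symbol(first[1], second[0]))  # False
```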