
The idea is to use a two stage process:

1. take a SHA-1 hash of the last 4 KB of each file.

2. for any files whose hashes match, compare the whole files byte for byte.

With this method you should be able to skip reading many large files in their entirety.
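A minimal Python sketch of the two-stage process described above. The helper names (`tail_hash`, `find_duplicates`, `full_compare`) and the 4 KB tail size are illustrative choices, not anything from the original comment:

```python
import hashlib
import os
from collections import defaultdict

TAIL_SIZE = 4096  # stage 1 hashes only the last 4 KB of each file

def tail_hash(path):
    """SHA-1 of the final TAIL_SIZE bytes (or the whole file if smaller)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(0, size - TAIL_SIZE))
        return hashlib.sha1(f.read()).hexdigest()

def full_compare(a, b, chunk=1 << 16):
    """Stage 2: streamed byte-for-byte comparison of two files."""
    if os.path.getsize(a) != os.path.getsize(b):
        return False
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ca, cb = fa.read(chunk), fb.read(chunk)
            if ca != cb:
                return False
            if not ca:  # both files exhausted with no mismatch
                return True

def find_duplicates(paths):
    """Bucket files by tail hash, then fully compare only within buckets."""
    buckets = defaultdict(list)
    for p in paths:
        buckets[tail_hash(p)].append(p)

    dupes = []
    for group in buckets.values():
        # Only files whose 4 KB tails collide ever get read in full.
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                if full_compare(a, b):
                    dupes.append((a, b))
    return dupes
```

Files whose tails differ are rejected after a single 4 KB read, so the expensive full comparison runs only on the (usually small) set of tail-hash collisions.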



That is one approach, and in typical cases it would avoid reading most large files in their entirety, but it is not what I read in signa11's comment.



