
The Wikipedia data dumps [0] are multistream bz2, meaning the file is a concatenation of independently compressed bz2 streams. This makes them relatively easy to ingest partially, and I'm happy to be able to remove the C dependency from my Rust code that deals with said dumps.

[0]: https://meta.wikimedia.org/wiki/Data_dump_torrents#English_W...
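
To illustrate why multistream bz2 is easy to ingest partially, here is a minimal Python sketch using only the standard library `bz2` module. It is not the Rust code mentioned above. The helper names `build_multistream` and `read_stream_at` are made up for this example, and the sample data is built in memory rather than read from a real dump. It assumes you already know the byte offset where a stream starts (for a real dump, that would have to come from some index of stream offsets).

```python
import bz2
import io


def build_multistream(chunks):
    """Compress each chunk as its own bz2 stream and concatenate them.

    Returns the combined bytes and the byte offset where each stream starts.
    (Hypothetical helper, only used to produce sample data.)
    """
    blob = bytearray()
    offsets = []
    for chunk in chunks:
        offsets.append(len(blob))
        blob += bz2.compress(chunk)
    return bytes(blob), offsets


def read_stream_at(fileobj, offset, chunk_size=64 * 1024):
    """Decompress only the single bz2 stream that starts at `offset`.

    BZ2Decompressor stops at the end of one stream and sets `eof`,
    so nothing before or after that stream is decompressed.
    """
    fileobj.seek(offset)
    decomp = bz2.BZ2Decompressor()
    parts = []
    while not decomp.eof:
        data = fileobj.read(chunk_size)
        if not data:
            raise EOFError("file ended before the bz2 stream did")
        parts.append(decomp.decompress(data))
    return b"".join(parts)


if __name__ == "__main__":
    blob, offsets = build_multistream([b"alpha" * 100, b"beta" * 100, b"gamma" * 100])
    # Jump straight to the second stream without touching the first.
    middle = read_stream_at(io.BytesIO(blob), offsets[1])
    print(len(middle))
```

The same idea carries over to a file on disk: seek to the stream's offset, feed bytes to a fresh decompressor until it reports end of stream, and ignore the rest of the file.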


