So I started working with MCCs (Merchant Category Codes AKA ISO 18245) and I needed some decent lookup tables for them.
I just spent several hours fighting with awful spec PDFs containing hundreds upon hundreds of tables of these codes. Well, Tabula made quick work of all of them:
https://github.com/jleclanche/python-iso18245
It extracted over 200 pages of tables with nearly no errors, and it took only about 15 minutes of manual cleanup to make the data processable by the library.
Final CSVs: https://github.com/jleclanche/python-iso18245/tree/master/is...
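For anyone wondering what "processable" means in practice, here is a minimal sketch of turning a CSV like that into a lookup table using only the Python standard library. The column names `mcc` and `description`, the helper `load_mcc_table`, and the inline sample rows are my own assumptions for illustration; they are not the actual layout of the CSVs in the repo, nor the library's API.

```python
import csv
import io

# Illustrative sample only: the column names "mcc" and "description" are
# assumptions, not the real header of the CSVs linked above.
SAMPLE_CSV = """mcc,description
5411,"Grocery Stores, Supermarkets"
5812,"Eating Places, Restaurants"
"""


def load_mcc_table(fileobj):
    """Build a dict mapping a 4-digit MCC string to its description."""
    table = {}
    for row in csv.DictReader(fileobj):
        # Keep codes as zero-padded strings so "0742" does not become 742.
        code = row["mcc"].strip().zfill(4)
        table[code] = row["description"].strip()
    return table


mcc_table = load_mcc_table(io.StringIO(SAMPLE_CSV))
print(mcc_table.get("5411", "unknown"))
```

Keeping the codes as strings rather than integers is a deliberate choice here, since MCCs are four-digit identifiers and some of them start with a zero.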