pickle.dump/load is only slow if your main object holds references to many small nested objects: e.g. a large Python dict with millions of keys and values that are each small Python str or int objects.
If your main object only references a few large sub-objects (e.g. a bunch of multi-MB or GB numpy arrays storing the numerical parameters of a machine learning model), then it can be very fast, basically IO-bottlenecked by writing or reading the bytes to/from the disk.
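A minimal sketch of this contrast (the sizes here are made up for illustration, and plain `bytes` buffers stand in for numpy arrays, which pickle similarly as a handful of large byte blocks):

```python
import pickle
import time

# Many small objects: the pickler has to visit each key and value
# individually, so the per-object overhead dominates.
many_small = {f"key_{i}": i for i in range(1_000_000)}

# A few large contiguous buffers (stand-ins for multi-MB numpy arrays):
# each one is serialized as one big block of bytes.
few_large = {f"weights_{i}": bytes(8_000_000) for i in range(5)}

t0 = time.perf_counter()
small_bytes = pickle.dumps(many_small, protocol=pickle.HIGHEST_PROTOCOL)
t_small = time.perf_counter() - t0

t0 = time.perf_counter()
large_bytes = pickle.dumps(few_large, protocol=pickle.HIGHEST_PROTOCOL)
t_large = time.perf_counter() - t0

print(f"many small objects: {len(small_bytes) / 1e6:.1f} MB in {t_small:.3f}s")
print(f"few large buffers:  {len(large_bytes) / 1e6:.1f} MB in {t_large:.3f}s")
```

On a typical machine the second dict pickles far faster per byte, even though its serialized output is larger, because the time goes into copying bytes rather than walking a million tiny objects.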
Nesting is actually not such a problem in itself; it just hides the fact that your seemingly simple object might hold a reference to a large collection of small sub-objects that will be slow to pickle.