pickle.dump/load is only slow if your main object holds references to many small nested objects: e.g. a large Python dict with millions of keys and values that are each small Python str or int objects.
If your main object only references a few large sub-objects (e.g. a bunch of multi-MB or GB numpy arrays storing the numerical parameters of a machine learning model), then it can be very fast, basically IO-bottlenecked by writing or reading the bytes to/from the disk.
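A minimal sketch of this contrast (the sizes here are made up for illustration, and plain `bytes` buffers stand in for numpy arrays, which pickle similarly as a handful of large byte blocks):

```python
import pickle
import time

# Many small objects: the pickler has to visit each key and value
# individually, so the per-object overhead dominates.
many_small = {f"key_{i}": i for i in range(1_000_000)}

# A few large contiguous buffers (stand-ins for multi-MB numpy arrays):
# each one is serialized as one big block of bytes.
few_large = {f"weights_{i}": bytes(8_000_000) for i in range(5)}

t0 = time.perf_counter()
small_bytes = pickle.dumps(many_small, protocol=pickle.HIGHEST_PROTOCOL)
t_small = time.perf_counter() - t0

t0 = time.perf_counter()
large_bytes = pickle.dumps(few_large, protocol=pickle.HIGHEST_PROTOCOL)
t_large = time.perf_counter() - t0

print(f"many small objects: {len(small_bytes) / 1e6:.1f} MB in {t_small:.3f}s")
print(f"few large buffers:  {len(large_bytes) / 1e6:.1f} MB in {t_large:.3f}s")
```

On a typical machine the second dict pickles far faster per byte, even though its serialized output is larger, because the time goes into copying bytes rather than walking a million tiny objects.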
Nesting is actually not such a problem in itself; it just hides the fact that your seemingly simple object might hold a reference to a large collection of small sub-objects that will be slow to pickle.