I think the breaking up into multiple archives is a ZIP limitation (the classic ZIP format caps an archive at 4GB)
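For what it's worth, the 4GB figure comes from classic (non-Zip64) ZIP headers storing sizes in unsigned 32-bit fields; the Zip64 extension lifts the cap, but splitting the export is the compatible workaround. A minimal illustration:

```python
# Classic (non-Zip64) ZIP headers store sizes in unsigned 32-bit fields,
# so a single archive is capped at 2**32 - 1 bytes: just under 4 GiB.
ZIP32_MAX_BYTES = 2**32 - 1
print(ZIP32_MAX_BYTES)          # 4294967295
print(ZIP32_MAX_BYTES / 2**30)  # ~4.0 (GiB)
```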
"Then indeed, inside you have a mess, with some data in HTML, some in JSON, etc. But at least you can parse it..."
How is some regular schmuck who wants to move his data out of Google to another service supposed to determine what they actually have to parse? The user simply uploads pictures into the system but gets garbage out?
The scary part here is that Google makes it extremely easy to suck the data in, but for an average user it's extremely difficult to get it back out, and Takeout is absolutely not a good solution.
They seem to use standardized formats where possible: vCard for contacts, mbox for email, standard image files for photos, etc. I'll grant you that they do some not-nice things (like separating photo metadata into JSON files), but I'm curious what format for search or timeline activity would be useful to a "regular schmuck"?
If said person wants to view the data on their own time, HTML seems adequate. And JSON seems ideal if they plan on sending this data to a new service that ostensibly supports parsing Google's takeout.
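On the photo-metadata point: the JSON sidecars are at least trivial to parse. A sketch in Python; the field names ("photoTakenTime", "geoData") reflect recent exports but should be treated as assumptions, since Google changes the schema:

```python
import json
from datetime import datetime, timezone

# Example sidecar content. Field names are assumptions based on
# recent Takeout exports and may differ in yours.
sidecar = json.loads("""{
  "title": "IMG_0001.jpg",
  "photoTakenTime": {"timestamp": "1577836800"},
  "geoData": {"latitude": 51.5, "longitude": -0.12}
}""")

# The timestamp is a Unix epoch stored as a string.
taken = datetime.fromtimestamp(int(sidecar["photoTakenTime"]["timestamp"]),
                               tz=timezone.utc)
print(sidecar["title"], taken.isoformat())
# IMG_0001.jpg 2020-01-01T00:00:00+00:00
```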
I think a big part of the problem is that even if Takeout uses standard formats, none of the competing services or software platforms are set up to ingest them.
Like mbox is fine for opening in a desktop client, but if you move from Gmail to Fastmail or Outlook or whatever, mbox might as well be a ClarisWorks spreadsheet file.
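To be fair to mbox itself, reading it takes a few lines of stdlib Python — the ingest gap is on the receiving services, not the format. A self-contained sketch:

```python
import mailbox
import os
import tempfile

# Build a tiny mbox file so the example is self-contained;
# a real Takeout would ship one mbox per Gmail label.
raw = (
    "From alice@example.com Thu Jan  1 00:00:00 2020\n"
    "From: alice@example.com\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)
path = os.path.join(tempfile.mkdtemp(), "sample.mbox")
with open(path, "w") as f:
    f.write(raw)

# mailbox.mbox parses the "From " separator lines for us.
subjects = [msg["Subject"] for msg in mailbox.mbox(path)]
print(subjects)  # ['hello']
```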
I'm exporting it regularly (although I certainly don't have 120GB of photos on it). You can choose an option for regular exports (every two months) and a delivery method (e.g. Google Drive). Then I have a script that runs daily, mounts Google Drive, moves the takeout to local storage if it's present, and removes it from Google Drive (so it doesn't take up space).
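That daily mover can be sketched in a few lines. The paths and the `takeout-*.zip` naming below are assumptions for illustration; the real script depends on how your Drive is mounted:

```python
import shutil
from pathlib import Path

def collect_takeouts(drive_dir: Path, local_dir: Path) -> list[str]:
    """Move any takeout-*.zip archives off the mounted Drive folder
    into local storage, so they stop counting against Drive quota."""
    local_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for archive in sorted(drive_dir.glob("takeout-*.zip")):
        dest = local_dir / archive.name
        if not dest.exists():  # skip anything already collected
            shutil.move(str(archive), str(dest))
            moved.append(archive.name)
    return moved
```

Run from cron against the mount point, e.g. `collect_takeouts(Path("/mnt/gdrive/Takeout"), Path.home() / "takeouts")` (both paths hypothetical).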
Then indeed, inside you have a mess, with some data in HTML, some in JSON, etc. But at least you can parse it... I have a library which I use as an API to various data exports, including archived takeouts (so I don't even have to unpack them to access the data):
- https://github.com/karlicoss/HPI/blob/master/my/google/takeo...
- https://github.com/karlicoss/HPI/blob/master/my/location/goo...
- https://github.com/karlicoss/HPI/blob/master/my/media/youtub...
Described this in more detail here: https://beepb00p.xyz/my-data.html#takeout
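A minimal illustration of the "no unpacking" part — nothing like the full library, just Python's zipfile reading a member straight out of the archive (the in-archive path here is invented for the example):

```python
import io
import json
import zipfile

# Build a tiny stand-in archive in memory; a real takeout lives on disk
# and the member path below is made up for the example.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Takeout/Hangouts/Hangouts.json",
                json.dumps({"conversations": []}))

# Read one member directly: no extraction to disk needed.
with zipfile.ZipFile(buf) as zf:
    with zf.open("Takeout/Hangouts/Hangouts.json") as fh:
        data = json.load(fh)
print(data)  # {'conversations': []}
```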