I think the breaking up into multiple archives is a ZIP limitation (the classic ZIP format caps an archive at 4GB)
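For what it's worth, the 4GB figure comes from classic (non-Zip64) ZIP headers storing sizes in unsigned 32-bit fields; the Zip64 extension lifts the cap, but splitting the export is the compatible workaround. A minimal illustration:

```python
# Classic (non-Zip64) ZIP headers store sizes in unsigned 32-bit fields,
# so a single archive is capped at 2**32 - 1 bytes: just under 4 GiB.
ZIP32_MAX_BYTES = 2**32 - 1
print(ZIP32_MAX_BYTES)          # 4294967295
print(ZIP32_MAX_BYTES / 2**30)  # ~4.0 (GiB)
```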
"Then indeed, inside you have a mess, with some data in HTML, some in JSON, etc. But at least you can parse it..."
How is some regular schmuck who wants to move his data out of Google to another service supposed to determine what they actually have to parse? The user simply uploads pictures into the system but gets garbage out?
The scary part here is that Google makes it extremely easy to suck the data in, but for an average user it's extremely difficult to get it back out, and Takeout is absolutely not a good solution.
They seem to use standardized formats where possible: vCard for contacts, mbox for email, standard image files for photos, etc. I'll grant you that they do some not-nice things (like separating photo metadata into JSON files), but I'm curious what format for search or timeline activity would be useful to a "regular schmuck"?
If said person wants to view the data on their own time, HTML seems adequate. And JSON seems ideal if they plan on sending this data to a new service that ostensibly supports parsing Google's takeout.
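On the photo-metadata point: the JSON sidecars are at least trivial to parse. A sketch in Python; the field names ("photoTakenTime", "geoData") reflect recent exports but should be treated as assumptions, since Google changes the schema:

```python
import json
from datetime import datetime, timezone

# Example sidecar content. Field names are assumptions based on
# recent Takeout exports and may differ in yours.
sidecar = json.loads("""{
  "title": "IMG_0001.jpg",
  "photoTakenTime": {"timestamp": "1577836800"},
  "geoData": {"latitude": 51.5, "longitude": -0.12}
}""")

# The timestamp is a Unix epoch stored as a string.
taken = datetime.fromtimestamp(int(sidecar["photoTakenTime"]["timestamp"]),
                               tz=timezone.utc)
print(sidecar["title"], taken.isoformat())
# IMG_0001.jpg 2020-01-01T00:00:00+00:00
```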
I think a big part of the problem is that even if Takeout uses standard formats, none of the competing services or software platforms are set up to ingest them.
Like mbox is fine for opening in a desktop client, but if you move from Gmail to Fastmail or Outlook or whatever, mbox might as well be a ClarisWorks spreadsheet file.
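To be fair to mbox itself, reading it takes a few lines of stdlib Python — the ingest gap is on the receiving services, not the format. A self-contained sketch:

```python
import mailbox
import os
import tempfile

# Build a tiny mbox file so the example is self-contained;
# a real Takeout would ship one mbox per Gmail label.
raw = (
    "From alice@example.com Thu Jan  1 00:00:00 2020\n"
    "From: alice@example.com\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)
path = os.path.join(tempfile.mkdtemp(), "sample.mbox")
with open(path, "w") as f:
    f.write(raw)

# mailbox.mbox parses the "From " separator lines for us.
subjects = [msg["Subject"] for msg in mailbox.mbox(path)]
print(subjects)  # ['hello']
```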
I'm exporting it regularly (although I certainly don't have 120GB of photos on it). You can choose an option for regular exports (every two months) and a delivery method (e.g. Google Drive). Then I have a script that runs daily, mounts Google Drive, moves the takeout to local storage if it's present, and removes it from Google Drive (so it doesn't take up space).
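That daily mover can be sketched in a few lines. The paths and the `takeout-*.zip` naming below are assumptions for illustration; the real script depends on how your Drive is mounted:

```python
import shutil
from pathlib import Path

def collect_takeouts(drive_dir: Path, local_dir: Path) -> list[str]:
    """Move any takeout-*.zip archives off the mounted Drive folder
    into local storage, so they stop counting against Drive quota."""
    local_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for archive in sorted(drive_dir.glob("takeout-*.zip")):
        dest = local_dir / archive.name
        if not dest.exists():  # skip anything already collected
            shutil.move(str(archive), str(dest))
            moved.append(archive.name)
    return moved
```

Run from cron against the mount point, e.g. `collect_takeouts(Path("/mnt/gdrive/Takeout"), Path.home() / "takeouts")` (both paths hypothetical).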
Then indeed, inside you have a mess, with some data in HTML, some in JSON, etc. But at least you can parse it... I have a library which I use as an API to various data exports, including archived takeouts (so I don't even have to unpack them to access the data):
- https://github.com/karlicoss/HPI/blob/master/my/google/takeo...
- https://github.com/karlicoss/HPI/blob/master/my/location/goo...
- https://github.com/karlicoss/HPI/blob/master/my/media/youtub...
Described this in more detail here: https://beepb00p.xyz/my-data.html#takeout
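A minimal illustration of the "no unpacking" part — nothing like the full library, just Python's zipfile reading a member straight out of the archive (the in-archive path here is invented for the example):

```python
import io
import json
import zipfile

# Build a tiny stand-in archive in memory; a real takeout lives on disk
# and the member path below is made up for the example.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Takeout/Hangouts/Hangouts.json",
                json.dumps({"conversations": []}))

# Read one member directly: no extraction to disk needed.
with zipfile.ZipFile(buf) as zf:
    with zf.open("Takeout/Hangouts/Hangouts.json") as fh:
        data = json.load(fh)
print(data)  # {'conversations': []}
```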