Thank you for taking the time to reply. FWIW my comment was added a few minutes before your edits so it only contained the [1] footnote and possibly other edits.
I guess I was missing the "why" where your post focused on the "what".
Of course, when you are working with sensitive data like that, it makes a lot of sense to be really careful about security.
I am well aware of the problems with the CSV format. But a lot of the time you cannot really expect clients to be able to download and decode a Parquet file, when all they want is to submit a password, download the data and look at it in Excel.
Oh, that is a whole extra rant! As much as I actually like Excel as a tool for numerous tasks, it gets things annoyingly wrong too.
We've had clients who manually merge or otherwise tinker with data in Excel before feeding it to us¹, which can cause numerous problems: leading zeros get cut off identifiers that aren't actually numbers but get interpreted as such, and the age-old joke about Excel being a bit of an inv-cel that mistakes things for dates is still very relevant.
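To make the leading-zero problem concrete, here is a minimal Python sketch. It is not Excel's actual logic, just the same kind of numeric coercion applied to a made-up two-row CSV; the column names and the 5-character identifier width are assumptions for illustration.

```python
import csv
import io

# Hypothetical export: "id" looks numeric but is really a text identifier.
raw = "id,amount\n00123,10\n04567,20\n"

rows = list(csv.DictReader(io.StringIO(raw)))

# Kept as text, the identifiers survive untouched.
as_text = [row["id"] for row in rows]

# Interpreted as numbers (roughly what a spreadsheet does on import),
# the leading zeros are gone.
as_number = [int(row["id"]) for row in rows]

# They can only be restored if the original width is known (assumed 5 here).
restored = [str(n).zfill(5) for n in as_number]

print(as_text, as_number, restored)
```

If the identifiers were variable-width to begin with, there is no way to pad them back reliably, which is why this kind of damage tends to be permanent.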
Also, it requires (as do a number of MS tools) UTF-8 files to start with a BOM, which is actually not recommended by the relevant standards², or it will assume Windows-1252, with all the text corruption that implies if your data contains accented characters, currency symbols, or anything else outside the 7-bit ASCII character set. Sometimes we see data where this has happened at some point to some of it but not all, so the file has a mix of correct UTF-8 and corrupted text that was then re-encoded as UTF-8. Funfunfun.
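For anyone who hasn't run into it, a rough Python sketch of that corruption path, using only the standard codecs. The sample string is made up, and the repair at the end only works when the text went through exactly one wrong decode and every byte involved is defined in Windows-1252.

```python
import codecs

original = "Zoë paid €5 at the café"  # made-up sample text

# A BOM-less UTF-8 file read as Windows-1252 turns into mojibake.
utf8_bytes = original.encode("utf-8")
misread = utf8_bytes.decode("cp1252")

# Saving that mojibake as UTF-8 again bakes the corruption in.
double_encoded = misread.encode("utf-8")

# One level of damage can be undone by reversing the wrong step.
repaired = double_encoded.decode("utf-8").encode("cp1252").decode("utf-8")

# Python's "utf-8-sig" codec writes the BOM that Excel looks for.
with_bom = original.encode("utf-8-sig")

print(misread)
print(repaired == original)
print(with_bom.startswith(codecs.BOM_UTF8))
```

In a file where only some rows were corrupted, the repair has to be applied selectively, since running it over already-correct text will either fail or mangle it further.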
----
¹ in one case because their requirements changed, and they didn't want to pay anyone (us, the company responsible for the system the data was output from, or their own internal IT/processing/other teams) to automate the new data manipulation…