I was also concerned about the wasted overhead. However I guess it's just there for compatibility (since space is cheap) and for common encodings you'll be able to skip reading it with range requests and use your trusted codec to decode the data. Smart move imho.
It really depends on the order of priorities. If the overall goal is to allow digital archeologist to make sense of some file they found, it would be prudent to give them some instructions on how it is decoded.
I just hope that people will not just execute that code in an unconfined environment.