
Users of JSON, probably the most common data interchange format on the planet, frequently have implicit requirements about key ordering. It is highly convenient to be able to parse a JSON string into a native Python data structure, add a field, emit it back, and preserve the ordering.
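
As a minimal sketch of that round trip (the sample payload and the `added` key are made up for illustration), assuming Python 3.7 or later, where plain dicts keep insertion order:

```python
import json

raw = '{"zebra": 1, "apple": 2, "mango": 3}'

# json.loads builds a plain dict; with insertion-ordered dicts the keys
# come back in the order they appeared in the text.
doc = json.loads(raw)
doc["added"] = 4  # a new key lands at the end

out = json.dumps(doc)
print(out)  # {"zebra": 1, "apple": 2, "mango": 3, "added": 4}
```

The original key order survives because nothing in the parse or dump step reorders the dict; only passing `sort_keys=True` to `json.dumps` would change it.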


In what reasonable use case would the order of the properties on an object matter? I can't think of one.


When you are diffing the serialized output?


From what I recall of the JSON standard itself, there's no guarantee about key ordering being significant. If you're diffing serialized output to compare two JSON objects you need to be serializing it in a consistent format, otherwise even whitespace is going to throw you off.
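
A small illustration of the point, using two made-up payloads: the raw strings differ in key order and whitespace, yet they parse to equal Python objects.

```python
import json

a = '{"id": 7, "tags": ["x", "y"]}'
b = '{\n  "tags": ["x", "y"],\n  "id": 7\n}'

strings_equal = a == b                         # False: the text differs
objects_equal = json.loads(a) == json.loads(b) # True: dict equality ignores key order

print(strings_equal, objects_equal)
```

Note that this only applies to object keys; the order of elements inside an array such as `tags` is still significant.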


It's significant to a human that wants to know what has changed.


If a human is inspecting serialized JSON using pen and paper, the human is presumably clever enough to match up key for key regardless of ordering.

If the human is using a computer to compare two JSON payloads (as the use of a diffing algorithm suggests), the human and computer should be clever enough as a team to realize that they could just deserialize and reserialize each JSON payload so that the keys are lexicographically sorted and the data is formatted in exactly the same way before running it through the diffing algorithm. `jq -Sc` (sorted keys, compact output) would do the trick.
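
For anyone without jq at hand, roughly the same normalize-then-diff idea can be sketched with Python's standard library. The `normalize` helper and the sample payloads are hypothetical names for illustration, and this version pretty-prints rather than emitting compact output so that the diff is line-oriented:

```python
import difflib
import json

def normalize(payload):
    """Parse JSON text and re-emit it with sorted keys and fixed indentation."""
    return json.dumps(json.loads(payload), sort_keys=True, indent=2).splitlines()

old = '{"name": "widget", "price": 10, "tags": ["a", "b"]}'
new = '{"tags": ["a", "b"], "price": 12, "name": "widget"}'

# Key order differs between the two payloads, but after normalization
# only the real change (the price) shows up in the diff.
diff = list(difflib.unified_diff(normalize(old), normalize(new),
                                 fromfile="old", tofile="new", lineterm=""))
print("\n".join(diff))
```

Because both sides go through the same `json.dumps` settings, differences in key order and whitespace disappear before the diff runs.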


Most diff programs don't have what you describe. And in a lot of cases you don't have the easy ability to "do stuff" before running the input through a diffing algorithm.


> Most diff programs don't have what you describe.

That’s why pipes were invented.

> And in a lot of cases you don't have the easy ability to "do stuff" before running the input through a diffing algorithm.

Well, in that case you’re not going to be able to meaningfully compare two JSON payloads, because neither object key order nor whitespace between tokens has any semantic meaning in JSON. I’m really curious what you’re talking about though, since if you’re working in the shell you can easily use jq to do as I describe.


So you can sort the keys at serialization time rather than paying a performance penalty all the time?


I once tracked down a bug where someone was hashing state that happened to include Python dicts serialized to JSON, which produced different hash values for the same entity. I know, I know, insert tears of blood emoji, and any engineer worth his salt wouldn't be so sloppy. But you don't always get to choose to work only with top talent that understands why you should design things elegantly as a baseline[1]. In these cases, removing implicit gotchas, so that the same code really does have the same behavior (i.e. iteration over the same dict yields the same order), is valuable.

[1] Well you _can_ choose to do so, and I recently have, but it's not without its costs.
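
A rough sketch of how that class of bug can be avoided by sorting keys at serialization time before hashing. `stable_hash` is a hypothetical helper, not anything from the code in question, and this is not a full canonicalization scheme (it does nothing special about number formatting, for instance):

```python
import hashlib
import json

def stable_hash(state):
    """Hash a JSON-serializable value independently of dict key order."""
    # sort_keys applies to nested dicts too; fixed separators remove
    # any whitespace variation from the serialized form.
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = {"user": "ann", "roles": ["admin"], "active": True}
b = {"active": True, "roles": ["admin"], "user": "ann"}

naive_match = json.dumps(a) == json.dumps(b)   # False: text follows insertion order
stable_match = stable_hash(a) == stable_hash(b)  # True: same entity, same hash

print(naive_match, stable_match)
```

The two dicts compare equal in Python, but their default serializations differ, which is exactly how equal entities end up with different hashes.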



