Hyrum's Law (hyrumslaw.com)
73 points by pmoriarty on Oct 21, 2022 | 52 comments


I see Hyrum's Law as highlighting a problem with the robustness principle.

You start with:

     We followed the robustness principle, which is: be conservative in what you do and be liberal in what you accept from others.
But then, the 'things you accept' become part of your API:

     The problem with the robustness principle is that a flaw can become entrenched as the de facto standard. Any implementation of the protocol is required to replicate the aberrant behavior. This is both a consequence of applying the robustness principle and a product of a natural reluctance to avoid fatal error conditions.
And you arrive at:

     With a sufficient number of users of an API, it doesn’t matter what you promise. Any observable behavior of your system will be depended upon by somebody.

I have a long rant about this, and how web browser standards fell victim to it. It's why user agent strings look like they do and why 'chucknorris' is an HTML color.

https://corecursive.com/internet-is-duct-tape/


There's one exception to Hyrum's Law: nobody ever uses the bucket interface of C++ std::unordered_map.
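For the curious, that's the part of the container API that exposes the underlying hash table's buckets directly. A minimal sketch of what using it looks like:

    #include <iostream>
    #include <string>
    #include <unordered_map>

    int main() {
        std::unordered_map<std::string, int> m{{"a", 1}, {"b", 2}, {"c", 3}};
        // The rarely-used bucket interface: inspect the table's layout.
        std::cout << "bucket_count: " << m.bucket_count() << '\n';
        for (std::size_t b = 0; b < m.bucket_count(); ++b) {
            std::cout << "bucket " << b << " (" << m.bucket_size(b) << "):";
            for (auto it = m.begin(b); it != m.end(b); ++it)  // local iterators
                std::cout << ' ' << it->first;
            std::cout << '\n';
        }
    }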


Throw a chaos monkey at it. Make the observable behavior of your service random wherever it falls outside the specification.


Sounds great, I'll keep that in mind the next time I need a source of randomness. No need to work with the ugly <random> interface when an API I already use gives me randomness. I sure hope they never change this lovely feature.


Good libraries randomize non-guaranteed parameters (like sort order) so that client tests detect the missing guarantee immediately and clients don't come to depend on it.
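Go's runtime deliberately randomizes map iteration order for exactly this reason, and Abseil's hash tables do something similar. A minimal sketch of the idea, for a hypothetical library call whose result order is documented as unspecified:

    #include <algorithm>
    #include <random>
    #include <string>
    #include <vector>

    // Hypothetical API: result order is documented as unspecified, so we
    // actively shuffle it to keep clients from quietly relying on it.
    std::vector<std::string> find_matches(const std::string& pattern) {
        std::vector<std::string> results = {"alpha", "beta", "gamma"};  // stand-in for the real query
        static std::mt19937 rng{std::random_device{}()};
        std::shuffle(results.begin(), results.end(), rng);
        return results;
    }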


This is not a good idea, because a chaos monkey will use a non-uniform distribution that is time-variant.


He never specified he needed uniform randomness.


Generally when using a random number generator, you will want to know the distribution and behavior through time.

Without knowing anything, the RNG could generate all 0s one day, and all 1s the next day, which most people would consider a very bad and worthless RNG.


That is why RNG whitening algorithms exist.
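The classic example is the Von Neumann extractor, which removes constant bias from a stream of independent bits. A sketch, with the assumption it rests on spelled out:

    #include <optional>

    // Von Neumann whitening: read raw bits in pairs; 01 -> 0, 10 -> 1,
    // 00/11 -> discard the pair. This removes a constant bias, but it
    // assumes the bits are independent and identically distributed over time.
    template <typename BitSource>
    std::optional<int> next_unbiased_bit(BitSource& next_raw_bit) {
        int a = next_raw_bit();
        int b = next_raw_bit();
        if (a == b) return std::nullopt;  // discard this pair, try again
        return a;                         // 01 yields 0, 10 yields 1
    }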


They will not work if the distribution is time-variant.


These issues can be improved by rising above the bad or nonexistent communication practices common in software engineering, which are partly a technical issue and partly a cultural one.

If you want to be helpful to your users, rather than punish them for doing something you didn't explicitly allow them to do, you can communicate. Issue warning messages to people using a behavior that will be deprecated. Keep the old version around for a while for compatibility, but require users to opt into it and tell them when it's going away. Email your users to announce breaking changes. Et cetera.
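In C++, for instance, the [[deprecated]] attribute turns every call site into a compile-time warning. A sketch with hypothetical names (parse_config, Config, Options are illustrative, not from the thread):

    #include <string>

    struct Options {};
    struct Config {};

    // Every caller gets a warning at build time, long before removal.
    [[deprecated("going away in v3; call parse_config(path, Options{}) instead")]]
    Config parse_config(const std::string& path);

    // The replacement keeps the old behavior behind an explicit opt-in.
    Config parse_config(const std::string& path, const Options& opts);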

All of this can take significant effort, and of course some people won't listen, won't check logs, will ignore warning messages and sunset dates. Which is the other side of the problem.

Whether we're changing software or adapting to changes, the message remains the same: software engineering isn't just futzing around with code until something "works", and then anything that breaks after that is someone else's problem. We live in a society where we all depend on each other. We need to communicate, and communicate in a way that values each other's success.


This is so true, probably even without "a sufficient number of users". I wrote a lot of code generation stuff for bindings and networking, and I've seen a good number of people who always want to just hack the generated code because it's easier than asking me nicely. Of course, I put a very clear "DO NOT EDIT" header in every single generated file. They just don't care. They just edit and commit. Unfortunately the team at the time didn't have a tool or process to block them, like code review or CI. But even if we had, I'm sure they would have found a creative way to circumvent it. Anyway, I could stop them only after making the generated code somehow validate itself and making it hard enough to understand and bypass.
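One way such self-validation can work (a sketch, not the commenter's actual tool): the generator writes a checksum of the file's contents into its header, and a build step recomputes it, so any hand edit fails the build.

    #include <fstream>
    #include <functional>
    #include <sstream>
    #include <string>

    // The generator emits "// CHECKSUM: <hash>" as the first line, computed
    // over the rest of the file. std::hash is only illustrative; a real
    // setup would use a proper digest like SHA-256.
    bool generated_file_is_untouched(const std::string& path) {
        std::ifstream in(path);
        std::string header;
        if (!std::getline(in, header)) return false;
        const std::string prefix = "// CHECKSUM: ";
        if (header.rfind(prefix, 0) != 0) return false;  // missing header
        std::ostringstream rest;
        rest << in.rdbuf();
        return header.substr(prefix.size()) ==
               std::to_string(std::hash<std::string>{}(rest.str()));
    }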

Also, there was another creative exploit of our in-house slab-allocator-like object pool. The contract was simple: if the client asks, it returns a new object handle, which is just a pointer with some metadata. And every time the client needs to access the object, it should explicitly fetch the object pointer with a utility function which performs very basic memory safety checks, catching things like dangling pointers and double frees. I wasn't its designer and it's not perfect, but it actually caught some potential bugs at earlier stages, so it probably gave us some value. Anyway, these details are tangential to the real story.
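A sketch of that kind of handle contract, done here with generational counters (hypothetical names; the actual in-house pool, which used a tombstone bit, isn't shown in the thread):

    #include <cstdint>
    #include <vector>

    struct Handle { uint32_t index; uint32_t generation; };

    template <typename T>
    class Pool {
        struct Slot { T value{}; uint32_t generation = 0; bool alive = false; };
        std::vector<Slot> slots_;
    public:
        Handle allocate() {
            for (uint32_t i = 0; i < slots_.size(); ++i)
                if (!slots_[i].alive) {
                    slots_[i].alive = true;
                    return {i, slots_[i].generation};
                }
            slots_.emplace_back();
            slots_.back().alive = true;
            return {uint32_t(slots_.size() - 1), 0};
        }
        // The "utility function with basic safety checks": stale handles
        // (dangling or double-freed) no longer match the slot's generation.
        T* fetch(Handle h) {
            if (h.index >= slots_.size()) return nullptr;
            Slot& s = slots_[h.index];
            if (!s.alive || s.generation != h.generation) return nullptr;
            return &s.value;
        }
        void release(Handle h) {
            if (fetch(h) == nullptr) return;   // reject invalid handles
            slots_[h.index].alive = false;
            ++slots_[h.index].generation;      // invalidate outstanding handles
        }
    };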

One day someone found out that its implementation resembled a sort of linked list over a large array, and that technically you could walk those objects if you were willing to play with reinterpret_cast. They decided to exploit that for "an efficient object query" and did it secretly. Obviously they didn't have all the handles, so the safety checks were bypassed. This seemed to work in most cases because the code checked a tombstone bit that was part of the object metadata rather than the handle metadata. But in some cases, an object could be released and a new object could take its place in memory during the iteration (it's multi-threaded), and this caused a number of subtle memory bugs, since the object access was done outside the iteration context.

Yeah, library writers might deserve the right to break their clients if those clients ignore the contract. But I just wanted to say that sometimes it's unavoidable when the job is your means of livelihood, and in that case you probably want to keep those creative people in mind when you design your library. That probably won't stop them, but at least you will be less shocked to find out they exist...


Often when it's mentioned, it's treated as a hard rule, with the claim that semver has no justification because any change is a breaking change.

Though in practice most software packages are not used extensively to the point that any observable behavior is depended upon (and Hyrum himself opens with this caveat "With a sufficient number of users ...").


"sufficient number of users" is a bit vague.

Is there intended to be an implicit limit of "all the people in the world who might plausibly use this API"? Then I'm sure the law is false.

If there isn't intended to be such a limit, then the law is unfalsifiable and not useful as a guide to activity.

I think this purported law comes from the sort of advice you can end up giving when you notice that many young programmers err too much in the direction of X, so you make an exaggerated statement in the direction of not-X to correct them. This can be distinctly unhelpful in the long run.


It's born out of direct experience trying to change code within Google.


I'm sure it is, and if you interpret that page as really saying something like "If you're working at Google around 2020 and you're thinking of changing some undocumented behaviour then you should be aware that it's more likely that someone's relying on it than you might expect", then I've no doubt it's splendid advice.

But that's not what he actually wrote, and I can easily imagine a future where this "law" starts to be quoted in contexts where it doesn't turn out to apply.


It's a decade old, and the things that Titus and Hyrum work on aren't Google-specific but about "improving C++ APIs". It's just that Google has:

1. A lot of code

2. The ability to test all of their dependents' code

And so the people making the changes can detect all of the breakages. If it were possible for me to run the CI for every system on GitHub that depended on the library, it might be more obvious; but usually the cost of these unintentional breakages is borne by users, not maintainers.

The law applies to anyone who maintains a widely used library, whether that's Go, Django, or std::.

And note that many more people in this thread are calling the law "obvious" than are saying it's not true. That should give you some pause.


I think it's likely that nobody in this conversation actually disagrees with anyone else about how frequent such dependencies are.

But the "law" is not literally true, so to make it useful you have to mentally add some reservation: "no, of course I don't mean that someone will be depending on a detail like _that_".

The trouble then is that it doesn't work usefully as a piece of advice: if you're not already experienced enough to know where to draw the line, it doesn't help you get any more accurate.

It's like saying "however big you think X is, it's bigger than you thought".

But maybe I'm wrong about what you believe. Do you actually think the "law" as stated is literally true (without using a quibble like "well, there's no upper bound on what we mean by 'sufficient'")?


This is also Mother Night's law:

Be careful who you pretend to be, because you are who you pretend to be.


It ain't different than for user interfaces. It's all about managing expectations based on existing usage and behavior. Migration paths to new API endpoints or changes in behavior are possible, but require hand-holding in both cases.


The solution is not to care about breaking customers that depend on implementation details. If every interface author had this policy then I'm sure people would write better code.


He went on a podcast last year to talk about it and his recent book: https://spoti.fi/3EY33E4


I have a strong feeling that it's a specialized form of the pigeonhole principle. Am I missing something?


No, the pigeonhole principle is the mathematical fact that you can't allocate m thingies to n doodahs, where m > n, without doubling up somewhere, i.e. having a doodah with more than one thingy allocated to it. (E.g. thingies = letters, doodahs = pigeonholes.)
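In symbols (a standard formulation, for reference):

    \text{If } f : \{1,\dots,m\} \to \{1,\dots,n\} \text{ with } m > n,
    \text{ then } \exists\, i \neq j \text{ such that } f(i) = f(j).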

Applying the pigeonhole principle here would give "with a sufficient number of users it's guaranteed that some observable behaviour of the system will be relied on by more than one user", which is true (assuming the system has a finite number of observable behaviours) but not very interesting.

Hyrum's law says that with enough users all the observable behaviours are relied on by at least one person. Mathematically it's trivial for that not to be the case regardless of how many users you have. You just need a single behaviour that nobody's relying on. So Hyrum's law isn't mathematical, it's an observation about software development and human behaviour.


This immediately comes to mind: https://xkcd.com/1172/


(It's mentioned in the article.)


Please tell me it wasn't Hyrum himself who made an observation that many people have made over time (even citing some of them), then called it a "law", named it after himself, and bought the domain name for it.

EDIT: Upon re-reading it, I'm seeing that Hyrum gave this Titus Winters guy (who I've never heard of), also at Google, credit for giving credit to him. ...what an impressive density of greatness over there.


It looks like he merely created a website named after the law which is named after himself. Which is slightly less pompous.

Once someone has popularized a law in your name then it’s too late to change the name. For related reasons.


Titus is the head of the C++ org. He's pretty chill, not at all pompous like you seem to be implying.


I'm not implying anything about Titus, I'm implying something about Hyrum. Apparently he must have thought to himself: "Well, I'm still missing a name drop here for my credit-grabbing." If that name had been "Niklaus Wirth" or whatever, and it was Niklaus Wirth who put up the website, then, sure.

...but the best name he could come up with is a name only recognized by fellow Googlers and people attending highly specialist conferences, and he's going around telling people "Hey look, this super-smart thing that I once said. It's so cool. Look, this guy over here also thinks that I'm the one who deserves credit for it. This is so smart, it will blow your mind. Now listen up, ..."

If this kind of thing doesn't "pop" to you as a violation of social norms then you know you're living in a subpopulation where, basically, narcissism has gone endemic.


I think you are extrapolating a bit much from an almost in-joke in a community you aren't part of. Domains are cheap, putting up a website about something doesn't mean much.


I met the eponymous Hyrum; went to lunch with him on a social occasion that had nothing to do with software. He is a very nice guy and quite humble, in fact. Thoroughly good chap.


I too see the page as a bit pretentious, but life is so short, and this small quantity of hubris isn't going to lead to the starvation of orphans. Let people deploy some arrogance.

Most HN users do more damage to the world through their adtech careers than any blog post.


>Titus Winters guy (who I've never heard of)

He's well known from the talks he has given at CppCon.


Treating Hyrum's Law as something to pay attention to when improving software is a grave error.

There are good reasons to change undocumented behaviors with each release just to discourage dependence on them. E.g., change hash functions. People will squawk, but will be better off.

Probably the best thing is to throw exceptions from functions that previously didn't. People who imagine that having searched the whole call tree one time means anything deserve to be surprised. Cleanup code should always be in destructors, where it will be exercised properly on every run.
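A sketch of the destructor point: with RAII, cleanup runs whether the scope exits normally or via an exception a dependency only started throwing today.

    #include <cstdio>

    class FileGuard {
        std::FILE* f_;
    public:
        explicit FileGuard(const char* path) : f_(std::fopen(path, "w")) {}
        ~FileGuard() { if (f_) std::fclose(f_); }  // runs on every exit path
        std::FILE* get() const { return f_; }
        FileGuard(const FileGuard&) = delete;
        FileGuard& operator=(const FileGuard&) = delete;
    };

    void might_now_throw();  // a callee that used to return error codes

    void write_report() {
        FileGuard out("report.txt");
        might_now_throw();               // even if this throws, ~FileGuard runs
        if (out.get()) std::fputs("done\n", out.get());
    }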


Nobody reads the spec though.

I recall a bit of drama in Linux land when ext4 broke the classic "write temp file + rename over the old one" technique for safely rewriting config files. People ended up with data loss and were outraged. The ext4 devs said: "But POSIX never actually promised this! Doing things our way gives much better performance. Add extra fsyncs to actually get things to stick."
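For reference, the pattern in question, with the extra fsync the ext4 developers recommended (a POSIX-flavored sketch, error handling abbreviated):

    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>
    #include <string>

    bool rewrite_config(const std::string& path, const std::string& contents) {
        const std::string tmp = path + ".tmp";
        int fd = open(tmp.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) return false;
        ssize_t n = write(fd, contents.data(), contents.size());
        // Without this fsync, the rename could be committed before the data
        // blocks, leaving a zero-length file after a crash.
        bool ok = n == (ssize_t)contents.size() && fsync(fd) == 0;
        close(fd);
        // rename() atomically replaces the old file with the new one.
        return ok && std::rename(tmp.c_str(), path.c_str()) == 0;
    }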

The result as I recall was a lot of drama back and forth.


The drama was because POSIX formally offered no way to get the needed semantics without crazy changes. Shells would have had to be made very very slow. And the things made faster were already fast, and not even very important to have fast.

When shells that trace back to v6 Unix depend on a behavior, that being undocumented is a bug in documentation.


I think the lesson there is that "the spec is what the program currently does" doesn't work well for the behaviour around rare errors.

(POSIX was a red herring: Linux has never limited itself to POSIX, and anyway POSIX didn't promise that doing things the way the ext4 maintainer recommended would work either.)


Depends on what the API is, right? Like, Linux has this law encoded as "don't break userland", i.e. a very absolutist interpretation. This makes sense because the people affected by a breaking Linux API change are not the people responsible for shipping code that depends on undocumented behaviour.

But for a lot of APIs, including nearly all OSS libraries that are only updated by the developer using them, your stance makes a lot more sense.


But the definition of breaking userland is essentially: if we make a breaking change and no one notices, is it a breaking change?

In other words, just test it on every piece of software out there.


> There are good reasons to change undocumented behaviors with each release just to discourage dependence on them. E.g., change hash functions. People will squawk, but will be better off.

How is that not paying attention to Hyrum's Law, something you just told us to not do? Hyrum's Law explicitly does not say to never change undocumented behaviors, and what you propose is a reaction to the observation in it.


I think it's useful as a counter to Postel's law.

Postel's law (or robustness principle) states that you should be liberal in what you accept, conservative in what you output. Hyrum's law shows the danger in being liberal in what you accept.
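A small sketch of how that danger plays out with a "liberal" parser (hypothetical function names):

    #include <optional>
    #include <string>

    // Liberal: stoi stops at the first non-digit, so "8080;legacy" parses.
    // Once clients send that and it works, the leniency is part of your
    // de facto contract, and tightening it later is a breaking change.
    std::optional<int> parse_port_liberal(const std::string& s) {
        try { return std::stoi(s); } catch (...) { return std::nullopt; }
    }

    // Conservative: reject anything that isn't exactly a number.
    std::optional<int> parse_port_strict(const std::string& s) {
        try {
            std::size_t used = 0;
            int v = std::stoi(s, &used);
            if (used != s.size()) return std::nullopt;  // trailing junk
            return v;
        } catch (...) { return std::nullopt; }
    }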

(Shameless plug: I've just made public an open source CLI flashcard app where the example deck includes a set of Hacker Laws including the aforementioned ones, you can try it here: https://github.com/SteveRidout/flashdown)


This is an interesting point. Taken to its extreme, Postel's law will trade some abstraction for robustness. Conversely, Hyrum's law will trade some robustness for abstraction.


Hyrum's law is an extension of Postel's law: "accept everything you accepted in the past".


Postel's law has turned out to be a disaster for security, and for extensibility. If everything you can get already means something, nothing can ever mean anything new.

"Be clear about what you will accept", instead. And, make sure there is an easy way to discover why it was not accepted.


No, it does not prescribe anything you should or shouldn't do.


There are different degrees of "pay attention". You should consider how Hyrum's Law affects your development; that is, feel free to change that hash function but be aware it's going to break someone.

What I think you mean is: don't live in fear of Hyrum's Law.


Yes. The law says that something will depend on incidentals, not that changing such incidentals is wrong.


In fact, Hyrum Wright and Titus Winters (the person who named Hyrum's Law) both spend a large amount of their time specifically upgrading code through Hyrum's Law breakages.


This sounds like its own huge maintenance burden for me as a library developer. If I picked a particular hash function, it's because it fulfilled my requirements. I'm not necessarily going to have a list of backup options that meet those requirements and also don't negatively impact other parts of my own code.


If this isn't peak Silicon Valley hubris, I'm not sure what is. He even included a poem! LOL.


I have arein3's law: If you depend on behavior not specified in the interface, then it's possible that it will break in the future! (Shocking, I know)

arein3slaw.com



