It's not really clear what this even means. HTML5 and CSS3 aren't new versions o...

cxr · on June 10, 2024

It's a semi-noob illusion that focusing only on modern standards and practices would surely result in a leaner implementation. This goes all the way back to the big push for web standards in 1998. E.g. you can find Slashdot Q&As from the early 2000s where people bring up how much smaller the browser could be without quirks mode and IE compatibility, and then get corrected about how little of the code base this stuff actually takes up.

zellyn · on June 10, 2024

The Ladybird folks frequently claim that directly implementing the modern versions of standards is a huge benefit.

cxr · on June 10, 2024

It's a lot easier to implement the HTML5 parsing algorithm from the spec than trying to reverse engineer it yourself. That's a completely separate matter from the confused belief that "ignor[ing] everything that's not modern HTML5 and CSS3" would somehow "cut down on the scope significantly".

shiomiru · on June 10, 2024

The benefit isn't that the newer standards are simpler; they are usually at least as complex as the old ones. The benefit comes from newer standards being a lot more precise.

E.g. HTML4 did not specify what to do with invalid markup, which made writing a conformant parser easier. In practice, many websites weren't valid HTML4, so you had to reverse engineer whatever the other parsers did with invalid markup.

HTML5 doesn't really have a formal grammar; it's specified as an imperative tokenizer and parser. It actually takes somewhat longer to implement than HTML4, but it doesn't suffer from compatibility issues.
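To give a feel for what "imperative tokenizer" means, here is a toy sketch in Python. It is not the WHATWG algorithm: the state names (`data`, `tag_open`, `tag_name`, `in_tag`) and the `tokenize` function are made up for illustration, attributes are skipped, and comments, entities and script handling are ignored entirely. The point is only that every input character has a defined outcome in every state, including for markup that is "invalid".

```python
def tokenize(html):
    """Toy state-machine tokenizer (illustrative only, not spec-compliant).

    Returns a list of ("text", s), ("start", name) and ("end", name) tuples.
    """
    tokens = []
    state = "data"
    text = ""       # pending character data
    name = ""       # tag name being collected
    is_end = False  # True while reading an end tag

    def flush_text():
        nonlocal text
        if text:
            tokens.append(("text", text))
            text = ""

    def emit_tag():
        if name:  # silently drop tags with an empty name, e.g. "</>"
            tokens.append(("end" if is_end else "start", name))

    for ch in html:
        if state == "data":
            if ch == "<":
                state = "tag_open"
            else:
                text += ch
        elif state == "tag_open":
            if ch == "/":
                flush_text()
                is_end, name, state = True, "", "tag_name"
            elif ch.isalpha():
                flush_text()
                is_end, name, state = False, ch.lower(), "tag_name"
            elif ch == "<":
                text += "<"  # previous "<" was literal; this one may open a tag
            else:
                text += "<" + ch  # not a tag after all: recover as plain text
                state = "data"
        elif state == "tag_name":
            if ch == ">":
                emit_tag()
                state = "data"
            elif ch.isspace():
                state = "in_tag"  # attributes are skipped in this toy
            else:
                name += ch.lower()
        elif state == "in_tag":
            if ch == ">":
                emit_tag()
                state = "data"

    # End of input: a dangling "<" becomes text, an unfinished tag is dropped.
    if state == "tag_open":
        text += "<"
    flush_text()
    return tokens


print(tokenize("<p>1 < 2</P>"))
```

Note that the stray `<` in `1 < 2` is not an error that aborts parsing; the state machine just says what to do with it. The real tokenizer has far more states, but the style is the same, which is why it is long to implement yet leaves little room for guessing.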

OTOH there are new problems with the "modern" standards that old ones did not have:

* It's unversioned, updated pretty much daily; insert walking on water quote[0]. Random example: I added support for the (ancient) document.write API to my browser a few months ago. Recently I looked at the standard again, and it turns out my implementation is outdated, because there's a new type of object in the standard that must be special cased. Many similar cases; this particular one I don't mind, but it shows how hard it is to just stay fully compliant.

* It's gigantic and bloated, full of things nobody ever uses. If something gets into the standard, it typically won't ever get removed, and WHATWG has operated under this policy for more than a decade. So implementing it from start to end takes way too long, and the best strategy to get something useful is to "just" implement features that websites you use will need.

* Above is the WHATWG model, which applies to HTML and DOM, but not CSS. The W3C model (used in CSS) has versioning, but they broke it in a different way: there is no comprehensive "CSS 3 standard", just a bunch of modules haphazardly building on top of each other and CSS 2. Plus it's much less precise than the HTML standard, with basic parts left entirely unspecified for decades. See the table module[1], or things like this[2].

[0]: "Walking on water and developing software from a specification are easy if both are frozen." - Edward V Berard

[1]: https://drafts.csswg.org/css-tables-3/ still in "not ready for implementation" limbo after years.

[2]: https://github.com/w3c/csswg-drafts/issues/2452 - "resolved" by an unclear IRC log(?), but never specified to my knowledge. It's not an irrelevant edge case either, Wikipedia breaks if you get it wrong.

immibis · on June 10, 2024

It worked for Wayland... sort of.