Why is it being proposed as data-* attributes? My understanding is that the point of these was to give authors a space for custom attributes that no spec defines. It seems odd to cut into that space for something being proposed as an actual spec.
https://www.w3.org/TR/css-speech-1/ seems like it could be a better approach. That said, this draft specification is better suited to some details, like IPA/X-SAMPA pronunciation, which should be addressed directly in HTML since it affects the intended content of the document, not just its aural styling.
This is just a set of HTML attributes with no associated JavaScript API. Any client that doesn't care about enhanced screen reader functionality will be free to ignore them completely, and be no worse off than today. But it seems useful for clients – and users – who do care.
That said, this only works if web content authors actually use these attributes, and I'm not sure how common that will be…
I feel you are both right. The specs for web-related technologies become more and more bloated every day, and the barrier to entry is already too high (even for a software monster such as Microsoft!). From this point of view, it can be seen as obstructing the open web.
However, there should be no harm in adding optional specs for the interested parties, especially for niche uses. So this actually favors the open web, as it's better to have open standards than ad-hoc custom solutions.
It seems they specifically chose the data-* attributes. These are already a thing in the HTML5 spec and are available to authors for their own private use.
That means there isn't really much new for the clients/browsers to implement, except for the interpretation part. The parsing and the DOM API already support data attributes, so this spec only stands as a recommendation for content creators and screen readers to work with an officially standardized syntax.
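To make the "nothing new to parse" point concrete, here's a rough sketch using Python's standard-library HTML parser rather than a browser. The attribute name `data-pronounce-ipa` and the sample markup are made up for illustration and are not taken from the draft. The idea is just that an off-the-shelf parser already hands data-* attributes over as ordinary attributes, while a client that doesn't care can read the text and ignore them.

```python
from html.parser import HTMLParser

# Hypothetical markup: "data-pronounce-ipa" is an invented attribute name
# used only for illustration; it is not the name used by the draft.
SAMPLE = '<p>I live in <span data-pronounce-ipa="ˈrɛdɪŋ">Reading</span>.</p>'


class DataAttrCollector(HTMLParser):
    """A client that cares: collects every data-* attribute it encounters."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag, attribute name, value)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name.startswith("data-"):
                self.found.append((tag, name, value))


class TextOnly(HTMLParser):
    """A client that does not care: keeps the text, ignores all attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


collector = DataAttrCollector()
collector.feed(SAMPLE)
collector.close()

plain = TextOnly()
plain.feed(SAMPLE)
plain.close()
plain_text = "".join(plain.chunks)

print(collector.found)
print(plain_text)
```

Neither class needed any parser changes to cope with the attribute; only the interpretation step (what a screen reader would do with the value) is new work.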
I'm actually surprised the W3C didn't come up with a completely new set of tags.