I would like to stop here and thank one of the Microsoft STL contributors, Stephan T. Lavavej[1], for all the work he has done maintaining his own MinGW repository for many, many years, keeping his hand-compiled GCC, SDL and Boost versions up to date.
These kinds of announcements from Microsoft still boggle my mind a little.
It seems only a little while ago that I heard stories of a friend’s high school programming teacher, who received threats from Microsoft’s legal department for emailing them a patch to fix a crash in Windows 95.
I'm not sure you've noticed, but one of the core authors of this STL implementation -- Stephan T. Lavavej -- has exactly the initials of the project he's working on: STL. This means he's perfect for this job :)
I'm not well-versed in C++, so I'm a bit confused about the difference between the C++ Standard Library and the Standard Template Library (STL). Wikipedia tells me that they're related but technically separate things, yet this GitHub page says:
> This is the official repository for Microsoft's implementation of the C++ Standard Library (also known as the STL)
and appears to use the two terms interchangeably. Is this the C++ Standard Library or the STL?
It's the standard library.
People sometimes say STL to refer to the entire standard library because the names are similar and because the STL is a huge chunk of the standard library (and the non-STL parts of the standard library also tend to use templates anyway, to add to the whole confusion...)
But if you look at the repo you see the C headers like cmath, etc. that have nothing to do with the STL and are just part of the standard library.
There was a library called the STL that is now, with modifications, a part of the std library. So in a sense you could say that the STL is a subset of the std library, or that it was an inspiration for part of it, but many people just use the terms interchangeably.
So what that readme says is correct: it is also called the STL. If you want to be precise you should probably only call it the std library though. There are still implementations of the original STL around separate from the std library, so keeping the distinction is probably still wise.
Also, to complicate things, some people use the term STL to refer to the specific parts of the std library that are from the original STL. Those pieces are containers, algorithms, function objects, and iterators, I think.
tl;dr it's all a mess and people say the two meaning different things all the time.
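To make the distinction concrete, here is a minimal sketch (the function names are made up for illustration). The first function touches the four pieces usually credited to the original STL; the second uses a header inherited from C that ships in the same standard library.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <vector>

// The classic STL pieces working together.
std::vector<int> sorted_descending(std::vector<int> values) {  // container
    std::sort(values.begin(), values.end(),                    // algorithm over iterators
              std::greater<int>{});                            // function object
    return values;
}

// Standard library, but not STL: <cmath> comes from the C library.
double hypotenuse(double a, double b) {
    return std::hypot(a, b);
}
```

Both functions only need headers that come with any conforming C++ Standard Library, which is part of why the two names blur together in practice.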
> There are still implementations of the original STL around separate from the std library, so keeping the distinction is probably still wise.
As far as I am aware, those libraries are defunct and unused (e.g. STLPort), so I disagree that there is an important distinction to preserve. The most widely used C++ Standard Library implementations by far are GCC’s libstdc++, Clang/LLVM’s libc++, and MSVC’s STL. Being bundled with compilers means that there is no reason to want an implementation of the historical ancestor library.
Intellectually, the STL represents a significant set of innovations and a new way of designing reusable software, and it is an elegant jewel of design; the C++ standard library does not share those distinctions. If someone is having trouble with a software design problem in another programming language, it might be a useful suggestion to tell them to try doing it “the STL way,” but telling them to do it the “standard library way” would mean something quite different.
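As a rough illustration of what doing it "the STL way" tends to mean, here is a small sketch: the algorithm is written once against iterators and never mentions a container. The name `count_adjacent_equal` is hypothetical and not part of any library.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical algorithm written "the STL way": it knows nothing about the
// container, only that it can walk a range of forward iterators.
template <class ForwardIt>
std::size_t count_adjacent_equal(ForwardIt first, ForwardIt last) {
    std::size_t count = 0;
    if (first == last) return count;  // empty range: nothing to compare
    for (ForwardIt next = std::next(first); next != last; ++first, ++next) {
        if (*first == *next) ++count;
    }
    return count;
}
```

The same template should work unchanged on a `std::vector`, a `std::list`, or a plain array via pointers, which is the decoupling of algorithms from data structures that the design is known for.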
There's also the issue of credit. The STL is a work by Alexander Stepanov and Meng Lee. Conflating it with the C++ standard library of which it has become a part—or, worse, with a particular company's implementation thereof—has the distasteful flavor of plagiarism.
It’s what we’ve been calling our code for a long time (in VCBlog articles, etc.) and it’s a perfectly valid use of metonymy: https://en.wikipedia.org/wiki/Metonymy
Scott Meyers wrote a book called “Effective STL”, and there’s a somewhat-similar-interface library named https://github.com/electronicarts/EASTL - neither of these leads to confusion.
This is a bit of a pointless question, but you use this phrase a lot:
> it’s a perfectly valid use of metonymy
and I'm not really sure what it means. I appreciate the Wikipedia link and understand what metonymy is, but at the end of the day, metonymy is a figure of speech, and formal technical writing discourages the use of extraneous figures of speech that don't serve any purpose, preferring clear technical language. If not this, then what does an invalid use of metonymy look like?
Metonymy has a purpose when it replaces a long, exactingly precise description with a short term that is equally recognizable. For example, “Wall Street” versus “US financial institutions, which may not necessarily be located in New York”.
An invalid use would be one that is not widely recognized. For example, referring to our library as MSVCP. That’s the name (minus version) of our DLL, but very few people recognize it as such.
> Metonymy has a purpose when it replaces a long, exactingly precise description with a short term that is equally recognizable
So... not here? ("Standard library" and "STL" are about the same number of syllables).
I still think metonymy is a bit beside the point. The explanation "Using STL in place of the standard library is a perfectly valid use of metonymy, which is a type of rhetorical device, which is typically avoided in formal writing - therefore you should not do this" seems coherent to me. That is, being a "perfectly valid use of metonymy" is neither clearly an argument for nor clearly an argument against the use of the term.
I know I'm replying to the "STL" with vastly more domain experience here than me, but isn't it true that Scott Meyers' book covers just the STL, as originally defined? The only thing outside of the STL as shown in the table of contents is some lessons on std::string, but even that is in context / comparison with vector, an STL container type.
I'm sorry, but effective communication is valuable and it is annoying that your employer continues to conflate the C++ standard library and the STL. Are we going to start referring to the standard library as "boost" a few years hence? That's basically the situation we've ended up in here.
I used to use “the STL” to refer to “the part of the C++ Standard Library strongly influenced by the design of the historical library”, excluding iostreams etc. Later, I decided this didn’t really achieve anything. It’s a Standard Library that’s full of Templates! There are other design philosophies in there, but they are increasingly unimportant. Having to say “C++ Standard Library” all the time, or trying to popularize a new acronym, are worse alternatives.
Yes, effective communication is valuable. I have been working on this library for over a decade and I find that the short, memorable acronym ultimately makes it easier to talk about the thing.
That said, we are primarily using the acronym for the repo URL and the directory structure; the readme makes it clear that this is “also known as” the STL.
While I admire your work and sympathize with your wish, this naming got me confused. I wondered whether it's just the templated part of the standard lib or the whole thing.
It's anecdotal but there certainly is a strong spirit of getting it right. I guess it's a characteristic of people working with it. In cpp details like that matter a lot.
I personally would feel more sympathy towards naming the MSVC implementation after your initials; then it could even be called the STL library =D. Same effect with better motivation imho. Cheers.
The very first line of the readme had me searching through this thread to figure out what exactly this repo contains. Is it the standard library, or just the STL?
Very cool release, but confusing naming. I'm sure it's longstanding practice at Microsoft, but maybe instead of saying "also known as the STL," it would be more clear to say, "which we also call the STL."
Anyway, I'm looking forward to poking around this code.
That's a fair reason. I do echo the sibling comment though that naming the repo "stl" was/is very confusing. I would have called it "stdlib" or something.
Jeffrey Friedl's book on regexes unapologetically popularized the incorrect use of the terms "NFA" and "DFA" to characterize the different implementations of regex engines. Translating to something more sensible, "NFA" means "unbounded backtracking," whereas "DFA" means "Thompson NFA simulation." In this vernacular, as far as I can tell, there is no word to describe an actual DFA.
PCRE's docs continue this confusion. For example, they expose a DFA API... But it's not actually a DFA!
One might say this is similar to how "regex" doesn't necessarily imply "regular," which is true. That's definitely a battle that has been lost long ago. But NFA and DFA are even more specific jargon terms, and the way in which they have been redefined by the Friedls of the world has led to exactly your confusion over and over again.
Unfortunately, yes. Although there is a connection. If you squint, you can see parts of the NFA. Indeed, it is very easy to simulate an NFA with unbounded backtracking (taking exponential worst case time). All you need to do from there is add a few op codes for look around and backreferences. Where you end up still looks like you're simulating an NFA, but the actual implementation has a lot more power.
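A minimal sketch of that idea, assuming a tiny instruction set (`Char`, `Split`, `Jmp`, `Match`) and a hand-compiled program for `(a|a)*b`. All names here are made up for illustration and do not come from PCRE or any real engine. Each `Split` tries one branch and falls back to the other on failure, which is where the unbounded backtracking comes from.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

enum class Op { Char, Split, Jmp, Match };

struct Inst {
    Op op;
    char c;         // Char: the byte to consume
    std::size_t x;  // Jmp and Split: first target
    std::size_t y;  // Split: second target
};

// Hand-compiled program for (a|a)*b; not produced by a real regex compiler.
std::vector<Inst> example_program() {
    return {
        {Op::Split, 0, 1, 6},   // 0: enter the loop or leave it
        {Op::Split, 0, 2, 4},   // 1: pick one of the two 'a' branches
        {Op::Char, 'a', 0, 0},  // 2
        {Op::Jmp, 0, 5, 0},     // 3
        {Op::Char, 'a', 0, 0},  // 4
        {Op::Jmp, 0, 0, 0},     // 5: back to the top of the loop
        {Op::Char, 'b', 0, 0},  // 6
        {Op::Match, 0, 0, 0},   // 7
    };
}

// Follows one path at a time; when it fails, the caller tries the other branch.
bool backtrack(const std::vector<Inst>& prog, std::string_view input,
               std::size_t pc, std::size_t sp, std::size_t& steps) {
    ++steps;  // counts every instruction visit, including repeated ones
    const Inst& inst = prog[pc];
    switch (inst.op) {
    case Op::Char:
        return sp < input.size() && input[sp] == inst.c &&
               backtrack(prog, input, pc + 1, sp + 1, steps);
    case Op::Jmp:
        return backtrack(prog, input, inst.x, sp, steps);
    case Op::Split:
        return backtrack(prog, input, inst.x, sp, steps) ||
               backtrack(prog, input, inst.y, sp, steps);
    case Op::Match:
        return sp == input.size();  // anchored: the whole input must be consumed
    }
    return false;
}

bool match_backtracking(const std::vector<Inst>& prog, std::string_view input,
                        std::size_t& steps) {
    steps = 0;
    return backtrack(prog, input, 0, 0, steps);
}
```

On an input made only of `a` characters, the step counter in this sketch roughly doubles with every extra byte, because both `a` branches are retried at each position before the match finally fails.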
To be fair, the same thing happens with the Thompson NFA simulation. It's not too hard to add a little extra power to it in the form of memory in order to track capturing group locations. It becomes more powerful than an NFA with this addition, but retains its linear time bound. This is why I've normally seen this addition called the Pike VM. (At least, Russ Cox and I both use that name.)
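For comparison, here is a sketch of the plain Thompson-style simulation, without the capture-tracking memory that would make it a Pike VM. It assumes the same hypothetical instruction set and hand-compiled `(a|a)*b` program used purely for illustration. Instead of following one path, it advances the whole set of live states one input byte at a time.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>
#include <vector>

enum class Op { Char, Split, Jmp, Match };

struct Inst {
    Op op;
    char c;         // Char: the byte to consume
    std::size_t x;  // Jmp and Split: first target
    std::size_t y;  // Split: second target
};

// Hand-compiled program for (a|a)*b; not produced by a real regex compiler.
std::vector<Inst> example_program() {
    return {
        {Op::Split, 0, 1, 6}, {Op::Split, 0, 2, 4}, {Op::Char, 'a', 0, 0},
        {Op::Jmp, 0, 5, 0},   {Op::Char, 'a', 0, 0}, {Op::Jmp, 0, 0, 0},
        {Op::Char, 'b', 0, 0}, {Op::Match, 0, 0, 0},
    };
}

// Adds pc to the state list, following Jmp and Split eagerly so the list
// only ever holds Char and Match instructions. Each pc is added at most once.
void add_thread(const std::vector<Inst>& prog, std::vector<std::size_t>& list,
                std::vector<bool>& on_list, std::size_t pc) {
    if (on_list[pc]) return;
    on_list[pc] = true;
    const Inst& inst = prog[pc];
    switch (inst.op) {
    case Op::Jmp:
        add_thread(prog, list, on_list, inst.x);
        break;
    case Op::Split:
        add_thread(prog, list, on_list, inst.x);
        add_thread(prog, list, on_list, inst.y);
        break;
    default:
        list.push_back(pc);
        break;
    }
}

bool match_thompson(const std::vector<Inst>& prog, std::string_view input) {
    std::vector<std::size_t> current, next;
    std::vector<bool> on_list(prog.size(), false);
    add_thread(prog, current, on_list, 0);
    for (std::size_t sp = 0; sp <= input.size(); ++sp) {
        std::fill(on_list.begin(), on_list.end(), false);
        next.clear();
        for (std::size_t pc : current) {
            const Inst& inst = prog[pc];
            if (inst.op == Op::Match) {
                if (sp == input.size()) return true;  // anchored match
            } else if (sp < input.size() && input[sp] == inst.c) {
                add_thread(prog, next, on_list, pc + 1);
            }
        }
        current.swap(next);
    }
    return false;
}
```

Because each instruction can appear on the list at most once per input position, the work per byte is bounded by the program size, which is where the linear time bound comes from.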
For completeness, you can also use bounded backtracking to simulate an NFA. It keeps track of all states visited per input byte, so it uses a lot of extra memory, but maintains a linear time bound.
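A sketch of bounded backtracking under the same assumptions (hypothetical instruction set, hand-compiled `(a|a)*b` program, illustrative names only). The one change from plain backtracking is a visited table with one bit per (instruction, input position) pair, so no pair is ever explored twice.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

enum class Op { Char, Split, Jmp, Match };

struct Inst {
    Op op;
    char c;         // Char: the byte to consume
    std::size_t x;  // Jmp and Split: first target
    std::size_t y;  // Split: second target
};

// Hand-compiled program for (a|a)*b; not produced by a real regex compiler.
std::vector<Inst> example_program() {
    return {
        {Op::Split, 0, 1, 6}, {Op::Split, 0, 2, 4}, {Op::Char, 'a', 0, 0},
        {Op::Jmp, 0, 5, 0},   {Op::Char, 'a', 0, 0}, {Op::Jmp, 0, 0, 0},
        {Op::Char, 'b', 0, 0}, {Op::Match, 0, 0, 0},
    };
}

bool run(const std::vector<Inst>& prog, std::string_view input, std::size_t pc,
         std::size_t sp, std::vector<bool>& visited, std::size_t& steps) {
    const std::size_t key = sp * prog.size() + pc;
    if (visited[key]) return false;  // already explored without finding a match
    visited[key] = true;
    ++steps;  // counts distinct (pc, sp) pairs actually explored
    const Inst& inst = prog[pc];
    switch (inst.op) {
    case Op::Char:
        return sp < input.size() && input[sp] == inst.c &&
               run(prog, input, pc + 1, sp + 1, visited, steps);
    case Op::Jmp:
        return run(prog, input, inst.x, sp, visited, steps);
    case Op::Split:
        return run(prog, input, inst.x, sp, visited, steps) ||
               run(prog, input, inst.y, sp, visited, steps);
    case Op::Match:
        return sp == input.size();  // anchored match
    }
    return false;
}

bool match_bounded(const std::vector<Inst>& prog, std::string_view input,
                   std::size_t& steps) {
    // One bit per (instruction, input position): this is the extra memory.
    std::vector<bool> visited(prog.size() * (input.size() + 1), false);
    steps = 0;
    return run(prog, input, 0, 0, visited, steps);
}
```

The step count can never exceed the size of the visited table, so the time is linear in the input for a fixed program, at the cost of memory proportional to program size times input length.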
Any NFA that does not have backreferences can be converted to a DFA. If you have backreferences in a regex, you must use an NFA to drive it.
Also, backreferences stand a chance of turning an NFA into NIA. Turns out it's not possible to programmatically determine which backreference-containing regexes will halt or not.
Cool! Are you going to accept patches to port this to other operating systems? It would be cool if `clang -stdlib=msvc` joined -stdlib=libc++ and -stdlib=libstdc++.
I'm at CppCon and attended the talk today where Microsoft announced this to the CppCon attendees. The folks speaking made it clear that they do intend to accept pull requests, and that they're positive about it being used outside of the VC++ environment under the appropriate license (Apache 2 with LLVM Exception).
--- LLVM Exceptions to the Apache 2.0 License ----
As an exception, if, as a result of your compiling your source code, portions
of this Software are embedded into an Object form of such source code, you
may redistribute such embedded portions in such Object form without complying
with the conditions of Sections 4(a), 4(b) and 4(d) of the License.
In addition, if you combine or link compiled forms of this Software with
software that is licensed under the GPLv2 ("Combined Software") and if a
court of competent jurisdiction determines that the patent provision (Section
3), the indemnity provision (Section 9) or other Section of the License
conflicts with the conditions of the GPLv2, you may retroactively and
prospectively choose to deem waived or otherwise exclude such Section(s) of
the License, but only in their entirety and only with respect to the Combined
Software.
> Now, can anyone ELI5 the high-level goal of this exception?
I am not a lawyer; this is my extremely limited understanding.
1. Normally the Apache license requires attribution, even in binary forms. We can't say "You said #include <string>, now you must staple this to all your binaries". libstdc++ has a similar exception to the GPL for the same reason: http://gcc.gnu.org/onlinedocs/libstdc++/manual/license.html
Now, can anyone ELI5 the high-level goal of this exception? At a glance it looks like GPL2 folks are still not happy with the patent provisions of Apache 2.0 and this is added to appease them.
The first piece is standard for compilers, and essentially says "our compiler linking in bits of this code into your code doesn't count." It means the license matters if you're modifying the compiler and support libraries yourself, but not if you're merely using the compiler.
The second piece revolves around the concerns of mixing licenses: Apache 2 and *GPLv3 are happy in combination, but that's because of some explicit compatibility text in GPLv3; this extends the compatibility text to GPLv2.
Just to be clear: this means useful for developers who need to build programs to run on Windows, but can scarcely imagine developing on it. They treat it as a huge embedded system.
Last I heard, the maintainer of MinGW was one of those: never booted Windows, and tested only on Wine. (That was 20 years ago.) Someday Microsoft will be among their august number.
> Just to be clear: this means useful for developers who need to build programs to run on Windows, but can scarcely imagine developing on it. They treat it as a huge embedded system.
I'm not just talking about people who don't want to run Windows, I'm talking about things like CI infrastructure. Much easier to automate cross-building everything from one platform with one set of tools.
There is no reason to use MSVC when clang-cl[1] exists. Both Google and Mozilla have been building their browsers for the Windows platform without MSVC for more than a year. So it is a pretty solid platform and ready for production.
As barchar (my coworker on the STL team) mentioned, not only does Clang use MSVC’s STL on Windows, but also the support is from both Clang and Microsoft. We test MSVC’s STL with Clang as a first-class citizen, and have shipped features active for Clang before they were active for C1XX (MSVC’s compiler front-end), notably Class Template Argument Deduction.
Clang is a great compiler! We think it’s extra great with our STL. We added Clang/LLVM to the VS installer to make using it easier.
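For readers who haven't met Class Template Argument Deduction, here is a small C++17 sketch. The function name `ctad_sum` is invented for the example; the point is only that the template arguments are deduced from the initializers rather than spelled out.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Builds two objects whose template arguments are deduced, then sums them up.
int ctad_sum() {
    std::pair bounds{1, 2.5};     // deduced as std::pair<int, double>
    std::vector values{1, 2, 3};  // deduced as std::vector<int>

    static_assert(std::is_same_v<decltype(bounds), std::pair<int, double>>);
    static_assert(std::is_same_v<decltype(values), std::vector<int>>);

    int total = bounds.first;
    for (int v : values) total += v;
    return total;
}
```

This relies on deduction guides that the library itself has to provide, which is why the feature needs support from both the compiler front-end and the Standard Library.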
yep! Clang-cl uses this library btw (I think it's possible to get libc++ sorta working on windows, but it's not something that's officially supported).
Microsoft has come a long way lately when it comes to open source. Apple seems to be the only major company that is still not an enthusiastic participant.
Apple wrote and open sourced their C++ compiler (clang) and C++ standard library (libc++), both part of the LLVM project, years before Microsoft did. In fact, it was more than a decade ago, in 2007, that Apple open-sourced clang, and it is now widely adopted by the industry. Sure, Apple's participation in open source may not be "enthusiastic" with huge "Heart Open Source" billboards, but it gets real work done and has real positive impact.
Perhaps Apple's contributions aren't as visible, but just off the top of my head here's two major open source contributions from them: Swift[1] and FoundationDB[2]
Plus there's a page for all their OS components: https://opensource.apple.com. There's also WebKit and LLVM, which started at Apple and are still heavily driven by engineers at the company.
WebKit was certainly started at Apple. I just checked and the paper for LLVM was actually published before Lattner joined Apple, so I guess it's not quite true that it was "started" there (though, of course, most of its development has happened there).
That's correct. LLVM was started at UIUC, Apple later hired Lattner with the intention of making LLVM production-ready (in part as a replacement for the GNU toolchain), and ultimately to replace the GCC frontend by a custom / dedicated one.
The Clang project is part of the LLVM ecosystem, but is a major effort in its own right, so I think it is right to give credit where credit is due, in the same way that, e.g., the Rust compiler uses LLVM as a backend, but credit for it doesn't go to the LLVM project.
This is an extremely disingenuous statement. WebKit was forked from KHTML and everybody knows this. It "certainly" sounds like you're trying to rewrite history (and you make a similar assertion elsewhere in this thread).
I say it is disingenuous because it is completely counter to what can be found on something as mainstream as the Wikipedia page. One does not have to dig deep.
If you want to dig deeper, the mailing list archives are available for all, and I think given the contribution KHTML made to WebKit initially it is poor form to diminish its role.
"Blink started at Google". It's all context. As someone mentioned, it's technically correct, but in the context of this discussion thread, which is about open-source origins and contributions, saying Blink "started at Google" comes across as dishonest. Blink started at Google with a fork of WebKit, which itself was not created in a vacuum at Apple.
That’s correct, but in the context of this source release, I think it’s not more incorrect to say “Apple created WebKit” than it is to say “Microsoft created this library” (it started life as a Dinkumware product)
Apparently it is gaining traction. For example, the Tensorflow project is working on building a next generation library with Swift - https://github.com/tensorflow/swift
Only because Lattner is part of the Tensorflow team, and they still don't have a solid story regarding OS support, in spite of several people pointing out how much better supporting Julia would have been.
Which incidentally also supports Windows out of the box.
Thank you, the second link you provided was an extremely interesting read. It seems to me that Swift has reached a level of maturity where the language and the ecosystem can now exist independently of Apple’s support.
Apart from siblings' mentions, I think they also open-sourced networking libs/protocols, Bonjour and CUPS. CUPS should be mentioned especially, since that was a basis for network printing under Linux. (I hope I'm right on this.)
While I'm not a fan of the corporation, its strategy and most of its policies, some diligence is due here. They are known to release low-level stuff.
CUPS was not originally written or open sourced by Apple. CUPS was released in the late 1990s as open source and soon after became the default print system for most Linux distros. Apple hired the original creator and purchased the source code in 2007.
Because C++ templates have to ship as source, the MSVC STL source code was practically available to anybody. I think in theory I could get my hands on a free-of-charge version of MSVC without agreeing to the MS EULA, then extract the archive, again without agreeing to the MS EULA, and peek at the source code.
In reality, I don't bother to do that, and whenever I teach C++, I ignore MSVC and its STL. I'm willing to teach it, but no, it's MS that prohibits me from doing so.
The MSVC library has always been "source available" with Visual Studio, but I guess MS thought that they're very unlikely to actually make a profit from it[1] so they're giving it a permissive license instead.
[1] The "new MS" seems to delight a lot of people, but IMHO it's not all great; I'd much rather they not be "shedding" all this open-source stuff and kept selling software, if it meant they wouldn't stuff telemetry in everything they touch.
> And they don't even ship the CRT with the compiler any more
Just in case you or any other reader didn't know: that's just because they don't consider the ucrt compiler-version-specific anymore. Its distribution (binaries and source) moved into the Windows SDK, e.g. Windows Kits\10\Source\10.0.17763.0\ucrt. The vcruntime is still with MSVC.
You are correct. I'm using an obsolete name for the library that contains the dang functions I want to debug :) I honestly don't understand the rationale for putting it in the Windows SDK. It's not like any other compiler is going to use it and you could just update it with VC patches if it changed. The underlying OS API is even less likely to change anyway (UCRT being the C99 lib) so tying it to the Windows SDK is puzzling.
I'll concede I'm probably missing some obvious gotcha.
The separately compiled parts of the C++ Standard Library have shipped in VS installations for many years (look for xrngdev.cpp in a subdirectory typically named crt/src, that should be distinctive enough). This was intended for debugging purposes, covered by the VS EULA as always. These parts weren’t standalone-buildable and we occasionally forgot to ship a file or two, but we always tried to make it all source-readable.
That last statement was very bold. Do you think there is a direct link between open-sourcing this library and increased telemetry? What about VSCode and increased telemetry?
We really don’t like the preprocessor, but it is currently necessary for many things, especially supporting different architectures, compilers, and Standard version modes. We regularly purge our implementation of no-longer-necessary macros, even when it breaks (non-Standard) code, as we recently did with `_NOEXCEPT`, and we look forward to removing even more.
Please file an issue if you encounter macros that don’t appear to be necessary.
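As a hedged illustration of the `_NOEXCEPT` cleanup mentioned above: user code that leaned on that internal, non-Standard macro can use the standard `noexcept` keyword instead. The `Buffer` type below is hypothetical and only exists to show the spelling.

```cpp
#include <utility>

struct Buffer {
    int size = 0;

    // Code relying on the library's internal macro might have written:
    //   void swap(Buffer& other) _NOEXCEPT;
    // The portable spelling is the keyword itself.
    void swap(Buffer& other) noexcept { std::swap(size, other.size); }
};

// The noexcept operator checks the guarantee at compile time.
static_assert(noexcept(std::declval<Buffer&>().swap(std::declval<Buffer&>())),
              "Buffer::swap should be noexcept");
```

The keyword form does not depend on any implementation's private macros, so it keeps compiling when a library purges them.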
I was looking for this a couple of months ago but I couldn't find it in the CRT source. I don't see any tags indicating compiler version releases, so it's a bit useless for debugging purposes (which is the main use I would have for it).
The CRT source really should also be in an easy-to-find place instead of tucked away in the Windows SDK. But I've given up expecting any kind of actual developer consideration from the new Microsoft.
This is a separate layer from the CRT source, which is why you didn't find it there. It's maintained separately, versioned separately, and deployed separately.
This also doesn't actually appear to have any commits corresponding to any released compiler versions; commit history only goes back a couple of weeks. I suspect they'll show up once an actual release has shipped out of this repository.
Our commit history is stretched across multiple Microsoft-internal source control systems. In my career (2007+ in MSVC), that’s Source Depot, two databases of Team Foundation Server, and currently Git in Azure DevOps.
Unfortunately, we will not be making the history prior to the GitHub “Initial commit” public. It would be a huge amount of work, would bloat the git repo, and would be of little utility. (We occasionally have to perform programmer-archaeology, but rarely more than a year, and usually it’s only something that the developer who made the change can do due to personal memories.)
Consider this a fresh start. We will figure out a way to tag commits corresponding to shipped releases.
When you select the Windows SDK version you want to use in Visual Studio, it switches you over to the version of the ucrt in that SDK. It's not dependent on compiler version.
I’d suggest this is the opposite, this is Microsoft changing their ways for the better. It’s really hard to open source legacy code, this new code should be the restart of open source releases going forward, and who knows, maybe they’ll upload historical tags at some point in future. Personally I think there will be a lot tighter integration between open source git repos and Microsoft’s newer developer tools efforts, including eventually cross-platform versions.
Of course, but my point about the new Microsoft was in relation to UCRT source, which used to be part of the standard VC runtime library and distributed with the compiler.
This new release of the c++ stdlib is fine and I'd actually be fine with the entire C/C++ runtime (including UCRT) being on github. At least there'd be just one place to look for it.
As other posts pointed out, I believe it moved from the compiler to Windows SDK, because it was not considered compiler-specific any longer. Not sure if that addresses all of the complaint. I’d be first in line to say that the mess Microsoft has in distributing proprietary tool chains and SDKs is pretty terrible. The fact that I have to have Visual Studio installed for certain msbuild features, though legacy ones, still bugs me. But then, it could be worse (see Xcode).
Any color on why the choice of CMake for build? Versus bazel or open-sourcing Microsoft’s tool? Something like bazel remote builds is an absolute game changer for large C++ projects (especially versus CMake).
(re "We're working on a CMake build system... Until that's done, we're keeping our legacy build system around in the stl/msbuild subdirectory. (We're keeping those files in this repo, even though they're unusable outside of Microsoft, because they need to be updated whenever source files are added/renamed/deleted. ")
They want you to contribute, but you have to assign Microsoft your copyright and give them a guarantee in case anyone sues Microsoft over your work. (People sue without merit literally all the time, and legal fees are bankruptcy material for any individual.)
No, flipping, thanks.
Surely Microsoft can, you know, pay you for the copyright if they want to own it when they decide to merge your work? Or just take it on the same license. And as far as providing legal guarantees for Microsoft? Yeah, that's gonna be a no from any sane individual. Perhaps a big company like Google might agree to those terms, but also, perhaps not.
So no they don't want you to contribute, at all, or they'd do it right.
This looks quite disappointing. But as noted elsewhere you can read some source, or fork and refuse to contribute back I guess... But why?
All open-source licenses effectively require this; otherwise, by default, nobody would have a right to copy or use your contribution.
> give them a guarantee in case anyone sues microsoft over your work
I don't see an indemnification clause in the CLA. Which language are you referring to?
> (People sue without merit literally all the time and legal fees are bankruptcy material for any individual)
If they had even a shred of plausible belief, the estate of Elvis Presley could sue you tomorrow for defecating on a sofa at Graceland and potentially bankrupt you, even if you've never visited. What's your point?
Release your patch on the same license as the project. The End. Everyone is granted a license to the patch under that license. Nobody can take your code and re-license under a non-compatible license.
"a. Copyright License. You grant Microsoft, and those who receive the Submission directly or indirectly from Microsoft, a perpetual, worldwide, non-exclusive, royalty-free, irrevocable license in the Submission to reproduce, prepare derivative works of, publicly display, publicly perform, and distribute the Submission and such derivative works, and to sublicense any or all of the foregoing rights to third parties."
Copyright assignment has long been used as an argument of why not to contribute to FSF projects, and is arguably one of the reasons that Clang/LLVM were able to garner contributions from a much broader set of folks.
1. https://nuwen.net/stl.html