> If code under a license, say the GPL, is published somewhere else by someone else (so I don't have the right to change the license), and I fork it, keep the license as it requires, and put it on GitHub, and GitHub then violates that license, aren't they violating the law? Don't they then have to remove that code?
You are violating the law, probably. The ToS would say something like "I hereby declare that I have the right to submit this software under the ToS".
I'm not violating the law, I'm violating the ToS. They should then remove my account and the offending code, lest they go on to violate the law themselves, no?
Yes, but I guess it won't happen until someone complains. It's similar to other content, e.g. on YouTube, but in reality nobody requests takedowns of forks/copies.
Alright, now suppose someone does. Doesn't that mean Microsoft has to rework everything they built with these codebases, if they rely on this legal argument rather than a fair use one? Doesn't even doing this set them up for a potentially very expensive compliance action?
I think what you say is true: either they trained on any open source code under fair use, no matter whether it was published on GitHub or anywhere else, and ignoring the license, OR they trained on data that potentially doesn't comply with their ToS (e.g. uploaded by someone who is not the author and who, regardless of license, couldn't legally agree to a ToS that gives away additional rights to the work).
However, the reality is that this is all extremely muddy, a far cry from proving that software A copied some code from software B, where you can just compare the source code. There are too many muddy steps, and you can bet that Microsoft will just get away with it.