When the recent GitHub v. youtube-dl fiasco happened, I remember reading similarly strongly worded but dismissive comments about fair use, stating that it is quite obvious that youtube-dl's test code could never be fair use, and that fair use itself is a vague, shaky, underspecified provision of copyright law which can never be relied on.
To me, seeing youtube-dl's case as fair use is so much easier than seeing it in the use of hundreds of thousands of source code files without permission in order to build a proprietary product.
There is a crucial difference, though: the search engine links back to the content. If Google just displayed the content verbatim on its own site, it would definitely not be considered fair use. Even as things stand, several countries have restricted what Google can do when displaying, e.g., news.
Somehow, building a list of pointers to original content simply does not have the same ring to me as a product that rehashes all of the content. A rehashing of content sounds to me much more like, for example, publishing a sequel to my favourite book. After all, a sequel is just a rehashing of the same characters in new adventures. If we can't do that, why should Copilot be fine?
My point, however, was that I'm just utterly failing to see how the youtube-dl test thing could be more of a copyright problem than Copilot, this entire thing built on millions of other people's works.