> It's not a violation of copyright to train a model.
This is a very bold assumption, and I doubt it will hold up in a court of law in every case. I think the nuanced question is: train a model that does what, exactly?
Let's say distributing meth recipes is illegal[1]. Can one legally sidestep that by training a model that spits out the meth recipe instead? No court will bother with the distinction; causation is well-trod ground.
1. As an example - not sure if it's illegal. You may replace it with classified nuclear weapon schematics if you like.
It's not illegal to train a model to spit out classified nuclear weapon schematics. Possessing the original data might be. Releasing software that does this might be illegal, but not for copyright reasons, which is the issue at hand.