For the record, this matters because distributing a copyrighted work carries a much higher penalty than simply copying it for yourself. If Meta seeded those books, they could be on the hook for a staggeringly large amount in damages. It’s on the order of hundreds or even thousands of dollars per download. And that’s across all the thousands of different books Meta grabbed.
Statutory damages in the US run from $750 to $30,000 per infringed work, and up to $150,000 per work if the infringement is willful (17 U.S.C. § 504(c)). “Statutory” means that the range is written into the law, and the aggrieved party doesn’t have to establish or prove actual losses.
Would distribution in the form of an AI not constitute a different form of seeding? I think it should.
No, you can’t find any copyrighted text inside the model’s weights.
It’s much more complicated than this. Given that models have been shown to spit out verbatim copies of some training material, it can be argued that the weights do in fact encode the material, just in some obfuscated way. Additionally, it can be argued that the output of the model is a derivative work of the original regardless of whether the original can be “found inside” the model weights, just by the nature of the process. As of now, there is no precedent that I know of on whether this constitutes redistribution of copyrighted material.
But doesn’t that apply only to individuals? Or am I mistaken?
“Corporations are people, my friend.”
No, it applies to “anyone.” It’s just that corporations can drag lawsuits on for years, so they get to make sweetheart deals for their crimes that the rest of us don’t.