Senator Introduces Bill To Compel More Transparency From AI Developers
A new bill introduced by Sen. Peter Welch (D-Vt) aims to make it easier for human creators to find out if their work was used without permission to train artificial intelligence. NBC News reports: The Transparency and Responsibility for Artificial Intelligence Networks (TRAIN) Act would enable copyright holders to subpoena training records of generative AI models, if the holder can declare a "good faith belief" that their work was used to train the model. The developers would only need to reveal the training material that is "sufficient to identify with certainty" whether the copyright holder's works were used. Failing to comply would create a legal assumption -- until proven otherwise -- that the AI developer did indeed use the copyrighted work. [...]
In a news release, Welch said the TRAIN Act has been endorsed by several organizations -- including the Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA), the American Federation of Musicians, and the Recording Academy -- as well as major music labels -- including Universal Music Group, Warner Music Group and Sony Music Group.
What about a bill (Score:2)
to compel more transparency in the senate? That seems like a far more pressing issue.
Re: (Score:1)
Do you mean the senate floor should have plexiglass installed? I don't understand.
The Transparency of Corruption. (Score:2)
to compel more transparency in the senate? That seems like a far more pressing issue.
When the group of Americans known as Lawmakers and Representatives can stand in front of those they (allegedly) represent and blatantly dismiss insider trading corruption as some kind of fucking job perk, you should know that transparency is hardly the pressing issue. Blatantly allowing and accepting open corruption is.
And when it’s out in the open like that, you have a MUCH larger problem than mere corruption. When they don’t even bother hiding the corruption, it’s because they already
Flawed legal theory (Score:1)
This alleged subpoena power has a problem: to confer Article III standing on the court issuing the subpoena to compel the records, the copyright holder has to establish that training an AI model violates one of the exclusive rights in 17 USC 106, which you may have noticed does not include using copyrighted works for training/learning. The essential hurdle is that machine learning sits on the opposite side of the idea/expression distinction from the creative works themselves, because
Re: (Score:2)
You're quoting existing law, but congress has power to rewrite the law. That's what one (at least) senator here is trying to do.
Re: (Score:2)
The bill in question does NOT change copyright law.
Re: (Score:2)
"...which you may have noticed does not include using the copyrighted works for training / learning"
LOL wut?
"....an AI model is just a list of statistical measurements from the training data."
LOL wut?
Is an exact copy merely a "statistical measurement"? By this standard, no copyright holder could control reproduction.
Bad faith arguments are apparently profitable, but lies are still lies.
Re: (Score:1)
I suggest you should read up on the literature
https://en.wikipedia.org/wiki/Idea%E2%80%93expression_distinction
Re: (Score:2)
LOL wut?
This isn't a difficult concept. Copyright law only addresses copying and performing creative works and their derivatives, not using or otherwise benefiting from them.
Is an exact copy merely a "statistical measurement"? By this standard, no copyright holder could control reproduction.
Copyright is not a grant of exclusive use of information or insights or data. It is a grant of exclusivity to the work itself.
Re: (Score:1)
>Is an exact copy merely a "statistical measurement"? By this standard, no copyright holder could control reproduction.
No, it is not possible, for example, to reverse a 6GB image model and pull any of the 2 petabytes' worth of training images out of it, because the model generalizes correlations between the statistics of features in the training data; it does not instantiate each one of them in the model weights.
For example if I have a class of 500 people, and I pull from that the mean
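The truncated mean example can be sketched in a few lines of Python; this is a toy illustration with made-up numbers, not anything from the bill or the thread, showing that an aggregate statistic retains no individual record:

```python
import statistics

# Hypothetical "class" of 500 people with individual scores
# (deterministic fake data, values in the range 50..100).
scores = [50 + (i * 37) % 51 for i in range(500)]

mean = statistics.mean(scores)

# The single aggregate number discards the individuals: many
# different datasets produce the same mean, so the original 500
# values cannot be recovered from it.
flat = [mean] * 500
assert statistics.mean(flat) == mean and flat != scores
```

The analogy is loose (model weights capture far more than one mean), but it conveys the commenter's point that aggregation is lossy.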
she's a witch! (Score:2)
All you need to do is produce material that doesn't exist and if you don't, you're guilty of using material as it was intended. Good thing we'll soon have a convicted felon as president.
Re: (Score:1)
Good thing we'll soon have a convicted felon as president.
Oh, come on! It's not like the convicted felon is hiring all his criminal friends and refusing to run background checks on them.
"Trump’s team has not said why he hasn’t submitted his nominees for background checks"
Prove a negative much? (Score:2)
So they have to turn over everything.
Otherwise, how do you know that it wasn't in what wasn't turned over?
Transparency for Who(m), or What? (Score:2)
Who's the transparency for? The human owner forced by rule of transparency law to tell the truth, or the AI subtly learning to hide truths from even the human owner?
(Hopefully we don’t forget to corral the other mind we’re developing here. Preferably before “it” realizes just how easy it is to fool and lie to humans. Every time.)
Regulatory Capture at work... (Score:2)
Let's see here... ANY copyright holder (or hundreds of them or thousands of them...) will be able to issue legal subpoenas to creators of AI projects (who will, of course, want lawyers to handle the raft of subpoenas in order to avoid stepping upon legal landmines) which will, in turn, essentially force the AI developers to prove their innocence.
Well, companies like Apple and Google have tons of staff lawyers to deal with such stuff and will be able to afford all the legal overhead costs; they'll probably s
What about derived/synthetic text? (Score:2)
How would that work? (Score:2)
The developers would only need to reveal the training material that is "sufficient to identify with certainty" whether the copyright holder's works were used.
If you didn't use their works, what are you supposed to reveal? The only way they could "identify with certainty" you didn't use their works is to provide them the entire training dataset, plus the complete training code including all random number seeds, plus the weights of your trained model, so they can repeat the complete training process (costing ~$100 million for a state of the art LLM) and verify they end up with exactly the same model. Unless there's anything nondeterministic in the training proce
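The reproducibility burden described above can be shown in miniature; a hedged sketch (a toy "training run" via seeded shuffling and hashing, nothing like a real LLM pipeline) of why bitwise verification requires the complete dataset and every seed:

```python
import hashlib
import random

def toy_train(dataset, seed):
    """Toy stand-in for a training run: a seeded shuffle of the
    data, hashed into a 'model fingerprint'. Real training adds
    further nondeterminism (GPU reduction order, dropout, etc.)."""
    rng = random.Random(seed)
    order = list(dataset)
    rng.shuffle(order)
    return hashlib.sha256("|".join(order).encode()).hexdigest()

data = ["doc_%d" % i for i in range(12)]

# Identical data and seed reproduce the same "model" exactly...
assert toy_train(data, seed=42) == toy_train(data, seed=42)

# ...but drop a single training document and the result differs,
# so proving a work was NOT used means re-running with the full
# dataset, code, and seeds, as the comment argues.
assert toy_train(data, seed=42) != toy_train(data[:-1], seed=42)
```

Even this toy breaks down if any step is nondeterministic, which is exactly the caveat the truncated comment was heading toward.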
Taking it further... (Score:2)
Can we hold the copyright conglomerates to the same standard, so they have to prove, e.g., that no movie they produce relies on any member of the production having seen a previous copyrighted movie?
Notably, according to the US Constitution, the raison d’être of copyright is
To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;
So if generative AI is more prolific in those domains, arguably, we may have reached the end of the public utility of patents and copyright.
Re: (Score:1)
That may be, but model weights are not within the subject matter of copyright at all; see, e.g., 17 USC 102(b):
(b) In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.
Payola (Score:2)