Senator Introduces Bill To Compel More Transparency From AI Developers

A new bill introduced by Sen. Peter Welch (D-Vt.) aims to make it easier for human creators to find out if their work was used without permission to train artificial intelligence. NBC News reports: The Transparency and Responsibility for Artificial Intelligence Networks (TRAIN) Act would enable copyright holders to subpoena training records of generative AI models, if the holder can declare a "good faith belief" that their work was used to train the model. The developers would only need to reveal the training material that is "sufficient to identify with certainty" whether the copyright holder's works were used. Failing to comply would create a legal assumption -- until proven otherwise -- that the AI developer did indeed use the copyrighted work. [...]

In a news release, Welch said the TRAIN Act has been endorsed by several organizations -- including the Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA), the American Federation of Musicians, and the Recording Academy -- as well as major music labels -- including Universal Music Group, Warner Music Group and Sony Music Group.


Comments:
  • to compel more transparency in the Senate? That seems like a far more pressing issue.
    • to compel more transparency in the Senate? That seems like a far more pressing issue.

      Do you mean the Senate floor should have plexiglass installed? I don't understand.

    • to compel more transparency in the Senate? That seems like a far more pressing issue.

      When the group of Americans known as Lawmakers and Representatives can stand in front of those they (allegedly) represent and blatantly dismiss insider-trading corruption as some kind of fucking job perk, you should know that transparency is hardly the pressing issue. Blatantly allowing and accepting open corruption is.

      And when it’s out in the open like that, you have a MUCH larger problem than mere corruption. When they don’t even bother hiding the corruption, it’s because they already

  • This is alleged subpoena power. To confer Article III standing upon the court issuing the subpoena to compel the records, the copyright holder has to establish that training an AI model violates one of the exclusive rights of the copyright holder in 17 USC 106, which you may have noticed does not include using the copyrighted works for training / learning. The essential hurdle is that machine learning is on the opposite side of the idea/expression paradigm from the creative works themselves, because

    • You're quoting existing law, but Congress has the power to rewrite it. That's what (at least) one senator here is trying to do.
      • You're quoting existing law, but Congress has the power to rewrite it. That's what (at least) one senator here is trying to do.

        The bill in question does NOT change copyright law.

    • by dfghjk ( 711126 )

      "...which you may have noticed does not include using the copyrighted works for training / learning"

      LOL wut?

      "....an AI model is just a list of statistical measurements from the training data."

      LOL wut?

      Is an exact copy merely a "statistical measurement"? By this standard, no copyright holder could control reproduction.

      Bad faith arguments are apparently profitable, but lies are still lies.

      • I suggest you read up on the literature:
        https://en.wikipedia.org/wiki/Idea%E2%80%93expression_distinction

      • LOL wut?

        This isn't a difficult concept. Copyright law only addresses copying and performing creative works and their derivatives, not using or otherwise benefiting from them.

        Is an exact copy merely a "statistical measurement"? By this standard, no copyright holder could control reproduction.

        Copyright is not a grant of exclusive use of information or insights or data. It is a grant of exclusivity to the work itself.

  • All you need to do is produce material that doesn't exist, and if you don't, you're presumed guilty of using the material as intended. Good thing we'll soon have a convicted felon as president.

    • Good thing we'll soon have a convicted felon as president.

      Oh, come on! It's not like the convicted felon is hiring all his criminal friends and refusing to run background checks on them.
      "Trump’s team has not said why he hasn’t submitted his nominees for background checks"

  • So they have to turn over everything.
    Otherwise, how do you know that it wasn't in what wasn't turned over?

  • Who's the transparency for? The human owner, forced by rule of transparency law to tell the truth, or the AI, subtly learning to hide truths from even the human owner?

    (Hopefully we don’t forget to corral the other mind we’re developing here. Preferably before “it” realizes just how easy it is to fool and lie to humans. Every time.)

  • Let's see here... ANY copyright holder (or hundreds of them or thousands of them...) will be able to issue legal subpoenas to creators of AI projects (who will, of course, want lawyers to handle the raft of subpoenas in order to avoid stepping upon legal landmines) which will, in turn, essentially force the AI developers to prove their innocence.

    Well, companies like Apple and Google have tons of staff lawyers to deal with such stuff and will be able to afford all the legal overhead costs; they'll probably s

  • I can use an LLM to reword texts, and they won't come up in an n-gram-based search; any such search can only find similar ideas, not the expression. What do they do then? Technically, it won't be training on copyrighted data, but the model would learn it all the same.
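    A rough sketch of why (Python; the example strings are made up): after even a light paraphrase, essentially no word n-grams survive, so an exact-match comparison scores near zero.

      # Minimal sketch: n-gram overlap between an original and an LLM paraphrase.
      # Whitespace tokenization for illustration; real systems use shingling/hashing.

      def ngrams(text, n=5):
          words = text.lower().split()
          return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

      def jaccard(a, b):
          """Overlap of two n-gram sets: 1.0 = identical, 0.0 = nothing shared."""
          if not a or not b:
              return 0.0
          return len(a & b) / len(a | b)

      original   = "the quick brown fox jumps over the lazy dog near the river bank"
      paraphrase = "a fast tawny fox leaps across a sleepy hound beside the riverside"

      # Prints 0.0: not a single 5-gram survives the rewording,
      # even though the "idea" of the sentence is plainly the same.
      print(jaccard(ngrams(original), ngrams(paraphrase)))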
  • The developers would only need to reveal the training material that is "sufficient to identify with certainty" whether the copyright holder's works were used.

    If you didn't use their works, what are you supposed to reveal? The only way they could "identify with certainty" you didn't use their works is to provide them the entire training dataset, plus the complete training code including all random number seeds, plus the weights of your trained model, so they can repeat the complete training process (costing ~$100 million for a state-of-the-art LLM) and verify they end up with exactly the same model. Unless there's anything nondeterministic in the training proce
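    For a sense of scale, here's a sketch of just the RNG pinning a bit-exact rerun would require (PyTorch used purely as an illustrative framework; the seed value is arbitrary):

      # Sketch: reproducibility knobs for a bit-exact training rerun.
      import random
      import numpy as np
      import torch

      SEED = 1234  # every RNG the run touches must be pinned and disclosed

      random.seed(SEED)                         # Python's own RNG
      np.random.seed(SEED)                      # NumPy (shuffling, augmentation)
      torch.manual_seed(SEED)                   # CPU and all CUDA generators
      torch.use_deterministic_algorithms(True)  # raise on nondeterministic ops
      torch.backends.cudnn.benchmark = False    # no run-dependent kernel autotuning

      # Even with all of this, bit-exact reproduction still assumes identical
      # hardware, driver and library versions, data ordering, and cluster layout.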
