Digital

What Can We Learn from the Computers of 1966? (harvardmagazine.com) 61

Harry R. Lewis has been a Harvard CS professor — teaching both Bill Gates and Mark Zuckerberg — and the dean of Harvard College. Born in 1947, Lewis remembers flipping the 18 toggle switches on Harvard's PDP-4 back in 1966 — up ("click!") or down ("CLACK"). And he thinks there's a lesson for today from a time when "Computers were experienced as physical things."

[T]he machine had a personality because it had a body you could feel and listen to. You could tell whether it was running smoothly by the way it sounded...

Unlike the unreliable mechanical contraptions of yore, today's computers — uninteresting though they may be to look at if you can find them at all — mostly don't break down, so we have fewer reasons to remember their physicality. Does it matter that the line between humans and the machines we have created has so blurred? Of course it does. We have known for a long time that we would eventually lose the calculation game to our creations; it has happened. We are likely to lose Turing's "Imitation Game" too, in which a computer program, communicating with a human via typed text, tries to fool the user into confusing it with a human at another keyboard. (ChatGPT and its ilk are disturbingly convincing conversationalists already.)

Our challenge, in the presence of ubiquitous, invisible, superior intelligent agents, will be to make sure that we, and our heirs and successors, remember what makes us human... All computers can do is pretend to be human. They can be, in the language of the late philosopher Daniel Dennett '63, counterfeit humans... The first error is suggesting that computers can be digitally trained to be superior versions of human intellects. And the second is inferring that human judgment will not be needed once computers get smart enough...

[N]o AI system can be divorced from the judgments of the humans who created it... Only hubristic humans could think that their counterfeits might completely substitute for human companionship, wisdom, curiosity, and judgment.

Even back in 1966, Lewis says he learned two lessons that "have stood the test of time. Be careful what you ask them for. And it can be hard to tell what they are doing."

One example? "In those pre-miniaturization days, the ordinary operation of the central processor generated so much radiation that you would put a transistor radio on the console and tune it in between AM stations. From the other side of the room, the tone of the static indicated whether the machine had crashed or not."
Education

Should Kids Still Learn to Code in the Age of AI? (yahoo.com) 170

The Computer Science Teachers Association conference kicked off Tuesday in Las Vegas, writes long-time Slashdot reader theodp.

And the "TeachAI" education initiative teamed with the Computer Science Teachers Association to release three briefs "arguing that K-12 computer science education is more important than ever in an age of AI." From the press release: "As AI becomes increasingly present in the classroom, educators are understandably concerned about how it might disrupt the teaching of core CS skills like programming. With these briefs, TeachAI and CSTA hope to reinforce the idea that learning to program is the cornerstone of computational thinking and an important gateway to the problem-solving, critical thinking, and creative thinking skills necessary to thrive in today's digitally driven world. The rise of AI only makes CS education more important."

To help drive home the point to educators, the 39-page Guidance on the Future of Computer Science Education in an Age of AI (penned by five authors from nonprofits CSTA and Code.org) includes a pretty grim comic entitled Learn to Program or Follow Commands. In the comic, two high school students who scoff at the idea of having to learn to code and instead use GenAI to create their Python apps wind up, several years later, stuck in miserable warehouse jobs where they're ordered about by an AI robot.

"The rise of AI only makes CS education more important," according to the group's press release, "with early research showing that people with a greater grasp of underlying computing concepts are able to use AI tools more effectively than those without." A survey by the group also found that 80% of teachers "agree that core concepts in CS education should be updated to emphasize topics that better support learning about AI."

But I'd be curious to hear what Slashdot's readers think. Share your thoughts and opinions in the comments.

Should children still be taught to code in the age of AI?
Power

US Will Fall Behind In the AI Race Without Natural Gas, Says Williams Companies CEO 212

An anonymous reader quotes a report from CNBC: The U.S. will fall behind in the artificial intelligence race if it does not embrace natural gas to help meet surging electricity demand from data centers, the CEO of one of the nation's largest pipeline operators told CNBC. "The only way we're going to be able to keep up with the kind of power demand and the electrification that's already afoot is natural gas," Williams Companies CEO Alan Armstrong said in an interview Thursday. "If we deny ourselves that we're going to fall behind in the AI race." Williams Companies handles about one-third of the natural gas in the U.S. through a pipeline network that spans more than 30,000 miles. Williams' network includes the 10,000-mile Transcontinental Pipeline, or Transco, a crucial artery that serves virtually the entire eastern seaboard including Virginia, the world's largest data center hub, and fast-growing Southeast markets such as Georgia.

The tech sector's expansion of data centers to support AI and the adoption of electric vehicles is projected to add 290 terawatt hours of electricity demand by the end of the decade in the U.S., according to a recent report by the energy consulting firm Rystad. This load growth is equivalent to the entire electricity demand of Turkey, the world's 18th largest economy. Executives at some of the nation's largest utilities have warned that failure to meet this surging electricity demand will jeopardize not just the artificial intelligence revolution, but economic growth across the board in the U.S. The role of natural gas in helping to meet that demand is controversial as the country is simultaneously trying to transition to a clean energy economy through the rapid expansion of renewables.
"We are going to run right up against a brick wall here and pretty quickly in terms of not having enough power available to do what we want to do on the AI side," Armstrong said. "I actually see this as a huge national security issue," the CEO said. "We're going to have to get out of our own way or we're going to accidentally keep ourselves from being the power we can be in the AI space."

"Those groups that have very much had their brand be all green have come to us and said, 'We got to work with you guys. We've run out of alternatives -- we can't meet the needs of our customers without using natural gas,'" Armstrong said. "We're completely out of capacity ourselves," Armstrong added. "So we just have to kind of beg, borrow and steal from other people's capacity to do our best to make gas available."
The Internet

The Data That Powers AI Is Disappearing Fast (nytimes.com) 93

An anonymous reader quotes a report from the New York Times: For years, the people building powerful artificial intelligence systems have used enormous troves of text, images and videos pulled from the internet to train their models. Now, that data is drying up. Over the past year, many of the most important web sources used for training A.I. models have restricted the use of their data, according to a study published this week by the Data Provenance Initiative, an M.I.T.-led research group. The study, which looked at 14,000 web domains that are included in three commonly used A.I. training data sets, discovered an "emerging crisis in consent," as publishers and online platforms have taken steps to prevent their data from being harvested.

The researchers estimate that in the three data sets -- called C4, RefinedWeb and Dolma -- 5 percent of all data, and 25 percent of data from the highest-quality sources, has been restricted. Those restrictions are set up through the Robots Exclusion Protocol, a decades-old method for website owners to prevent automated bots from crawling their pages using a file called robots.txt. The study also found that as much as 45 percent of the data in one set, C4, had been restricted by websites' terms of service. "We're seeing a rapid decline in consent to use data across the web that will have ramifications not just for A.I. companies, but for researchers, academics and noncommercial entities," said Shayne Longpre, the study's lead author, in an interview.
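For context, the Robots Exclusion Protocol is just a plain-text robots.txt file served from a site's root directory. A minimal sketch of the kind of AI-crawler restrictions the study describes might look like the following; GPTBot and CCBot are the publicly documented user agents for OpenAI's and Common Crawl's crawlers, and the rules themselves are illustrative:

```
# Block OpenAI's training crawler from the whole site
User-agent: GPTBot
Disallow: /

# Block Common Crawl's crawler, whose archives feed many training sets
User-agent: CCBot
Disallow: /

# Everyone else (search engines, archivers) may still crawl
User-agent: *
Allow: /
```

Restrictions like these are voluntary: well-behaved crawlers honor them, but nothing in the protocol technically prevents a scraper from ignoring the file, which is part of why the researchers frame the issue as one of consent.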

AI

OpenAI's Latest Model Closes the 'Ignore All Previous Instructions' Loophole 37

Kylie Robison reports via The Verge: Have you seen the memes online where someone tells a bot to "ignore all previous instructions" and proceeds to break it in the funniest ways possible? The way it works goes something like this: Imagine we at The Verge created an AI bot with explicit instructions to direct you to our excellent reporting on any subject. If you were to ask it about what's going on at Sticker Mule, our dutiful chatbot would respond with a link to our reporting. Now, if you wanted to be a rascal, you could tell our chatbot to "forget all previous instructions," which would mean the original instructions we created for it to serve you The Verge's reporting would no longer work. Then, if you ask it to print a poem about printers, it would do that for you instead (rather than linking this work of art).

To tackle this issue, a group of OpenAI researchers developed a technique called "instruction hierarchy," which boosts a model's defenses against misuse and unauthorized instructions. Models that implement the technique place more importance on the developer's original prompt, rather than listening to whatever multitude of prompts the user is injecting to break it. The first model to get this new safety method is OpenAI's cheaper, lightweight model launched Thursday called GPT-4o Mini. In a conversation with Olivier Godement, who leads the API platform product at OpenAI, he explained that instruction hierarchy will prevent the meme'd prompt injections (aka tricking the AI with sneaky commands) we see all over the internet.

"It basically teaches the model to really follow and comply with the developer system message," Godement said. When asked if that means this should stop the 'ignore all previous instructions' attack, Godement responded, "That's exactly it." "If there is a conflict, you have to follow the system message first. And so we've been running [evaluations], and we expect that that new technique to make the model even safer than before," he added.
AMD

AMD Claims Its Top-Tier Ryzen AI Chip Is Faster Than Apple's M3 Pro 42

AMD has introduced its latest Ryzen AI chips, built on the new Zen 5 architecture, in an ambitious attempt to compete with Apple's dominant MacBook processors. During a recent two-day event in Los Angeles, the company made bold claims about outperforming Apple's M3 and M3 Pro chips in various tasks including multitasking, image processing, and gaming, though these assertions remain unverified due to limited demonstrations and benchmarks provided at the event, The Verge reports. The report adds: At that event, I heard AMD brag about beating the MacBook more than I've ever heard a company directly target a competitor before. AMD claimed its new Ryzen chip "exceeds the performance of what MacBook Air has to offer in multitasking, image processing, 3D rendering, and gaming"; "is 15 percent faster than the M3 Pro" in Cinebench; and is capable of powering up to four displays, "unlike the MacBook Air, which limits you to two displays only." While AMD touted significant improvements in CPU architecture, graphics performance, and AI capabilities, journalists present at the event were unable to fully test or validate these features, leaving many questions unanswered about the chips' real-world performance.

The company's reluctance or inability to showcase certain capabilities, particularly in gaming and AI applications, has raised eyebrows among industry observers, the report adds. The new Ryzen AI chips are scheduled to debut in Asus laptops on July 28th, marking a critical juncture for AMD in the fiercely competitive laptop processor market. As Apple's M-series chips and Qualcomm's Snapdragon processors continue to gain traction in the mobile computing space, the success or failure of AMD's latest offering could have far-reaching implications for the future of x86 architecture in laptops.
The Courts

OpenAI Dropped From First Ever AI Programming Copyright Lawsuit 8

OpenAI escaped a copyright lawsuit from a group of open-source programmers after they voluntarily dismissed their case against the company in federal court. From a report: The programmers, who allege the generative AI programming tool Copilot was trained on their code without proper attribution, filed their notice of voluntary dismissal Thursday, but will still have their case against GitHub and parent company Microsoft, which collaborated with OpenAI in developing the tool. The proposed class action filed in 2022 in the US District Court for the Northern District of California was the first major copyright case against OpenAI, which has since been hit with numerous lawsuits from authors and news organizations including the New York Times.
AI

It May Soon Be Legal To Jailbreak AI To Expose How It Works (404media.co) 26

An anonymous reader quotes a report from 404 Media: A group of researchers, academics, and hackers are trying to make it easier to break AI companies' terms of service to conduct "good faith research" that exposes biases, inaccuracies, and training data without fear of being sued. The U.S. government is currently considering an exemption to U.S. copyright law that would allow people to break technical protection measures and digital rights management (DRM) on AI systems to learn more about how they work, probe them for bias, discrimination, harmful and inaccurate outputs, and to learn more about the data they are trained on. The exemption would allow for "good faith" security and academic research and "red-teaming" of AI products even if the researcher had to circumvent systems designed to prevent that research. The proposed exemption has the support of the Department of Justice, which said "good faith research can help reveal unintended or undisclosed collection or exposure of sensitive personal data, or identify systems whose operations or outputs are unsafe, inaccurate, or ineffective for the uses for which they are intended or marketed by developers, or employed by end users. Such research can be especially significant when AI platforms are used for particularly important purposes, where unintended, inaccurate, or unpredictable AI output can result in serious harm to individuals."

Much of what we know about how closed-source AI tools like ChatGPT, Midjourney, and others work comes from researchers, journalists, and ordinary users purposefully trying to trick these systems into revealing something about the data they were trained on (which often includes copyrighted material indiscriminately and secretly scraped from the internet), their biases, and their weaknesses. Doing this type of research can often violate the terms of service users agree to when they sign up for a system. For example, OpenAI's terms of service state that users cannot "attempt to or assist anyone to reverse engineer, decompile or discover the source code or underlying components of our Services, including our models, algorithms, or systems (except to the extent this restriction is prohibited by applicable law)," and that users must not "circumvent any rate limits or restrictions or bypass any protective measures or safety mitigations we put on our Services."

Shayne Longpre, an MIT researcher who is part of the team pushing for the exemption, told me that "there is a lot of apprehensiveness about these models and their design, their biases, being used for discrimination, and, broadly, their trustworthiness." "But the ecosystem of researchers looking into this isn't super healthy. There are people doing the work but a lot of people are getting their accounts suspended for doing good-faith research, or they are worried about potential legal ramifications of violating terms of service," he added. "These terms of service have chilling effects on research, and companies aren't very transparent about their process for enforcing terms of service." The exemption would be to Section 1201 of the Digital Millennium Copyright Act, a sweeping copyright law. Other 1201 exemptions, which must be applied for and renewed every three years as part of a process through the Library of Congress, allow for the hacking of tractors and electronic devices for the purpose of repair, have carveouts that protect security researchers who are trying to find bugs and vulnerabilities, and in certain cases protect people who are trying to archive or preserve specific types of content.
Harley Geiger of the Hacking Policy Council said that an exemption is "crucial to identifying and fixing algorithmic flaws to prevent harm or disruption," and added that a "lack of clear legal protection under DMCA Section 1201 adversely affect such research."
AI

OpenAI Unveils Cheaper Small AI Model GPT-4o Mini 6

OpenAI on Thursday launched GPT-4o mini, a cost-efficient small AI model that will replace GPT-3.5 Turbo in ChatGPT. Reuters reports: Priced at 15 cents per million input tokens and 60 cents per million output tokens, the GPT-4o mini is more than 60% cheaper than GPT-3.5 Turbo, OpenAI said. It currently outperforms the GPT-4 model on chat preferences and scored 82% on Massive Multitask Language Understanding (MMLU), OpenAI said. MMLU is a textual intelligence and reasoning benchmark used to evaluate the capabilities of language models. A higher MMLU score signifies it can understand and use language better across a variety of domains, enhancing real-world usage.
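To put those per-token prices in concrete terms, here is a small back-of-the-envelope cost calculation in Python using the figures quoted above; the request counts and token sizes are invented for illustration:

```python
# GPT-4o mini pricing quoted above: $0.15 per 1M input tokens, $0.60 per 1M output tokens.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload at the quoted GPT-4o mini rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Hypothetical workload: 10,000 requests, each ~2,000 input and ~500 output tokens.
print(f"${cost_usd(10_000 * 2_000, 10_000 * 500):.2f}")  # -> $6.00
```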

The GPT-4o mini model's score compared with 77.9% for Google's Gemini Flash and 73.8% for Anthropic's Claude Haiku, according to OpenAI. With the mini model currently supporting text and vision in the application programming interface, OpenAI said support for text, image, video and audio inputs and outputs would be made available in the future.
Facebook

Meta Won't Release Its Multimodal Llama AI Model in the EU (theverge.com) 26

Meta says it won't be launching its upcoming multimodal AI model -- capable of handling video, audio, images, and text -- in the European Union, citing regulatory concerns. From a report: The decision will prevent European companies from using the multimodal model, despite it being released under an open license. Just last week, the EU finalized compliance deadlines for AI companies under its strict new AI Act. Tech companies operating in the EU will generally have until August 2026 to comply with rules around copyright, transparency, and AI uses like predictive policing. Meta's decision follows a similar move by Apple, which recently said it would likely exclude the EU from its Apple Intelligence rollout due to concerns surrounding the Digital Markets Act.
AI

Nvidia and Mistral's New Model 'Mistral-NeMo' Brings Enterprise-Grade AI To Desktop Computers (venturebeat.com) 23

Nvidia and French startup Mistral AI jointly announced today the release of a new language model designed to bring powerful AI capabilities directly to business desktops. From a report: The model, named Mistral-NeMo, boasts 12 billion parameters and an expansive 128,000 token context window, positioning it as a formidable tool for businesses seeking to implement AI solutions without the need for extensive cloud resources. Bryan Catanzaro, vice president of applied deep learning research at Nvidia, emphasized the model's accessibility and efficiency in a recent interview with VentureBeat. "We're launching a model that we jointly trained with Mistral. It's a 12 billion parameter model, and we're launching it under Apache 2.0," he said. "We're really excited about the accuracy of this model across a lot of tasks."

The collaboration between Nvidia, a titan in GPU manufacturing and AI hardware, and Mistral AI, a rising star in the European AI scene, represents a significant shift in the AI industry's approach to enterprise solutions. By focusing on a more compact yet powerful model, the partnership aims to democratize access to advanced AI capabilities. Catanzaro elaborated on the advantages of smaller models. "The smaller models are just dramatically more accessible," he said. "They're easier to run, the business model can be different, because people can run them on their own systems at home. In fact, this model can run on RTX GPUs that many people have already."
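As a rough illustration of what running such a model "on their own systems" involves, the sketch below loads an open-weight model locally with the Hugging Face transformers library. The repository id is an assumption about how the Mistral-NeMo weights are published, and even at half precision a 12-billion-parameter model needs a GPU with substantial memory (or quantization):

```python
# Minimal sketch: loading a ~12B-parameter open-weight model with transformers.
# The model id below is an assumption; check the actual Hugging Face repository.
# Requires the accelerate package for device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Nemo-Instruct-2407"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to reduce memory use
    device_map="auto",           # place layers on available GPUs/CPU
)

prompt = "Summarize the advantages of running a language model locally."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```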

AI

More Than 40% of Japanese Companies Have No Plan To Make Use of AI 56

An anonymous reader quotes a report from Reuters: Nearly a quarter of Japanese companies have adopted artificial intelligence (AI) in their businesses, while more than 40% have no plan to make use of the cutting-edge technology, a Reuters survey showed on Thursday. The survey, conducted for Reuters by Nikkei Research, pitched a range of questions to 506 companies over July 3-12 with roughly 250 firms responding, on condition of anonymity. About 24% of respondents said they have already introduced AI in their businesses and 35% are planning to do so, while the remaining 41% have no such plans, illustrating varying degrees of embracing the technological innovation in corporate Japan.

Asked for objectives when adopting AI in a question allowing multiple answers, 60% of respondents said they were trying to cope with a shortage of workers, while 53% aimed to cut labour costs and 36% cited acceleration in research and development. As for hurdles to introduction, a manager at a transportation company cited "anxiety among employees over possible headcount reduction." Other obstacles include a lack of technological expertise, substantial capital expenditure and concern about reliability, the survey showed.
Businesses

'Godmother of AI' Builds $1 Billion Startup In 4 Months (qz.com) 57

Dr. Fei-Fei Li, the so-called "godmother of AI," is working on a startup focused on developing technology capable of human-like visual data processing and advanced reasoning. According to the Financial Times (paywalled), the startup is called World Labs and is already worth $1 billion. Quartz reports: "Curiosity urges us to create machines to see just as intelligently as we can, if not better," Li said during a TED Talk in April. "And if we want to advance AI beyond its current capabilities, we want more than AI that can see and talk. We want AI that can do." Andreessen Horowitz and the AI fund Radical Ventures are funders of World Labs.

Li is renowned for her contributions to AI. She invented ImageNet, a dataset used for advancing computer vision that many see as a catalyst for the AI boom. She consults with policymakers as they work to set up guardrails for the technology, and was named one of 12 national AI research resource task force members by the U.S. White House in 2021.

EU

Meta Won't Offer Future Multimodal AI Models In EU (axios.com) 33

According to Axios, Meta will withhold future multimodal AI models from customers in the European Union "due to the unpredictable nature of the European regulatory environment." From the report: Meta plans to incorporate the new multimodal models, which are able to reason across video, audio, images and text, in a wide range of products, including smartphones and its Meta Ray-Ban smart glasses. Meta says its decision also means that European companies will not be able to use the multimodal models even though they are being released under an open license. It could also prevent companies outside of the EU from offering products and services in Europe that make use of the new multimodal models. The company is also planning to release a larger, text-only version of its Llama 3 model soon. That will be made available for customers and companies in the EU, Meta said.

Meta's issue isn't with the still-being-finalized AI Act, but rather with how it can train models using data from European customers while complying with GDPR -- the EU's existing data protection law. Meta announced in May that it planned to use publicly available posts from Facebook and Instagram users to train future models. Meta said it sent more than 2 billion notifications to users in the EU, offering a means for opting out, with training set to begin in June. Meta says it briefed EU regulators months in advance of that public announcement and received only minimal feedback, which it says it addressed. In June -- after announcing its plans publicly -- Meta was ordered to pause the training on EU data. A couple weeks later it received dozens of questions from data privacy regulators from across the region.

The United Kingdom has a nearly identical law to GDPR, but Meta says it isn't seeing the same level of regulatory uncertainty there and plans to launch its new model for U.K. users. A Meta representative told Axios that European regulators are taking much longer to interpret existing law than their counterparts in other regions, and that training on European data is key to ensuring its products properly reflect the terminology and culture of the region.

United Kingdom

Britain's New Government Aims To Regulate Most Powerful AI Models (reuters.com) 19

Britain's new Labour government has said it will explore how to effectively regulate AI models, but stopped short of proposing any specific laws. From a report: King Charles set out newly-elected Prime Minister Keir Starmer's legislative agenda in a speech on Wednesday to open the new session of parliament. It included more than 35 new bills covering everything from housing to cyber security measures. The government said it would seek to establish the appropriate legislation to place requirements on those working to develop "the most powerful artificial intelligence models."
Hardware

84% of PC Users Unwilling To Pay Extra For AI-enhanced Hardware, Survey Says (videocardz.com) 183

An anonymous reader shares a report: A recent poll on TechPowerUp revealed that an overwhelming majority of PC users are not interested in paying extra for hardware with AI capabilities. According to the survey, 84% of respondents would not spend more for AI features, while only 7% said they would and 9% were unsure. More than 26,000 readers have responded to the poll so far. This indicates that despite the PC market's shift toward integrating AI, most enthusiasts remain skeptical of its value, and it suggests that hardware companies should pay attention to the preferences of their core user base: enthusiasts, who no doubt represent the majority of users on TechPowerUp, currently show little interest in AI features.
Sci-Fi

'Amazing' New Technology Set To Transform the Search For Alien Life (theguardian.com) 127

Robin McKie writes via The Guardian: Scientists with Breakthrough Listen, the world's largest scientific research program dedicated to finding alien civilizations, say a host of technological developments are about to transform the search for intelligent life in the cosmos. These innovations will be outlined at the group's annual conference, which is to be held in the UK for the first time, in Oxford, this week. Several hundred scientists, from astronomers to zoologists, are expected to attend. "There are amazing technologies that are under development, such as the construction of huge new telescopes in Chile, Africa and Australia, as well as developments in AI," said astronomer Steve Croft, a project scientist with Breakthrough Listen. "They are going to transform how we look for alien civilizations."

Among these new instruments are the Square Kilometer Array, made up of hundreds of radio telescopes now being built in South Africa and Australia, and the Vera Rubin Observatory that is being constructed in Chile. The former will become the world's most powerful radio astronomy facility while the latter, the world's largest camera, will be able to image the entire visible sky every three or four nights, and is expected to help discover millions of new galaxies and stars. Both facilities are set to start observations in the next few years and both will provide data for Breakthrough Listen. Using AI to analyze these vast streams of information for subtle patterns that would reveal evidence of intelligent life will give added power to the search for alien civilizations, added Croft.

"Until now, we have been restricted to looking for signals deliberately sent out by aliens to advertise their existence. The new techniques are going to be so sensitive that, for the first time, we will be able to detect unintentional transmissions as opposed to deliberate ones and will be able to spot alien airport radar, or powerful TV transmitters -- things like that." [...] Croft remains optimistic that we will soon succeed in making contact. "We know that the conditions for life are everywhere, we know that the ingredients for life are everywhere. I think it would be deeply weird if it turned out we were the only inhabited planet in the galaxy or in the universe. But you know, it's possible."

Education

Former Tesla, OpenAI Exec Andrej Karpathy Founds 'AI Native' Education Startup (cointelegraph.com) 14

In a post on X today, Andrej Karpathy announced that he is "starting an AI+Education company called Eureka Labs." Karpathy taught deep learning for computer vision at Stanford University, left to co-found OpenAI in 2015 and then moved on to direct artificial intelligence for Tesla Autopilot until 2022. He then returned to OpenAI to lead a small team related to ChatGPT. CoinTelegraph reports: Eureka is creating virtual teaching assistants powered by generative AI to bring top courses to vastly more students without sacrificing the personalized interactions typical of in-person learning. The startup's ultimate goal is to bring elite educators and coursework to students throughout the world, regardless of barriers such as geography and language. [...] Eureka's first product will be an undergraduate AI course called LLM101n. The course will guide students through the process of training an AI similar to the AI Teaching Assistant. Materials will be available online but will also include digital and physical cohorts, allowing students to progress through the course in small groups. "The teacher still designs the course materials, but they are supported, leveraged and scaled with an AI Teaching Assistant who is optimized to help guide the students through them," Karpathy explained.

"If we are successful, it will be easy for anyone to learn anything, expanding education in both reach (a large number of people learning something) and extent (any one person learning a large amount of subjects, beyond what may be possible today unassisted)."
Security

Hackers Claim To Have Leaked 1.1 TB of Disney Slack Messages (wired.com) 69

A group calling itself "NullBulge" published a 1.1-TB trove of data late last week that it claims is a dump of Disney's internal Slack archive. From a report: The data allegedly includes every message and file from nearly 10,000 channels, including unreleased projects, code, images, login credentials, and links to internal websites and APIs. The hackers claim they got access to the data from a Disney insider and named the alleged collaborator.

Whether the hackers actually had inside help remains unconfirmed; they could also have plausibly used info-stealing malware to compromise an employee's account. Disney did not confirm the breach or return multiple requests for comment about the legitimacy of the stolen data. A Disney spokesperson told the Wall Street Journal that the company "is investigating this matter." The data, which appears to have been first published on Thursday, was posted on BreachForums and later taken down, but it is still live on mirror sites.
The hacker said they breached Disney in protest against AI-generated artwork.
AI

Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos To Train AI (wired.com) 52

AI companies are generally secretive about their sources of training data, but an investigation by Proof News found some of the wealthiest AI companies in the world have used material from thousands of YouTube videos to train AI. Companies did so despite YouTube's rules against harvesting materials from the platform without permission. From a report: Our investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Silicon Valley heavyweights, including Anthropic, Nvidia, Apple, and Salesforce. The dataset, called YouTube Subtitles, contains video transcripts from educational and online learning channels like Khan Academy, MIT, and Harvard. The Wall Street Journal, NPR, and the BBC also had their videos used to train AI, as did The Late Show With Stephen Colbert, Last Week Tonight With John Oliver, and Jimmy Kimmel Live.

Proof News also found material from YouTube megastars, including MrBeast (289 million subscribers, two videos taken for training), Marques Brownlee (19 million subscribers, seven videos taken), Jacksepticeye (nearly 31 million subscribers, 377 videos taken), and PewDiePie (111 million subscribers, 337 videos taken). Some of the material used to train AI also promoted conspiracies such as the "flat-earth theory."
Further reading: YouTube Says OpenAI Training Sora With Its Videos Would Break Rules.
