Claims allegedly pirated content from Books3 dataset trawled by its models
The complaint was filed by three authors, Abdi Nazemian, Brian Keene, and Stewart O'Nan, who claim that books they wrote were among the material used to train the Megatron LLMs.
The lawsuit refers specifically to models that Nvidia released in September 2022, namely NeMo Megatron-GPT 1.3B, NeMo Megatron-GPT 5B, NeMo Megatron-GPT 20B, and NeMo Megatron-T5 3B. The models were published along with information about each one, including its training dataset. In this case, the information states that the models were trained on "The Pile" dataset prepared by EleutherAI.
According to the court filing, the Books3 dataset was available separately on Hugging Face until October 2023, when it was removed because it "is defunct and no longer accessible due to reported copyright infringement."