I've been contacted by a law student in Seattle. She's writing a law review note on the #OGL and following in my footsteps by discussing the rarely-discussed concept of #copyright misuse. She's citing my work and, I hope, adding more to the argument. I hope this is a trend. More information is better, but I'm tired of arguing about it. #DnD #RPG #TTRPG #WotC
RT @glynmoody
Join 28,000+ signers on the petition below to show your support for the @internetarchive, libraries' digital rights, and an open internet with safe, uncensored access to knowledge. - https://www.battleforlibraries.com/ "Big publishers are suing to cut off libraries' ownership and control of digital books, opening new paths for digital book bans and dangerous surveillance." #important #copyright
#Copyright #AI #GenerativeAI #GeneratedMusic #Music #Udio: "It’s not my intention in this article to highlight examples of regurgitation of copyrighted material, with a view to suggesting this regurgitation constitutes copyright infringement. The question that is much more important than regurgitation, in my view, is what models like Udio’s are trained on.
There are many people, myself included, who think that training generative AI models on copyrighted work without permission at all constitutes copyright infringement, whether material from the training set is regurgitated in the output or not.
When likenesses of copyrighted music show up in the outputs of AI music generation systems, there are generally three possibilities: it is chance (which is of course not impossible), the systems were trained on copyrighted music with licenses to do so, or they were trained on copyrighted music without licenses in place.
If the models used in Udio’s product are trained on copyrighted work, it is possible they have licenses in place with rightsholders that permit them to train on the copyrighted work whose likenesses are found in these examples.
I feel there has to be a way of training neural networks to recognise the influence of their training data on the output.
This would probably involve training a complementary indexing network plus a database that could then, through a kind of "reverse-training" lookup, resolve and report, at some predetermined accuracy, the #copyright-relevant sources for each generated #aiart output.
I need some help though. A proof of concept would show that the companies know it can be done; they just don't want to do it.
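One way the idea above could be prototyped, under heavy assumptions, is as a nearest-neighbour lookup over embeddings of the training set: embed each training work, embed the generated output with the same encoder, and report the training sources whose embeddings are most similar above some threshold. A minimal sketch with toy hand-written vectors standing in for real encoder embeddings (the index, names, and threshold are all hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def attribute(output_vec, index, threshold=0.8):
    """Return training sources whose embeddings resemble the output,
    sorted by similarity — a crude stand-in for the proposed
    'reverse-training' indexing network + database."""
    hits = [(src, cosine(output_vec, vec)) for src, vec in index.items()]
    return sorted(((s, round(c, 3)) for s, c in hits if c >= threshold),
                  key=lambda t: -t[1])

# Toy index: in practice these would be encoder embeddings of training works.
index = {
    "work_A": [0.9, 0.1, 0.0],
    "work_B": [0.0, 1.0, 0.0],
    "work_C": [0.7, 0.7, 0.1],
}
print(attribute([0.8, 0.2, 0.0], index, threshold=0.8))
# → [('work_A', 0.991), ('work_C', 0.853)]
```

This only demonstrates the lookup step; the hard open problem the post points at is getting embeddings that actually track "influence" of a training item on an output, rather than mere surface similarity.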
Apparently the #SciHub censorship in #Germany is "just" a #DNS block, so it's not very elaborate.
But that's not the point. The point is that people even DARE to do this kind of censorship in the first place. This is #evil and wrong.
This is the result of #copyright ideology taken to its extreme. But I don't think this is the end point. #Copyright maximalists want to push further and further toward a commercialized, expensive Internet reserved for the privileged few.
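For context on why a DNS-level block is "not very elaborate": it is just the ISP's resolver declining to answer for certain names, while the site and its records remain untouched for any other resolver. A toy simulation making that visible (hypothetical names and addresses, no real network lookups):

```python
# Names the ISP resolver refuses to answer.
BLOCKLIST = {"sci-hub.example"}

# Hypothetical zone data; a real resolver would query authoritative servers.
ZONE = {"sci-hub.example": "203.0.113.7", "openaccess.example": "198.51.100.4"}

def isp_resolve(name):
    """An ISP resolver implementing a DNS block: the record exists,
    the resolver just declines to return it."""
    if name in BLOCKLIST:
        return None  # NXDOMAIN-style refusal
    return ZONE.get(name)

def open_resolve(name):
    """Any resolver without the blocklist returns the same record."""
    return ZONE.get(name)

print(isp_resolve("sci-hub.example"))   # None — blocked at the resolver
print(open_resolve("sci-hub.example"))  # 203.0.113.7 — the site is still there
```

The asymmetry between the two functions is the whole "block": nothing about the target changes, only which resolver you ask.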
When a pile of coalitions go "this new bill is awesome" and not a single one is a science or tech group, be a tad suspicious. But I was telling AI folks back in 2022 that "ingredient lists" were coming.
@textfiles Wild! Effectively requires the creation of a comprehensive #copyright register, even for works the authors themselves have not bothered to register.
Last was an excellent talk by @jtlg on generative AI and copyright at the Allen Institute for AI. Favorite quote: "It's not at all obvious that the incentives to create of the sort that copyright offers are the appropriate system of law to govern this new [technology]. It may be that what replaces copyright due to generative AI is as different from copyright as copyright was from the patronage system that came before it." Highly recommend https://www.youtube.com/watch?v=toPhm4zBp00 (6/6) #GenerativeAI #copyright
I own the #copyright for the lectures I give to college students, but for some reason the #FBI has not yet contacted me with offers to prosecute anyone recording all or portions of my lectures without my written permission. I'm seriously considering putting an FBI warning at the beginning of every lecture. I think it might start good conversations about whose interests the cops choose to protect in American society.
#AI #GenerativeAI #AITraining #Music #USA #Copyright #IP: "Representative Adam Schiff (D-Calif.) introduced new legislation in the U.S. House of Representatives on Tuesday (April 9) which, if passed, would require AI companies to disclose which copyrighted works were used to train their models, or face a financial penalty. Called the Generative AI Copyright Disclosure Act, the new bill would apply to both new models and retroactively to previously released and used generative AI systems.
The bill requires that a full list of copyrighted works in an AI model’s training data set be filed with the Copyright Office no later than 30 days before the model becomes available to consumers. This would also be required when the training data set for an existing model is altered in a significant manner. Financial penalties for non-compliance would be determined on a case-by-case basis by the Copyright Office, based on factors like the company’s history of noncompliance and the company’s size." https://www.billboard.com/business/legal/federal-bill-ai-training-require-disclosure-songs-used-1235651089/
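The 30-day filing window described above reduces to a simple date comparison. A sketch with hypothetical dates (the bill's text, not this arithmetic, is what would define compliance):

```python
from datetime import date, timedelta

def filing_compliant(filing_date, release_date):
    """True if the training-data list was filed with the Copyright Office
    at least 30 days before the model became available to consumers."""
    return filing_date <= release_date - timedelta(days=30)

print(filing_compliant(date(2024, 5, 1), date(2024, 6, 15)))  # True: 45 days early
print(filing_compliant(date(2024, 6, 1), date(2024, 6, 15)))  # False: only 14 days
```

The same check would re-run whenever a model's training set is "altered in a significant manner," since the bill treats that as triggering a fresh filing obligation.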
#AI #GenerativeAI #AITraining #Copyright #FairUse #IP #DataCommons #Books: "This paper is a snapshot of an idea that is as underexplored as it is rooted in decades of existing work. The concept of mass digitization of books, including to support text and data mining, of which AI is a subset, is not new. But AI training is newly of the zeitgeist, and its transformative use makes questions about how we digitize, preserve, and make accessible knowledge and cultural heritage salient in a distinct way.
As such, efforts to build a books data commons need not start from scratch; there is much to glean from studying and engaging existing and previous efforts. Those learnings might inform substantive decisions about how to build a books data commons for AI training. For instance, looking at the design decisions of HathiTrust may inform how the technical infrastructure and data management practices for AI training might be designed, as well as how to address challenges to building a comprehensive, diverse, and useful corpus. In addition, learnings might inform the process by which we get to a books data commons — for example, illustrating ways to attend to the interests of those likely to be impacted by the dataset’s development." https://openfuture.pubpub.org/pub/towards-a-book-data-commons-for-ai-training/release/1