The responsible development of Artificial Intelligence (AI) is an emerging challenge for lawmakers. In a recent case involving OpenAI, a federal court in San Francisco (the Court) is considering whether the use of copyrighted texts to train ChatGPT (and its successor language models) infringes copyright.
Who is OpenAI?
OpenAI is a research company that develops user-friendly AI. It was founded by a group of self-proclaimed passionate computer scientists with the goal of empowering people with revolutionary, easy-to-use technology.
You have probably heard of the remarkable AI chatbot “ChatGPT” by now. ChatGPT has been OpenAI’s most successful venture in this realm, and its functions go far beyond those of an ordinary chatbot. The GPT-3.5 version is free and available for public use, but this writer has refrained thus far in pursuit of an idyllic notion of authenticity.
The Allegations against OpenAI
The lawsuit claims that OpenAI used copyrighted books and text without permission to train its AI systems. There are currently two class actions on foot against OpenAI, brought by numerous writers and copyright holders, including comedian Sarah Silverman.
The Plaintiffs assert that ChatGPT has the capability to effectively summarise their books and other copyrighted materials without replicating them. In his submissions, the Plaintiffs’ co-attorney Matthew Butterick states:
“Creators’ work has been vacuumed up by these companies without consent, without credit, without compensation, and that’s not legal… All of these generative-AI systems rely on consuming massive quantities of human creative work, whether it’s text for these language models or whether it’s images for these AI image generators.”
In response, OpenAI has filed a motion to dismiss most claims, asserting that:
- its use of bulk text datasets (many of which contain copyrighted materials) to train ChatGPT falls under “fair use”;
- the responses generated by ChatGPT aren’t “derivative works”; and
- there are insufficient pleadings to support a number of the Plaintiffs’ claims of direct and vicarious copyright infringement.
Have we seen this before?
Google Books faced a similar claim for scanning millions of copyrighted books and displaying short excerpts of them in search results for users to read. After a decade-long legal battle, the United States Court of Appeals for the Second Circuit held in 2015 that Google Books’ practices were “fair use” and not a violation of copyright law, and in 2016 the United States Supreme Court declined to hear the authors’ appeal.
Conclusion
The outcome of this case has the potential to change the trajectory of how companies obtain the data and information necessary to ‘educate’ AI technology. If AI companies are prohibited from allowing their technology to consume copyrighted data without consent, the advancement of AI technology will be significantly stifled.
Is there a difference between humans and machines reading the available material to enhance the breadth of their knowledge? Since the dawn of time, our evolution has relied on humans acquiring information (copyrighted or not) and subsequently building on it. Are we at a new singularity event? A new dawn of time…?