As we’ve previously written, the rise of generative AI has led to a spate of copyright suits across the country. One major target of these suits has been OpenAI. Actor/comedian Sarah Silverman and author Paul Tremblay are among the plaintiffs who have brought suit in California, while authors George R.R. Martin, John Grisham, and others have filed in New York. The lawsuits allege that OpenAI used the plaintiffs’ creative content without permission to train its generative AI tool in violation of the U.S. Copyright Act. OpenAI moved to dismiss the majority of claims in the Silverman and Tremblay cases on several grounds: (1) the Copyright Act does not protect ideas, facts, or language; (2) the plaintiffs cannot show that outputs from OpenAI’s large language model (“LLM”) tool are substantially similar to the original content used to train the tool; and (3) any use of copyright-protected content by OpenAI’s tool constitutes fair use and is thus immune from liability under the Act. Yesterday, the plaintiffs hit back, noting that OpenAI has not moved to dismiss the “core claim” in the lawsuits: direct infringement.
If there is anything movies like The Terminator have shown us, it’s that AI systems might one day become self-aware and wreak havoc. But until Skynet gains self-awareness, let’s enjoy the AI toys that are quickly becoming part of our daily lives. Some Samsung employees recently discovered that playing with AI models like ChatGPT may have unexpected consequences. These employees used ChatGPT for work and, in doing so, shared sensitive data such as source code and meeting minutes. The incident was labeled a “data leak” out of fear that ChatGPT would disclose the data to the public once it is trained on that data. In response, many companies took action, such as banning or restricting access to ChatGPT or creating ChatGPT data disclosure policies.
First, let’s talk about ChatGPT’s training habits. Although ChatGPT does not currently train on user data (its last training session was in 2021), its data policy for non-API access states that OpenAI may use submitted data to improve its AI models. Users are warned against sharing sensitive information, as specific prompts cannot be deleted. The data policy for API access is different: customer data is not used for training or tuning the model, but it is retained for up to 30 days for abuse and misuse monitoring. API access refers to access via ChatGPT’s API, which developers can integrate into their applications, websites, or services; non-API access refers to accessing ChatGPT through the website. For simplicity, let’s focus on non-API access. We’ll also assume ChatGPT has not yet been trained on user data. But, like Sarah Connor warning us about Judgment Day, we know it’s coming. Our analysis will focus mainly on ChatGPT; as noted below, the analysis may change depending on a given chatbot’s usage policy.
This situation brings to mind the classic philosophical question: If a tree falls in a forest and no one is around to hear it, does it make a sound? In our AI-driven world, we might rephrase it: If we share our secrets with an AI language model like ChatGPT, but the information goes unused, does that count as a trade secret disclosure or a public disclosure of an invention?