If there is anything movies like The Terminator have shown us, it’s that AI systems might one day become self-aware and wreak havoc. But until Skynet becomes self-aware, let’s enjoy the AI toy that is quickly becoming a part of our daily lives. Some Samsung employees recently discovered that playing with AI models like ChatGPT may have unexpected consequences. These employees used ChatGPT for work and shared sensitive data, such as source code and meeting minutes. The incident was labeled a “data leak” amid fears that ChatGPT would disclose the data to the public once the model is trained on it. In response, many companies took action, such as banning or restricting access to the tool or creating ChatGPT data disclosure policies.
First, let’s talk about ChatGPT’s training habits. Although ChatGPT does not currently train on user data (its training data cuts off in 2021), its data policy for non-API access states that submitted data may be used to improve its AI models. Users are warned against sharing sensitive information, as specific prompts cannot be deleted. The API data policy is different: customer data is not used for training or tuning the model, but it is retained for up to 30 days for abuse and misuse monitoring. API access refers to access through ChatGPT’s API, which developers can integrate into their own applications, websites, or services; non-API access refers to using ChatGPT through the website. For simplicity, let’s focus on non-API access. We’ll also assume ChatGPT has not yet been trained on user data – but, like Sarah Connor warning us about Judgment Day, we know it’s coming. Our analysis will mainly focus on ChatGPT, and it may change depending on a given chatbot’s usage policies.
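For readers less familiar with the distinction, here is a rough sketch, in Python, of what “API access” looks like from a developer’s side. This is not an actual network call; the function name, model name, and prompt text are our own illustrative choices, and the field names follow OpenAI’s chat completions request format:

```python
import json

# A minimal sketch (not an actual network call) of the request body a
# developer's application would send when using "API access". The field
# names follow OpenAI's chat completions format; the model name and the
# prompt text are illustrative assumptions.
def build_chat_request(user_prompt: str) -> str:
    payload = {
        "model": "gpt-3.5-turbo",
        "messages": [
            # Whatever is placed here leaves the company's systems verbatim.
            # Under the API data policy described above, it is not used for
            # training, but it may be retained for up to 30 days.
            {"role": "user", "content": user_prompt},
        ],
    }
    return json.dumps(payload)

request_body = build_chat_request("Summarize these meeting minutes: ...")
```

The point for our purposes: whether the prompt is typed into the website or wrapped in a request body like this, the sensitive text itself is transmitted to the provider; only the retention and training policies differ.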
This situation brings to mind the classic philosophical question: If a tree falls in a forest and no one’s around to hear it, does it make a sound? In our AI-driven world, we might rephrase it as: If we share our secrets with an AI language model like ChatGPT, but the information remains unused, does it count as trade secret disclosure or public disclosure of an invention?
Disclosure of Patentable Ideas
To figure out whether disclosing invention details to ChatGPT is a public disclosure under patent law, we need to determine if it can be categorized as a description in a printed publication, a public use, or a public sale. Spoiler alert: sharing invention details with ChatGPT does not count as public use or an offer for sale.
For US patent purposes, a “printed publication” is any communication that:
- Appears in a fixed and permanent form,
- Is considered available to the public (like a scientific journal article or a letter to a friend without confidentiality obligations), and
- Describes an invention in enough detail for someone skilled in the field to replicate or use it (i.e., enablement).
Although disclosing to ChatGPT might be in a fixed and permanent form, it is not explicitly public since the model has not been trained on the data yet. However, the data was shared without confidentiality obligations and might be used for future training, so it could be considered public.
Feeling a bit nervous because you submitted code snippets or asked ChatGPT to rewrite a patent application background section? Fear not! You might be safe as long as the enablement requirement is not satisfied. For patentability to be affected, the disclosure must be detailed enough for someone skilled in the field to replicate or use the invention. This prevents some abstracts, articles, and the like from being labeled as public disclosures. If, however, you’ve shared enough information to satisfy the enablement requirement, your disclosure to ChatGPT could be considered public, even if it has not been used to train the model. Even then, there is no need to pull a Terminator-style time travel stunt to fix your mistake: the U.S. provides a one-year grace period from the date of public disclosure, although many other countries do not.
Disclosure of Trade Secrets
The heart of a trade secret’s status is its secrecy. While absolute secrecy is not required, entrusting confidential information to an AI chatbot seems counterintuitive to maintaining secrecy. Whether such a disclosure destroys trade secret status depends on the circumstances, and courts have ruled in favor of both sides when dealing with online disclosures.
A recent case from a Florida district court provides some insight. The court in Hurry Family Revocable Tr. v. Frankel, No. 8:18-cv-2869-CEH-CPT (M.D. Fla. Jan. 3, 2023), found that alleged trade secrets did not lose their secrecy status even though they were posted on the court’s public, electronic docket. The notable portion of the court’s decision is its analysis of the disclosure issue.
The court stated that the intentional publication of material will destroy its trade secret status. The court noted, however, “[p]ublication on the Internet does not necessarily destroy the secret if the publication is sufficiently obscure or transient or otherwise limited so that it does not become generally known to the relevant people, i.e., potential competitors or other persons to whom the information would have some economic value.”
The court also highlighted another significant consideration in its analysis: whether there is evidence that the information, although publicly available, was actually viewed or otherwise disseminated by third parties. The court explained that, while technically public, (1) the alleged trade secrets were not easy for Plaintiffs’ competitors to access or locate; (2) members of the public could not access any documents on the docket without knowing the case number and location of the action; and (3) the alleged trade secrets were unlabeled and located within a docket entry containing 28 attachments.
This “needle in a haystack” defense evokes Schrödinger’s cat paradox, raising the question: Is a trade secret alive, dead, or in limbo once submitted to ChatGPT? Until the model is trained on the data, the information has technically been handed over but has not been viewed or disseminated by third parties. Therefore, one could argue that, under Frankel, data submitted to ChatGPT might still be considered a trade secret. Again, it is important to remember that there is no implied or express assurance of confidentiality for data provided to ChatGPT, because the terms of service explicitly state that the data may be used for future training. Additionally, the terms of service contain a unilateral confidentiality provision binding only the user, while ChatGPT has no obligation to keep any user input or content confidential (beyond its privacy obligations).
Moreover, once the model “learns” the data, it does not necessarily become more accessible or discoverable. Given the vast amount of training data, the public would need to ask ChatGPT a specific question to uncover the secret. The type of trade secret submitted to ChatGPT also matters, as certain secrets, such as a recipe for a popular soft drink, may be more accessible than others, such as a company’s customer list.
The key lesson from this case is that the “secrecy” of information may be determined by the surrounding circumstances and the nature of the disclosure, rather than just the fact that the information was posted online. While ChatGPT does not guarantee confidentiality for information submitted via non-API access, the information might arguably remain a secret from the public until the model is trained on it. Even then, unearthing the trade secret might be as challenging as finding a virtual diamond in the rough.
It is unwise to rely on the hope that your data (a trade secret or invention) will never be discovered when creating your disclosure policy. There are always risks, such as external hackers or software glitches, that could expose your information. For instance, there have been cases where users reported seeing other users’ chat histories. It is therefore important to exercise caution when sharing sensitive information with AI models like ChatGPT to avoid potential legal complications.
To ensure data remains confidential, it may be advisable for companies to implement data classification schemes to identify shareable information and review agreements, policies, and vendor contracts for safeguarding sensitive data.
Additionally, companies can evaluate physical and digital data security practices and implement protection measures, while respecting employees’ privacy rights. Consider limiting access to certain websites or applications, keeping in mind limitations may exist if employees can bypass restrictions using personal smartphones.
Lastly, it may help to cultivate a culture of confidentiality by incorporating education and training during onboarding and by regularly reminding employees of their obligations.