Artificial Intelligence Poses Threat to Business Data Privacy and Confidential Information
By: Jake Gray
Over the last year, an abundance of headlines detailing innovations in artificial intelligence (AI) technology has hit the news cycle. Beyond mere technical advancements, many reports have discussed AI’s potential to revolutionize innumerable industries and the workplace, whether for better or worse.
The White House is accordingly examining AI’s role in the workplace, recently announcing that its Office of Science and Technology Policy will release a Request for Information (“RFI”) to learn more about automated tools used by employers to “surveil, monitor, evaluate, and manage workers.” The responses will be used to develop policy, standards, and best practices surrounding the use of AI in the workplace. [1]
While a variety of companies have announced plans to integrate generative AI tools into their workflows, many employees may have already capitalized on the technology – without company knowledge or support – for a potential productivity boost. In such cases, the risk to sensitive business information and overall data privacy could be immense: AI tools rely on user input not only to train their models but also as a source of information to share with other end users, potentially exposing confidential data to the public. Without robust security measures and protocols for using AI tools, employees may expose both individuals and organizations to risks such as loss of competitive advantage, reputational damage, and even legal consequences.
Illustrating these risks, a recent study by cybersecurity company Cyberhaven found that 11% of the data employees paste into OpenAI’s ChatGPT is confidential, and that at least 4% of employees have pasted confidential data into ChatGPT. [2] Additional data points show that the average company leaks sensitive data to ChatGPT hundreds of times each week.
As a result, companies have begun restricting employees’ use of AI in the workplace or for work-related matters in an effort to combat internal information leaks. Joining companies like Amazon, Bank of America, and Wells Fargo, Samsung recently imposed restrictions on its employees’ use of OpenAI’s ChatGPT. [3] The change in policy stemmed from an accidental leak of sensitive internal source code. Samsung communicated the rule to staff in a memo describing the restriction as temporary while the company works to “create a secure environment” for the responsible use of AI tools.
Basic Anatomy of AI Data Leaks
Generative AI tools, such as OpenAI’s ChatGPT, rely on extensive datasets to hone their algorithms, frequently processing vast quantities of user-generated content to develop the capacity to produce human-like responses. As users – in this case, workers – interact with these AI systems for tasks that may require inputting sensitive or proprietary information, there is an inherent risk that the AI model might inadvertently integrate or refer to this confidential data in subsequent interactions with different users. The risk of a data leak is especially acute when workers unwittingly provide the AI with access to sensitive data, which could then be absorbed into the AI’s training set or used inappropriately in future exchanges. Furthermore, as more users engage with the AI tool, the likelihood of such data leaks increases, potentially creating a chain reaction of information breaches.
For example, consider a hypothetical employee who copies and pastes key points from the organization’s internal strategy documents into ChatGPT, requesting that it reformat the content as a PowerPoint slide deck. If an external party later inquires about the company’s strategic priorities for the current year, ChatGPT could respond based on the information supplied by the employee, especially if that information was not yet publicly available, as ChatGPT would have no other substantial references. [4]
However, these risks can be mitigated if the data is properly handled and secured by the user before it is submitted to the AI model. Responsible employees, for instance, can anonymize data whenever possible by modifying or transforming it to remove or obscure personally identifiable information (PII) and other sensitive details, ensuring that individuals or entities cannot be identified from the data.
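As a rough illustration of what that anonymization step might look like in practice, the sketch below uses simple regular expressions to replace a few common PII formats with placeholder tokens before text is sent to an external AI tool. The patterns and the redact_before_prompt helper are illustrative assumptions only; a real deployment would rely on a vetted PII-detection library tuned to the organization’s own data.

```python
import re

# Illustrative regex patterns for a few common PII formats. These are
# simplified examples, not a complete or production-grade detection set.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
}


def redact_before_prompt(text: str) -> str:
    """Replace matched PII with placeholder tokens before the text leaves the organization."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


if __name__ == "__main__":
    draft = "Email jane.doe@example.com or call 555-867-5309 regarding SSN 123-45-6789."
    print(redact_before_prompt(draft))
    # Email [EMAIL REDACTED] or call [PHONE REDACTED] regarding SSN [US_SSN REDACTED].
```

The design choice here is to redact before the text is ever pasted, so nothing sensitive reaches the AI provider even if the prompt is later retained or used for training.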
Company Solutions
Beyond barring the use of generative AI tools entirely, companies presently have few easily accessible solutions to protect confidential data from being leaked by employees through AI. This issue is illustrated by the manner in which information is typically input into AI models like ChatGPT. ChatGPT processes only text input – that is, content pasted from users’ clipboards (the temporary storage where copied content is held before being pasted) or typed at the keyboard. This means that common protective measures applied to email communications – such as preventing documents flagged as confidential from being sent outside the organization – are inapplicable and insufficient to protect company data in the context of ChatGPT.
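Because any control would therefore have to operate on the pasted text itself rather than on a labeled document, one conceivable approach – sketched below purely as an assumption, not a feature of ChatGPT or any existing data loss prevention product – is a prompt-level check that blocks text containing common confidentiality markers before it is submitted. The CONFIDENTIAL_MARKERS list and check_prompt helper are hypothetical.

```python
# Hypothetical markers that an organization might stamp on sensitive documents.
CONFIDENTIAL_MARKERS = ("CONFIDENTIAL", "INTERNAL USE ONLY", "ATTORNEY-CLIENT PRIVILEGED")


def check_prompt(prompt: str) -> bool:
    """Return True if the pasted text appears safe to send to an external AI tool."""
    upper = prompt.upper()
    return not any(marker in upper for marker in CONFIDENTIAL_MARKERS)


if __name__ == "__main__":
    pasted = "INTERNAL USE ONLY: FY2024 pricing strategy summary..."
    if not check_prompt(pasted):
        print("Blocked: the text contains a confidentiality marker and should not be pasted.")
```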
Further, although companies can train their employees to be aware of the consequences of leaking confidential information to AI systems, the potential repercussions may feel too remote for workers to forgo the incentive of increased productivity – especially if there is no risk of discipline. Additionally, depending on the particulars of the data leaked and the other security measures in place, companies may have a difficult time tracking down the offending employee, especially in a timely manner.
As AI becomes more integrated into the workplace, current monitoring systems will become increasingly ineffective. When company policy prohibits AI use, a monitoring system can easily catch an offender simply by flagging visits to an AI tool’s URL. But when entire workflows commonly depend on AI, distinguishing between proper and improper uses of data in AI prompting becomes increasingly difficult, given that not all data is confidential or improperly secured by the user.
This is not to say, however, that such efforts are worthless. Regularly reviewing logs and reports from monitoring systems that audit the use of AI in the workplace can significantly bolster data-handling policies by flagging potential offenses and reminding employees of the consequences of violating company policy. To this end, data security policies should also be strictly enforced and updated as the technology advances or workflows change.
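In its simplest form, the kind of log review described above might look something like the sketch below, which counts each user’s visits to a handful of known generative AI domains in an access log. The CSV layout, the AI_DOMAINS list, and the flag_ai_usage helper are assumptions made for illustration, not any particular proxy vendor’s schema.

```python
import csv
from collections import Counter

# Hypothetical list of generative AI domains to watch for; it would need to be
# maintained as new tools appear.
AI_DOMAINS = ("chat.openai.com", "chatgpt.com", "bard.google.com", "claude.ai")


def flag_ai_usage(log_path: str) -> Counter:
    """Count visits to known AI services per user in a simple access log.

    Assumes a CSV with 'user' and 'url' columns -- an illustrative layout only.
    """
    hits = Counter()
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            if any(domain in row["url"] for domain in AI_DOMAINS):
                hits[row["user"]] += 1
    return hits


if __name__ == "__main__":
    for user, count in flag_ai_usage("proxy_access.csv").most_common():
        print(f"{user}: {count} visits to generative AI services")
```

A periodic report like this does not by itself distinguish proper from improper prompts, but it identifies where follow-up review and employee reminders are most warranted.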
Even if companies cannot eliminate the risk of information leaks entirely, they should establish robust security measures and protocols for employees’ use of AI tools, as well as provide training and guidance to employees on the responsible and appropriate use of such tools. In so doing, companies may better safeguard confidential and proprietary information while capitalizing on the productivity benefits offered by generative AI tools.
[1] https://www.whitehouse.gov/wp-content/uploads/2023/05/050123_OSTP_RFI_PREPUBLISH_.pdf
[2] https://www.cyberhaven.com/blog/4-2-of-workers-have-pasted-company-data-into-chatgpt/
[4] https://www.cyberhaven.com/blog/4-2-of-workers-have-pasted-company-data-into-chatgpt/