Samsung Employees Upload Sensitive Code to ChatGPT: Confidential Data Exits the Company

Within weeks of Samsung lifting its ban on ChatGPT, engineers uploaded proprietary semiconductor source code, internal meeting notes, and performance data to ChatGPT — sending confidential data to OpenAI's servers.

Samsung Semiconductor · 2023 · 2 min read

Background

ChatGPT's explosive growth in early 2023 prompted enterprises to grapple with whether to allow employees to use it. Samsung Semiconductor lifted its internal ban in March 2023. Within weeks, multiple incidents of employees inputting sensitive data were reported.

The Attack

At least three Samsung employees uploaded sensitive data to ChatGPT: one pasted source code from Samsung's semiconductor equipment database while asking ChatGPT to fix a bug; a second uploaded internal meeting notes about Samsung's battery performance; a third submitted code and asked ChatGPT to optimise it. At the time, ChatGPT used conversation data to improve its models unless users opted out, meaning Samsung's proprietary data was sent to OpenAI's servers and potentially incorporated into future model training. Samsung had not yet implemented enterprise ChatGPT controls when these incidents occurred.

Response

After discovering the leaks, Samsung banned ChatGPT company-wide again, notified the affected teams, and began developing internal AI tools. OpenAI offers enterprise plans with data opt-out provisions, but Samsung had not been using them. The incidents led Samsung to establish strict policies governing AI tool use.

Outcome

The Samsung incidents became the canonical example of AI tool data leakage and triggered a global wave of enterprise AI policy development. They demonstrated that employees adopt new tools before policies are in place, and that the boundary between "asking for help" and "sending confidential data to a third party" was unclear to many employees.

Key Takeaways

  1. AI tools that process user inputs must be evaluated as data processors — all data sent is shared with the vendor
  2. Enterprise AI tool policies must be established before lifting bans — not after incidents
  3. Technical controls (DLP blocking of AI site inputs) can prevent data leakage even when policy alone fails
  4. Employees need training on what constitutes confidential data before using any AI assistant
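The DLP control mentioned in takeaway 3 can be sketched as a minimal outbound filter that inspects traffic bound for known AI services and blocks payloads matching confidential-data heuristics. The domain list, markers, and source-code patterns below are illustrative assumptions, not Samsung's actual configuration; real DLP products use far richer rule sets and classifiers.

```python
import re

# Hypothetical list of generative-AI endpoints to police (illustrative only).
AI_DOMAINS = {"chat.openai.com", "chatgpt.com"}

# Crude heuristics for confidential content: classification markers plus
# a rough signal that the payload contains source code.
CONFIDENTIAL_PATTERNS = [
    re.compile(r"\b(CONFIDENTIAL|INTERNAL USE ONLY|TRADE SECRET)\b", re.I),
    re.compile(r"\b(def |class |#include|import )"),  # source-code signal
]

def should_block(destination_host: str, payload: str) -> bool:
    """Return True if an outbound request to a monitored AI service
    carries content matching a confidential-data heuristic."""
    if destination_host not in AI_DOMAINS:
        return False
    return any(p.search(payload) for p in CONFIDENTIAL_PATTERNS)
```

Such a filter would sit in a forward proxy or endpoint agent; policy and training remain necessary because heuristics like these miss paraphrased or unmarked confidential data.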
ChatGPT · AI data leak · source code · generative AI · employee negligence