Every AI company says it cares about your privacy. We wanted to see if that holds up when you read the documents.
ConductAtlas monitors the privacy policies and terms of service of 38 AI platforms daily. We track every clause, every change, every obligation. Below is what the documents say, organized around four questions: Does this company train on your data? How long do they keep it? Can you take them to court? And what happens if something goes wrong?
Data current as of April 29, 2026. If a policy changes tomorrow, we will detect it within 24 hours.
The summary
| Platform | Trains on your data? | How long they keep it | Forced arbitration? | Liability cap |
|---|---|---|---|---|
| ChatGPT (OpenAI) | Yes, by default | 30 days post-deletion | Yes + class action waiver | $100 or fees paid (whichever is greater) |
| Gemini (Google) | Yes | 18 months to 3 years | No | Capped at fees paid; disclaims AI errors |
| Character.AI | Yes, all conversations | Content survives deletion | Yes + class action waiver | $100 total, ever |
| Copilot (Microsoft) | Yes | Standard Microsoft policy | Yes + class action waiver | Capped at fees paid; disclaims AI errors |
| Midjourney | Perpetual license to everything | Minimal disclosure | Yes + jury waiver | $100 or fees paid; 1-year claim deadline |
| Claude (Anthropic) | Free tier only | 30-day backend purge | No | Fees paid in prior 12 months |
| Stability AI | Yes | Not specified | No | Fees paid in prior 12 months |
| Hugging Face | Yes | Not specified | No | Greater of $100 or fees paid |
| Cohere | Enterprise opt-out available | 30 days (SaaS) | No | Fees paid in prior 12 months |
| Mistral AI | Paid: no. Free: opt-out | Varies by data type | No | Fees paid in prior 12 months |
| DeepL, Cursor, Replit, Groq, and 20+ others | No (under standard terms) | Varies | No | Varies; see individual terms |
The first five rows are the platforms with the most aggressive terms across the four categories. See the full cross-platform comparison.
Three clauses worth reading yourself
Out of 845 provisions across 38 platforms, three stand out.
Character.AI caps its total liability at $100. That is not per incident. That is the ceiling. If the platform produces harmful content, if it affects your child, if your data is breached, the most Character.AI will ever owe you is $100. That applies regardless of the severity of harm, on a platform popular with teenagers. Read the actual clause.
Google Gemini keeps your conversations for up to 3 years. The default retention period is 18 months. Some data stays for 3 years. Deleting a conversation from your Gemini history does not remove it from Google's systems right away. A health question you asked in January 2026 could still be on Google's servers in 2029. Read the actual clause.
OpenAI trains on your conversations unless you find the setting. The default is on. To turn it off: open ChatGPT, tap your profile icon, go to Settings, tap Data Controls, toggle off "Improve the model for everyone." Most people never find this. Everything typed before you toggle it off has already been used. Read the actual clause.
Which AI companies train on your conversations?
At least 8 of the 38 platforms we track grant themselves the right to train on what you type.
ChatGPT. Your conversations feed future model training by default. OpenAI does not surface this during signup. They also reserve the right to have employees read your conversations for safety and quality. A person at OpenAI could see what you wrote. See the provision.
Gemini. Same basic approach, but Google holds onto it longer. Your conversations train their AI and human reviewers can access them. The 18-month default retention is the longest of any major AI platform we track.
Claude. Anthropic draws a line most others do not. If you pay for the API, your data stays out of training entirely. Free tier data may be used. There is a detail most people miss: clicking thumbs up or thumbs down on a response flags that entire conversation for storage. The feedback is not anonymous and the conversation is kept. See the provision.
Character.AI. Everything trains the model. The license is perpetual and irrevocable. There is no opt-out. See the provision.
Midjourney. Every prompt and every generated image is licensed to Midjourney forever, royalty-free, for any purpose. See the provision.
Most enterprise-focused platforms do not train on your data under standard terms. Cohere, Mistral AI, DeepL, Cursor, Replit, Databricks, and Groq either default to no training or restrict controls to enterprise contracts.
How long do AI companies keep your data?
Google Gemini is the outlier. The default is 18 months, and some categories go to 3 years. A question you asked Gemini about a health concern in January 2026 could still be on Google's servers in July 2028. See the provision.
OpenAI and Anthropic both commit to 30-day deletion windows. But OpenAI has a catch: data already built into a trained model stays in the model. Your words shaped how the AI responds. Deleting your account does not undo that influence.
Character.AI has its own version of this problem. Delete your account, and the AI characters you created can still live on the platform. Other people keep talking to a personality you built, even after you are gone. See the provision.
Midjourney barely addresses retention at all. If you are evaluating this platform for business use, the absence of a clear retention schedule is the kind of gap that should show up in a vendor assessment.
Can you sue an AI company if something goes wrong?
Four of the 38 force arbitration: OpenAI, Character.AI, Microsoft Copilot, and Midjourney.
Arbitration means a private proceeding instead of court. No jury. No public record. All four also include class action waivers. If the same problem affects a thousand people, each one has to fight individually.
ChatGPT generates false claims about you and they spread? You cannot go to court. Character.AI produces harmful content directed at your child? You cannot join with other parents. Midjourney uses your creative work without permission? Individual arbitration is the only path. Compare arbitration clauses across all platforms.
The other 34 platforms, including Anthropic, Google Gemini, Cohere, and Mistral AI, do not include mandatory arbitration in their consumer terms.
How hard is it to opt out of training?
One toggle: Anthropic has a clear setting. OpenAI has one too, buried under Settings > Data Controls > "Improve the model for everyone."
No opt-out: Character.AI and Midjourney. Content is licensed upon submission. No setting changes that.
Enterprise contracts only: Cohere, Databricks, and Scale AI limit training controls to paid enterprise agreements with separate DPAs. Free and standard plans get the defaults.
Off by default: Mistral AI does not train on paid customer data. Training is opt-in. This is still the exception in the industry, not the rule.
What does the law say about this?
GDPR Articles 13 and 22. Any AI platform processing data of people in the EU must disclose what it does with that data and whether automated decision-making is involved. Training on conversations is processing. Several platforms in this list may not fully meet these transparency requirements. See GDPR coverage.
CCPA and CPRA. If you are in California, you have the right to know what is collected, to delete it, and to opt out of its sale or sharing. Using conversations to train models likely qualifies as a "business purpose" requiring disclosure. It may also qualify as a "sale" requiring opt-out rights. See CCPA/CPRA coverage.
EU AI Act. Enforcement began in phases in 2025. Providers of general-purpose AI must disclose training data sources. Platforms training on conversations without clear disclosure face growing compliance exposure as enforcement scales. See EU AI Act coverage.
If you use AI tools personally
Turn off training in ChatGPT now. Open ChatGPT. Tap your profile icon. Settings. Data Controls. Toggle off "Improve the model for everyone." Takes 30 seconds. Everything typed before the toggle was already used, but you stop it going forward.
Check Claude's settings if you use the free tier. Go to claude.ai, open settings, review your privacy preferences. If you pay for the API, training is already off.
If you need to ask about something personal, know which platforms protect you most. Not all AI chatbots treat your data the same way. Anthropic's paid tier does not train on your inputs. Mistral does not train on paid customer data. OpenAI and Google train by default. If you are going to discuss health, money, or legal questions with an AI, choose the one that is not feeding those conversations into its next model.
Check your arbitration opt-out window. OpenAI, Character.AI, and Midjourney all have opt-out periods. They are typically 30 to 60 days from account creation. If you are still in the window, send your opt-out in writing. Once the window closes, you are locked in.
Request your data. Under GDPR, email the company's privacy contact and ask for a copy of everything they hold on you. Under CCPA, look for the "Do Not Sell My Personal Information" link in their privacy policy footer. What you get back is often more revealing than the policy itself.
If your organization uses these as vendors
Review whether your DPA covers AI training. If your team uses ChatGPT, Gemini, or Copilot and the data processing agreement does not explicitly exclude training on your inputs, it may be happening. Enterprise agreements sometimes include this carve-out, but only if someone negotiated for it.
Add these to your vendor due diligence questionnaire:
- Does the platform train on our inputs and outputs? Under which conditions?
- Can human reviewers at the vendor access our data?
- What is the data retention period after deletion or account closure?
- Does the agreement include mandatory arbitration or a class action waiver?
- What is the dollar liability cap? Does it cover AI output errors specifically?
Flag Character.AI and Midjourney for internal review. A $100 liability cap and a perpetual irrevocable content license are provisions that should trigger a risk assessment before either platform is approved for business use. If employees use these on company devices or with company data, that is a vendor risk issue worth raising.
Compare enterprise terms before procurement. Anthropic, Cohere, and Mistral AI currently offer the clearest contractual separation between consumer and enterprise data handling. This is based on their published terms as of April 2026. Terms change. Verify current terms before relying on this for procurement decisions. Browse all 204 monitored platforms.
How we verified this
ConductAtlas archives the full text of privacy policies and terms of service for 204 platforms daily, including 38 AI companies. Every version is cryptographically hashed, timestamped, and assigned a stable record identifier. When a document changes, we generate a structured diff, extract the affected clauses, and produce analysis with specific regulatory citations.
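The archiving step described above (hash, timestamp, stable identifier, structured diff) can be sketched in a few lines. This is an illustrative reconstruction, not ConductAtlas's actual pipeline; the function names and record fields are assumptions.

```python
# Sketch of a policy-archiving step: hash a captured document,
# timestamp it, derive a stable record identifier, and produce a
# line-level diff when the document changes. Field names are
# illustrative, not ConductAtlas's real schema.
import difflib
import hashlib
from datetime import datetime, timezone

def archive_snapshot(platform: str, doc_type: str, text: str) -> dict:
    """Hash and timestamp one captured policy document."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return {
        # Stable record identifier: platform + document type + content hash.
        "record_id": f"{platform}:{doc_type}:{digest[:12]}",
        "sha256": digest,
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }

def diff_snapshots(old_text: str, new_text: str) -> list:
    """Keep only the added/removed lines between two archived versions."""
    return [
        line
        for line in difflib.unified_diff(
            old_text.splitlines(), new_text.splitlines(), lineterm=""
        )
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]

# Hypothetical before/after clause text to exercise the two helpers.
old = "We retain data for 30 days.\nArbitration is required."
new = "We retain data for 90 days.\nArbitration is required."
record = archive_snapshot("openai", "privacy-policy", new)
changes = diff_snapshots(old, new)
# changes -> ['-We retain data for 30 days.', '+We retain data for 90 days.']
```

A changed retention clause surfaces as one removed and one added line, which is the raw material for the clause extraction and regulatory analysis described above.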
This post is based on 845 active provisions across 55 AI platform documents. Every claim links to the provision page with the original clause language and full analysis.