ConductAtlas reviewed the published terms of service, privacy policies, and acceptable use policies of 10 major AI platforms to document how each describes the use of user data, inputs, outputs, or interactions for AI model training, improvement, and development. This comparison is based on archived provisions in the ConductAtlas governance archive. Documents were captured between May 2 and May 12, 2026.
Platforms reviewed: OpenAI, Anthropic, Google Gemini, GitHub Copilot, Midjourney, xAI (Grok), Perplexity, Cursor, Meta, and Hugging Face.
Documents reviewed included consumer terms of service, privacy policies, API terms, platform policies, acceptable use policies, and product-specific AI terms. In total, 34 documents were archived across the 10 platforms.
## Comparative Overview
Several reviewed platforms describe training-related provisions that apply by default unless users disable the relevant opt-out controls. The scope of covered data (conversations, code, prompts, generated outputs, or uploaded files) varies by platform and product tier.
Opt-out structures differ across platforms. Some provide account-level controls. Others limit controls to specific product tiers or authentication states. Several platforms describe exceptions where certain data may still be used for safety review or model improvement even when opt-out controls are enabled.
Content licensing provisions interact with training provisions but operate separately. Several platforms grant perpetual, worldwide, royalty-free licenses to user-submitted content, which may authorize uses beyond model training.
Enterprise and API access frequently operates under different terms than consumer products. OpenAI excludes API-submitted data from training by default. GitHub describes repository-level controls. These distinctions mean the same platform may apply different training provisions depending on the access method.
## Provision Comparison
| Platform | What terms authorize for training | Opt-out | Key conditions |
|---|---|---|---|
| OpenAI | Conversations, files, inputs used to improve services and train models | Yes, via account settings. API excluded by default | Training-related provisions apply by default for standard consumer usage unless users disable applicable controls |
| Anthropic | Inputs and outputs used to train models and improve services | Yes, via account settings | Safety review exception: flagged conversations may still be used even with opt-out enabled |
| Google Gemini | Conversations saved and used to improve AI models when Gemini Apps Activity is on. Human reviewers access a subset | Partial: Gemini Apps Activity toggle; disabling does not fully prevent certain data uses | The documentation instructs users not to submit confidential information |
| GitHub Copilot | Personal data including AI outputs used to train and improve AI/ML models | Repository-level controls | Data shared with Microsoft for AI development |
| Midjourney | Prompts, images, voice-derived inputs, uploaded content | Not identified in reviewed provisions | Perpetual, royalty-free, irrevocable license granted |
| xAI (Grok) | User content used to improve products and train models | Yes, for logged-in users only | Unauthenticated users have no documented control |
| Perplexity | Queries and interaction content used to train and develop AI models | Not identified in reviewed provisions | Queries and interaction content included in stated training provisions |
| Cursor | Terms state content will *not* be used for training unless the user explicitly agrees | Explicit opt-in required | Security review exception for flagged inputs |
| Meta | Non-exclusive, transferable, sublicensable, royalty-free, worldwide license for content shared on products | Developer restrictions separate from consumer terms | Consumer content license provisions. Third-party training restrictions for developers |
| Hugging Face | Public repositories receive perpetual, irrevocable license once published | None for public content; license non-revocable once public | Private content under standard platform license |
## Governance Control Matrix
| Platform | Consumer Training | API Excluded | Opt-Out Type | Safety Exception | Human Review |
|---|---|---|---|---|---|
| OpenAI | Yes | Yes | Account-level | Not described | Not described |
| Anthropic | Yes | Enterprise distinctions | Account-level | Yes | Not described |
| Google Gemini | Yes (when Activity on) | N/A | Activity toggle | Not described | Yes |
| GitHub | Yes | Repository-level | Repository settings | Not described | Not described |
| Midjourney | Yes | N/A | Not identified | Not described | Not described |
| xAI | Yes | Not described | Logged-in only | Not described | Not described |
| Perplexity | Yes | Not described | Not identified | Not described | Not described |
| Cursor | No (opt-in only) | N/A | Explicit opt-in | Security exception | Not described |
| Meta | Content license | N/A | N/A | N/A | Not described |
| Hugging Face | Public content only | N/A | Irrevocable once public | Not described | Not described |
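The control matrix above lends itself to structured comparison. The sketch below is a hypothetical in-memory encoding of the matrix (the field names, boolean encodings, and the subset of platforms shown are illustrative choices, not part of the archive schema); Meta and Hugging Face are omitted because their rows describe content licensing rather than a consumer training toggle.

```python
# Hypothetical encoding of the governance control matrix above.
# opt_out: None means "not identified in reviewed provisions".
# safety_exception covers both safety (Anthropic) and security (Cursor) review carve-outs.
MATRIX = {
    "OpenAI":        {"consumer_training": True,  "opt_out": "account",    "safety_exception": False},
    "Anthropic":     {"consumer_training": True,  "opt_out": "account",    "safety_exception": True},
    "Google Gemini": {"consumer_training": True,  "opt_out": "activity",   "safety_exception": False},
    "GitHub":        {"consumer_training": True,  "opt_out": "repository", "safety_exception": False},
    "Midjourney":    {"consumer_training": True,  "opt_out": None,         "safety_exception": False},
    "xAI":           {"consumer_training": True,  "opt_out": "login-only", "safety_exception": False},
    "Perplexity":    {"consumer_training": True,  "opt_out": None,         "safety_exception": False},
    "Cursor":        {"consumer_training": False, "opt_out": "opt-in",     "safety_exception": True},
}

# Platforms that describe consumer training but no identified opt-out control.
no_opt_out = [p for p, f in MATRIX.items()
              if f["consumer_training"] and f["opt_out"] is None]

# Platforms describing a safety/security review exception to opt-out.
with_exceptions = [p for p, f in MATRIX.items() if f["safety_exception"]]

print(no_opt_out)       # Midjourney, Perplexity
print(with_exceptions)  # Anthropic, Cursor
```

Queries like these make the structural patterns in the next section directly checkable against the table.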
## Observed Governance Patterns
Across the reviewed platforms, the following structural patterns appear in how training-related provisions are described:
Account-based opt-out controls. OpenAI, Anthropic, and xAI describe account-level settings that allow users to disable training-related data use, subject to stated exceptions. [CA-P-8f958ce7, CA-P-d50b2c5c]
API and enterprise carve-outs. OpenAI and Cursor describe separate treatment for API-submitted data. GitHub describes repository-level distinctions between public and private content.
Safety and security review exceptions. Anthropic and Cursor describe provisions where opted-out data may still be used when flagged for safety or security review. [CA-P-216f1f6a, CA-P-18a0658c]
Authentication-dependent controls. xAI limits training opt-out to logged-in users. Unauthenticated interactions operate under different terms. [CA-P-24c2bbb0]
Human review disclosures. Google Gemini describes human reviewer access to a subset of conversations. Other reviewed platforms do not include comparable disclosures. [CA-P-138b06f4]
Perpetual content licensing. Midjourney, xAI, Meta, and Hugging Face describe perpetual, irrevocable, or royalty-free content licenses that operate independently of training-specific provisions.
## Platform-Level Notes
OpenAI — The published privacy policy states that content provided by users may be used to improve services, including training the models that power ChatGPT. Opt-out controls are available through account settings. API-submitted data is excluded from training by default under separate terms. [OpenAI Privacy Policy, captured May 2026]
Anthropic — The privacy policy states that inputs and outputs may be used to train models and improve services unless users opt out through account settings. A stated exception provides that conversations flagged for safety review may still be used for model improvement regardless of opt-out status. Anthropic also discloses training on third-party data sources including publicly available information and licensed datasets. [Anthropic Privacy Policy, captured May 2026]
Google Gemini — The privacy notice states that conversations are saved and used to improve Google's AI models when Gemini Apps Activity is enabled. A subset of conversations is reviewed by human annotators. The documentation instructs users not to submit confidential information. The documentation states that disabling Gemini Apps Activity does not fully prevent certain data uses for improvement purposes. [Gemini Apps Privacy Notice, captured May 2026]
GitHub Copilot — The privacy statement authorizes use of personal data, including AI-generated outputs, to train and improve AI and machine learning models. Data may be shared with affiliates including Microsoft for AI development purposes. Separate product-level terms govern Copilot-specific data handling. [GitHub Privacy Statement, captured May 2026]
Midjourney — The privacy policy states that prompts, images, voice-derived inputs, and uploaded content are collected and may be used for AI training. The terms of service grant Midjourney a perpetual, worldwide, non-exclusive, sublicensable, royalty-free, irrevocable copyright license to user content. [Midjourney Privacy Policy, captured May 2026]
xAI (Grok) — The published terms state that logged-in users can choose whether their content is used for training. This control is available only to authenticated users. The terms grant xAI an irrevocable, perpetual, transferable, sublicensable, royalty-free, worldwide right to user content. [xAI Terms of Service, captured May 2026]
Perplexity — The privacy policy states that queries submitted and content interacted with may be used to train, improve, and develop AI models and services. The reviewed provisions do not describe a training-specific opt-out control. [Perplexity AI Privacy Policy, captured May 2026]
Cursor — The published terms state that Anysphere will not use content to train, or allow any third party to train, any AI models unless the user has explicitly agreed. The privacy policy describes an exception for inputs flagged for security review. [Cursor Terms of Service, captured May 2026]
Meta — The terms of service grant a non-exclusive, transferable, sub-licensable, royalty-free, worldwide license for content shared on Meta products. The platform policy separately restricts third-party developers from using platform data for AI model training without authorization. [Meta Terms of Service, captured May 2026]
Hugging Face — Public repositories receive a perpetual, irrevocable, worldwide, royalty-free, non-exclusive license once published. The terms describe this license as non-revocable once content is made public. Private content is subject to a standard platform license for service operation. [Hugging Face Terms of Service, captured May 2026]
## Scope and Limitations
This review documents what each platform's published terms state regarding AI training data use. It does not assess:
- Actual operational data handling practices
- Internal model architecture or training pipelines
- Regulatory compliance status
- Enforceability of specific provisions
- Unpublished enterprise or custom agreements
- Regional implementation differences
Provisions may vary by product, region, account type, or enterprise agreement. Terms are subject to change.
## Methodology
All provisions referenced in this analysis are archived in the ConductAtlas governance archive with stable record identifiers, capture timestamps, and SHA-256 content hashes. ConductAtlas provides governance documentation and operational comparison. It does not provide legal advice or make determinations about compliance.
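An archive record of the kind described here pairs a stable identifier and capture timestamp with a content hash of the captured text. The sketch below is a minimal illustration, assuming a hypothetical record shape (the `make_archive_record` function, its field names, and the sample text are not part of the ConductAtlas schema); only the use of SHA-256 content hashes, stable record identifiers, and capture timestamps comes from the methodology above.

```python
import hashlib
import json
from datetime import datetime, timezone

def make_archive_record(record_id: str, platform: str, document: str) -> dict:
    """Build a minimal archive record: a stable identifier, a UTC capture
    timestamp, and a SHA-256 hash of the captured document text."""
    return {
        "record_id": record_id,  # e.g. "CA-P-8f958ce7", as cited in this report
        "platform": platform,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(document.encode("utf-8")).hexdigest(),
    }

record = make_archive_record(
    "CA-P-8f958ce7", "OpenAI",
    "Content provided by users may be used to improve services ...",
)
print(json.dumps(record, indent=2))
```

The hash lets a later reader verify that an archived provision has not changed since capture: recomputing SHA-256 over the stored text must reproduce the recorded digest.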
Documents archived: 34
Platforms reviewed: 10
Capture period: May 2–12, 2026
Archive references: ConductAtlas governance archive
Capture method: Automated scheduled archival capture with structured provision extraction