What is the severity of this clause?

ConductAtlas classifies this Training Data from Publicly Available Internet Sources clause as medium severity. Severity reflects the magnitude of rights affected, the breadth of users impacted, and the degree of discretion the platform retains.

Training Data from Publicly Available Internet Sources — Mistral AI

Recent governance activity: Mistral AI recorded 4 documented changes in the last 30 days.

This analysis describes what Mistral AI's agreement states, permits, or reserves. It does not constitute a legal determination about enforceability. Regulatory applicability and practical outcomes may vary by jurisdiction, enforcement context, and individual circumstances.

ConductAtlas Analysis

Why it matters (compliance & governance perspective)

The provision sets out the operational basis for Mistral AI's model training and identifies where its training data comes from: data publicly available on the Internet and datasets provided by third parties. It states that personal data is filtered at more than one level (by third parties and by Mistral AI) but does not guarantee that personal information is completely removed from training datasets.

Consumer impact (what this means for users)

Under this clause, individuals whose personal data appears in publicly available Internet sources or third-party datasets may have that data included in Mistral AI's model training, subject to filtering that the policy describes as "good practices" rather than as an absolute safeguard. The policy tells users that personal data may remain in training datasets despite this filtering.

How other platforms handle this

DeepL Medium

To improve the quality of our services, we analyse texts submitted for translation. We ensure that this analysis cannot be traced back to individual users by anonymising the data before analysis. DeepL Pro subscribers' texts are not used to train our machine translation systems.

Roblox Medium

We are simplifying our Terms of Use, including clarifications around the use of AI tools, and their data use. We have moved the terms that describe AI Features, which were previously written for a Creator audience and located under the AI-Based Tools Supplemental Terms and Disclaimer, into the User ...

DocuSign Medium

We may use aggregated, de-identified data derived from your use of our services, including document metadata and usage patterns, to develop, train, and improve our artificial intelligence and machine learning models and product features.

Monitoring

Mistral AI has changed this document before.

Original Clause Language

"Data publicly available on the Internet. Our artificial intelligence models are trained on data that is publicly available on the Internet by third parties, which may contain personal data, even if we use good practices to filter out such personal data. [...] Training Datasets. In some cases, we access datasets provided by third parties for our model training purposes. These datasets may include personal data (even if such third parties and Mistral AI use good practices to filter out such personal data), proprietary data, or public data."

— Excerpt from Mistral AI's Mistral AI Privacy Policy

Applicable regulations

EU AI Act (European Union)
California AB 2013 AI Training Data Transparency (US-CA)
Colorado AI Act (US-CO)
EU AI Act - High Risk Provisions
Trump Executive Order on AI Policy Framework

Provision details

Document information

Document: Mistral AI Privacy Policy
Entity: Mistral AI
Document last updated: May 5, 2026

Tracking information

First tracked: May 11, 2026
Last verified: May 11, 2026
Record ID: CA-P-007014
Document ID: CA-D-00443

Evidence Provenance

Source URL: https://legal.mistral.ai/terms/privacy-policy

Content hash (SHA-256): a3774c814d80737846c7ac8379ec7dcc1c55ee8e0300de40dccee951ff5d0230
Analysis generated: May 11, 2026 05:55 UTC
Methodology: summarize_document-v8

Evidence

✓ Snapshot stored ✓ Hash verified
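
The hash check in this record can be reproduced independently by anyone holding the archived snapshot. The following is a minimal sketch, assuming the snapshot is available as raw bytes. The function names are hypothetical, and how ConductAtlas normalizes a page before hashing is not stated in this record, so a mismatch against a fresh download of the source URL does not by itself show that the policy changed.

```python
import hashlib
import hmac

# Full digest as published in the Evidence Provenance record.
RECORDED_SHA256 = "a3774c814d80737846c7ac8379ec7dcc1c55ee8e0300de40dccee951ff5d0230"


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_hash(data: bytes, recorded_hex: str = RECORDED_SHA256) -> bool:
    """Compare the digest of `data` with a recorded hex digest."""
    # Normalize case and whitespace so copy-pasted digests compare cleanly.
    expected = recorded_hex.strip().lower()
    return hmac.compare_digest(sha256_hex(data), expected)


def matches_truncated_hash(full_hex: str, truncated: str) -> bool:
    """Check a shortened digest (as shown in a citation) against a full one."""
    # Drop a trailing ellipsis, dots, or spaces left by the truncation.
    prefix = truncated.strip().rstrip("…. ").lower()
    return len(prefix) > 0 and full_hex.lower().startswith(prefix)
```

To use it, read the stored snapshot in binary mode and pass the bytes to `matches_recorded_hash`. A truncated digest such as the one in the Citation Record can only confirm a prefix match, so the full digest is the one to compare for verification.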

Citation Record

Entity: Mistral AI
Document: Mistral AI Privacy Policy
Record ID: CA-P-007014
Captured: 2026-05-11 05:55:06 UTC
SHA-256: a3774c814d807378…
URL: https://conductatlas.com/platform/mistral-ai/mistral-ai-privacy-policy/training-data-from-publicly-available-internet-sources/
Accessed: June 27, 2026

Permanent archival reference. Stable identifier suitable for legal filings, compliance documentation, and research citation.

Classification

Severity: Medium

Other risks in this policy

Model Training Use of User Inputs and Outputs medium
Memory Feature and Sensitive Data Storage medium
Data Retention for Le Chat Inputs and Outputs medium
Third-Party Training Datasets medium
IP Address Processing and Location-Based Output Personalization low
Data Subject Rights and DPO Contact low

Frequently Asked Questions

Is ConductAtlas affiliated with Mistral AI?

No. ConductAtlas is an independent monitoring service. We are not affiliated with, endorsed by, or sponsored by Mistral AI.