Mistral AI trains its models on third-party datasets and publicly available internet data, which may contain your personal information even after filtering.
Personal data about you that exists publicly online — such as on social media, news articles, or public records — may have been used to train Mistral AI's models, and there is no guarantee it was successfully filtered out despite stated efforts.
Your personal data may have been included in AI training datasets scraped from the internet without your knowledge, and the promise of 'good practices to filter' does not guarantee removal.
(1) REGULATORY FRAMEWORK: This provision implicates GDPR Art. 14 (transparency obligations for data collected from third parties), Art. 6 (lawful basis for training data processing), and Art. 17 (right to erasure from training datasets — a practically complex right). The EU AI Act (Regulation 2024/1689) Art. 53 imposes specific transparency and documentation obligations on general-purpose AI model providers regarding training data, including copyright and personal data governance documentation. The CNIL's 2024 framework on AI and personal data is directly applicable.
(2)