Character.AI collects publicly available information from the internet to train its AI models, in addition to data collected directly from users.
This analysis describes what Character.AI's agreement states, permits, or reserves. It does not constitute a legal determination about enforceability. Regulatory applicability and practical outcomes may vary by jurisdiction, enforcement context, and individual circumstances.
The use of publicly available internet data for commercial AI model training has become a subject of regulatory and legal scrutiny, including questions about intellectual property rights and whether publicly available data retains privacy protections under applicable law.
Interpretive note: The policy does not specify what types of publicly available data are collected or from which sources, creating uncertainty about the scope of this collection practice and the applicable compliance obligations.
Information about you that is publicly available online may be collected and used by Character.AI for AI model training purposes, beyond what you directly provide to the platform.
How other platforms handle this
Data publicly available on the Internet. Our artificial intelligence models are trained on data that is publicly available on the Internet by third parties, which may contain personal data, even if we use good practices to filter out such personal data. [...] Training Datasets. In some cases, we acc...
Writer does not use Customer Data to train its AI models without explicit customer permission. Customer Data means the data, content, and information that customers and their end users submit to or through the Services.
We may use the content you provide to us, including prompts and generated images, to train and improve our AI models and services.
Monitoring
Character.AI has changed this document before.
"We also collect information that is available on the Internet or from other publicly available sources to evaluate and improve our Services, including for model training and development."
Excerpt from Character.AI's Character.ai Privacy Policy
REGULATORY LANDSCAPE: The collection of publicly available data for AI model training engages GDPR Article 6 lawful basis requirements and Article 14 transparency obligations for data not collected directly from data subjects, as well as emerging EU AI Act training data governance provisions. In the US, this practice interacts with FTC guidance on commercial data practices and state privacy law definitions of personal information. The European Data Protection Board has issued guidance relevant to whether publicly available data retains personal data status under GDPR.
GOVERNANCE EXPOSURE: Medium. Scraping publicly available data for AI model training is a widespread industry practice but has attracted regulatory scrutiny in the EU regarding GDPR Article 14 notification obligations and in the UK from the ICO. The policy's brief disclosure does not specify what types of publicly available data are collected or from which sources, limiting the ability to assess compliance exposure without additional information.
JURISDICTION FLAGS: EU and UK users whose information appears in publicly available sources may have Article 14 notification rights under GDPR that require the data controller to provide transparency disclosures within a reasonable time. California users may have CCPA rights over personal information collected from public sources depending on how the data is categorized. The breadth of the disclosure, referencing internet and other publicly available sources without limitation, creates uncertainty about scope.
CONTRACT AND VENDOR IMPLICATIONS: If publicly available data collection is conducted by third-party data providers or web scraping services, those relationships should be reviewed for compliance with applicable terms of service and privacy laws. Data provenance documentation is increasingly expected by regulators reviewing AI training data practices.
COMPLIANCE CONSIDERATIONS: Compliance teams should document the categories of publicly available data collected, the sources, and the legal basis under GDPR and applicable US law. GDPR Article 14 notification obligations should be assessed and, if applicable, a mechanism for providing those notifications should be developed. Intellectual property review of training data sources should also be considered given current litigation trends in this area.
ConductAtlas is an independent monitoring service. We are not affiliated with, endorsed by, or sponsored by Character.AI.