Character.AI collects publicly available information from the internet to train its AI models, in addition to data collected directly from users.
This analysis describes what Character.AI's agreement states, permits, or reserves. It does not constitute a legal determination about enforceability. Regulatory applicability and practical outcomes may vary by jurisdiction, enforcement context, and individual circumstances.
The use of publicly available internet data for commercial AI model training has become a subject of regulatory and legal scrutiny, including questions about intellectual property rights and whether publicly available data retains privacy protections under applicable law.
Interpretive note: The policy does not specify what types of publicly available data are collected or from which sources, creating uncertainty about the scope of this collection practice and the applicable compliance obligations.
This new provision clarifies that Character.AI uses publicly available internet data for model training, expanding the scope of data sources beyond user-provided content.
Information about you that is publicly available online may be collected and used by Character.AI for AI model training purposes, beyond what you directly provide to the platform.
How other platforms handle this
We may share your personal information with our affiliates, meaning entities that control, are controlled by, or are under common control with Consensys. We also share information with service providers who assist in operating our services, subject to confidentiality obligations.
At Ledger, earning and maintaining our users' trust is a top priority. That's why we are deeply committed not only to protecting your privacy and securing your personal data, but also to being fully transparent about how we handle it.
RedCard. We share information with our financial partners to operate the Target RedCard program.
Monitoring
Character.AI has changed this document before.
"We also collect information that is available on the Internet or from other publicly available sources to evaluate and improve our Services, including for model training and development." (Excerpt from Character.AI's Character.ai Privacy Policy)
REGULATORY LANDSCAPE: The collection of publicly available data for AI model training engages GDPR Article 6 lawful basis requirements and Article 14 transparency obligations for data not collected directly from data subjects, as well as emerging EU AI Act training data governance provisions. In the US, this practice interacts with FTC guidance on commercial data practices and state privacy law definitions of personal information. The European Data Protection Board has issued guidance relevant to whether publicly available data retains personal data status under GDPR.
GOVERNANCE EXPOSURE: Medium. Scraping publicly available data for AI model training is a widespread industry practice but has attracted regulatory scrutiny in the EU regarding GDPR Article 14 notification obligations and in the UK from the ICO. The policy's brief disclosure does not specify what types of publicly available data are collected or from which sources, limiting the ability to assess compliance exposure without additional information.
JURISDICTION FLAGS: EU and UK users whose information appears in publicly available sources may have Article 14 notification rights under GDPR that require the data controller to provide transparency disclosures within a reasonable time. California users may have CCPA rights over personal information collected from public sources depending on how the data is categorized. The breadth of the disclosure, referencing internet and other publicly available sources without limitation, creates uncertainty about scope.
CONTRACT AND VENDOR IMPLICATIONS: If publicly available data collection is conducted by third-party data providers or web scraping services, those relationships should be reviewed for compliance with applicable terms of service and privacy laws. Data provenance documentation is increasingly expected by regulators reviewing AI training data practices.
COMPLIANCE CONSIDERATIONS: Compliance teams should document the categories of publicly available data collected, the sources, and the legal basis under GDPR and applicable US law. GDPR Article 14 notification obligations should be assessed and, if applicable, a mechanism for providing those notifications should be developed. Intellectual property review of training data sources should also be considered given current litigation trends in this area.
ConductAtlas is an independent monitoring service. We are not affiliated with, endorsed by, or sponsored by Character.AI.