How does ConductAtlas capture platform policy documents?

ConductAtlas runs daily monitoring across 352+ platforms. Each tracked document is fetched, hashed with SHA-256, compared to the prior version, and archived with a UTC timestamp, the HTTP headers, the source URL, and an independent Wayback Machine backup where available. Documents that require JavaScript rendering are captured via headless Chromium (Playwright).

How are policy changes detected?

Every captured document is cryptographically hashed. If the hash matches the prior version, a verification event is logged and no new version is created. If the hash differs, a textual diff confirms that the change is substantive (rather than a dynamically generated page element). Substantive changes produce a new document version and trigger classification and notification.

How are provisions classified?

Each document is parsed into individual provisions — distinct clauses creating rights, obligations, or limitations. Provisions are classified into 20 canonical types (arbitration, data collection, liability, etc.) and assigned severity (high/medium/low) based on magnitude of rights affected, opt-out availability, user breadth, legal exposure, and platform discretion.

What are ConductAtlas record IDs?

Every record in the archive has a stable permanent identifier for citation: CA-E-XXXXXX for entities, CA-D-XXXXXX for documents, CA-V-XXXXXX for versions, CA-C-XXXXXX for changes, and CA-P-XXXXXX for provisions. IDs are never reassigned and can be used in legal filings, academic research, and regulatory submissions.

Is ConductAtlas data legally citable?

ConductAtlas records are designed to meet citation-grade standards: SHA-256 cryptographic hash, UTC timestamp, stable permalink, HTTP metadata, Wayback Machine independent backup, and structured cite-as format. Records support legal and compliance review. ConductAtlas is not a law firm and does not provide legal opinions.

How are errors in classification corrected?

If you identify a misclassification, outdated capture, missing provision, or factual inaccuracy, email contact@conductatlas.com. Reports are investigated promptly. Where appropriate, records are annotated with correction notes that preserve both the original and updated state.

What technology stack does ConductAtlas use?

The application and database run on Railway (United States). Documents are stored on Cloudflare R2. AI analysis uses Anthropic Claude models. Email is delivered via Resend, payments are processed by Stripe, and analytics run on Plausible Analytics (EU, cookieless).

Is ConductAtlas independent from tracked platforms?

Yes. ConductAtlas is independent — not affiliated with, endorsed by, or sponsored by any tracked platform. Revenue comes from subscription fees (Monitor, Analyst, Compliance, Enterprise). We do not accept advertising, platform partnerships, or data brokerage income.

Methodology

How ConductAtlas captures, verifies, archives, and classifies platform policy documents. This page describes our data standards, archival methodology, and the structure of the evidence we provide.

Why this page exists: Compliance teams, legal counsel, and researchers need to know where our data comes from, how we verify it, and why our records are citable. This page answers those questions in detail.

What we monitor

ConductAtlas tracks publicly available policy documents from more than 170 platforms, including terms of service, privacy policies, community guidelines, acceptable use policies, fee schedules, cookie policies, data processing addenda, and related governance documents. Our focus is on consumer-facing and business-facing platforms in categories including financial services, consumer technology, social media, AI services, healthcare technology, and e-commerce.

We capture every new version of a monitored document as it changes. We do not edit or modify the source content — we archive it exactly as the platform published it.

Platforms monitored: 360+
Documents archived: 850+
Provisions classified: 61,310+
Capture frequency: Daily

The capture process

Continuous monitoring

Our monitoring service runs daily, with every tracked document fetched and compared against the prior version on record. Most captures run in a single batch at approximately 06:00 UTC; time-sensitive documents can be captured more frequently.

For documents rendered by JavaScript or protected by anti-bot measures, we use a headless browser environment (Playwright) that renders pages the same way a real browser would, including executing JavaScript and handling modern web application frameworks.

Change detection

Every captured document is hashed using SHA-256. If the hash matches the prior version, we record a verification event but do not create a new version row. If the hash differs, we perform a textual diff to confirm the change is substantive (rather than, for example, a dynamically generated element unrelated to policy content). Substantive changes produce a new document version and trigger downstream processing.
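
The hash-then-diff flow described above can be sketched in a few lines of Python. This is a minimal illustration, not ConductAtlas's actual implementation: the function names, the `is_noise` filter, and the `"non_substantive"` outcome label are all assumptions introduced for the example, since this page does not specify how non-substantive differences are identified or recorded.

```python
import difflib
import hashlib


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 digest of the raw captured bytes as a hex string."""
    return hashlib.sha256(content).hexdigest()


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return only the added ("+ ") and removed ("- ") lines between two versions."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]


def classify_capture(previous: bytes, current: bytes, is_noise=lambda line: False) -> str:
    """Classify a fresh capture against the prior version on record.

    is_noise is a hypothetical predicate that flags lines unrelated to
    policy content (for example, a per-request token in the markup).
    """
    if sha256_hex(previous) == sha256_hex(current):
        return "verified"  # identical bytes: record a verification event only

    lines = changed_lines(
        previous.decode("utf-8", errors="replace"),
        current.decode("utf-8", errors="replace"),
    )
    # Drop the two-character ndiff prefix before applying the noise filter.
    substantive = [line for line in lines if not is_noise(line[2:])]
    return "new_version" if substantive else "non_substantive"
```

Hashing first keeps the common case (no change) cheap, so the line-level diff only runs when the bytes actually differ.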

Evidence preservation

Every captured version is stored with:

The raw HTML or PDF content, unmodified
A SHA-256 cryptographic hash of the content
Timestamp of capture in UTC
Source URL at the time of capture
HTTP response headers and status
A reference to an independent archive (Wayback Machine) where available

This evidence chain is designed for legal and regulatory use. A document captured on April 14, 2026, with a specific SHA-256 hash can be independently verified against the Wayback Machine and against the platform's own historical record, if the platform maintains one.
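
As a rough illustration of the evidence fields listed above, the following Python sketch models a captured version and checks an independently obtained copy against its stored hash. The class name, field names, and helper functions are hypothetical and chosen for the example; they are not the ConductAtlas schema or API.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CaptureRecord:
    content: bytes                      # raw HTML or PDF bytes, unmodified
    sha256: str                         # hex digest of content
    captured_at: datetime               # capture time in UTC
    source_url: str                     # URL at the time of capture
    http_status: int
    http_headers: dict[str, str]
    wayback_url: Optional[str] = None   # independent archive reference, where available


def make_record(content: bytes, source_url: str, http_status: int,
                http_headers: dict[str, str],
                wayback_url: Optional[str] = None) -> CaptureRecord:
    """Build a record, computing the hash and UTC timestamp at capture time."""
    return CaptureRecord(
        content=content,
        sha256=hashlib.sha256(content).hexdigest(),
        captured_at=datetime.now(timezone.utc),
        source_url=source_url,
        http_status=http_status,
        http_headers=dict(http_headers),
        wayback_url=wayback_url,
    )


def verify(record: CaptureRecord, independent_copy: bytes) -> bool:
    """Return True if an independently obtained copy matches the stored hash."""
    return hashlib.sha256(independent_copy).hexdigest() == record.sha256
```

A verifier only needs the published hash and the bytes of its own copy; it does not need to trust the archive that produced the record.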

Classification and analysis

Provision extraction

Each document is parsed to identify individual provisions — distinct clauses or sections that create rights, obligations, or limitations. Provisions are extracted with their location in the source document and classified into one of 20 canonical types covering areas such as arbitration, data collection, data sharing, liability limitation, account control, platform discretion, and enforcement actions.

Severity classification

Each provision is classified as high, medium, or low severity based on its impact on users. Severity reflects factors including:

The magnitude of rights waived or obligations imposed
Whether mitigation or opt-out is available
The breadth of who is affected
The financial or legal exposure created
The degree of discretion retained by the platform

Severity classifications are produced by AI-assisted analysis using Anthropic's Claude models, reviewed against established compliance frameworks, and refined through our canonical taxonomy. We document our classification criteria and welcome feedback on specific classifications from domain experts.

Confidence architecture

Each provision includes signals describing the confidence and basis of its classification.

Claim basis indicates whether a provision's classification derives from explicit document language or is inferred from context. Provisions marked explicit document language contain direct textual support in the source document. Provisions marked inferred from context are derived from the operational implications of surrounding language.

Interpretive confidence reflects how unambiguous the clause language is. High confidence indicates clear, specific language with little room for alternative interpretation. Medium confidence indicates language that is reasonably clear but may depend on jurisdiction or enforcement context. Low confidence indicates ambiguous, vague, or context-dependent language where multiple interpretations are plausible.

Uncertainty notes appear on provisions where the analysis involves genuine ambiguity. These notes describe what is uncertain and why, rather than defaulting to false precision. An honest low-confidence flag is more valuable than certainty theater.

Confidence signals are generated by AI-assisted analysis and reflect the model's assessment of the source language. They are not legal determinations.
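
One way to picture how these signals sit alongside a classified provision is the Python sketch below. The type and field names are assumptions made for illustration, and `needs_caution` is a hypothetical reader-side helper rather than anything ConductAtlas provides. The provision type is left as a free-form string because the full list of 20 canonical types is not enumerated on this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


class ClaimBasis(Enum):
    EXPLICIT = "explicit document language"
    INFERRED = "inferred from context"


class Confidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class Provision:
    provision_type: str                     # e.g. "arbitration" (one of the canonical types)
    severity: Severity
    claim_basis: ClaimBasis
    interpretive_confidence: Confidence
    uncertainty_note: Optional[str] = None  # present only where ambiguity is flagged


def needs_caution(p: Provision) -> bool:
    """Flag provisions a reader may want to check against the source text."""
    return (
        p.claim_basis is ClaimBasis.INFERRED
        or p.interpretive_confidence is Confidence.LOW
        or p.uncertainty_note is not None
    )
```

Keeping claim basis and interpretive confidence as separate fields lets a consumer distinguish "the text says this, but vaguely" from "the text implies this".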

Interpretive boundaries

ConductAtlas maps governance language to potentially relevant regulatory frameworks. Throughout the platform, interpretive boundary notices clarify the scope of our analysis. ConductAtlas analysis describes what agreements state, authorize, or permit. It does not determine enforceability, legality, or operational implementation. Regulatory applicability may vary by jurisdiction, enforcement context, and individual circumstances.

Corrections and revisions

ConductAtlas maintains a public corrections log documenting substantive revisions to classifications, regulatory mappings, severity assessments, and interpretive analysis. Serious governance infrastructure evolves transparently. When corrections are made, the original classification and the reason for revision are preserved in the public record.

Institutional analysis

For changes to documents in regulated sectors, we produce institutional-level analysis including regulatory exposure mapping (GDPR, CCPA, CPRA, HIPAA, FTC, SEC, FINRA, and other applicable frameworks), enforcement history context where publicly available, and comparison to peer platforms' approaches. This analysis is intended to support — not replace — legal and compliance review.

Record identifiers

Every record in the ConductAtlas archive has a stable identifier suitable for citation in legal filings, academic research, and regulatory submissions. These IDs are permanent and never reassigned.

CA-E-XXXXXX: Entity record (a monitored platform or organization)
CA-D-XXXXXX: Document record (a canonical policy document that is tracked over time)
CA-V-XXXXXX: Version record (a specific captured version of a document on a specific date)
CA-C-XXXXXX: Change record (a detected and verified change between two versions)
CA-P-XXXXXX: Provision record (a specific clause within a document)

Every public page on ConductAtlas displays the relevant record ID, and our API returns these identifiers in all responses.
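
For tools that consume these identifiers, a small parser can map the prefix letter to its record kind. The Python sketch below is illustrative only. It assumes the XXXXXX placeholder stands for exactly six ASCII letters or digits, which this page does not specify, and the function and type names are hypothetical.

```python
import re
from typing import NamedTuple

# Prefix letters and their record kinds, as listed above.
RECORD_KINDS = {
    "E": "entity",
    "D": "document",
    "V": "version",
    "C": "change",
    "P": "provision",
}

# Assumption: the suffix is six ASCII letters or digits.
_RECORD_ID = re.compile(r"CA-([EDVCP])-([A-Za-z0-9]{6})")


class RecordId(NamedTuple):
    kind: str
    suffix: str


def parse_record_id(value: str) -> RecordId:
    """Parse an identifier such as 'CA-V-004217' into its kind and suffix."""
    match = _RECORD_ID.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a recognized record ID: {value!r}")
    return RecordId(RECORD_KINDS[match.group(1)], match.group(2))
```

Rejecting unknown prefixes with an error, rather than returning a default, keeps malformed citations from silently resolving to the wrong record kind.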

Data quality

Where we fall short

We believe in transparency about limitations. ConductAtlas is a young project and our coverage is uneven.

Full text extraction is not complete for every archived provision. We are actively backfilling provision-level excerpts. Pages where clause text is pending extraction are labeled as such.
Platform coverage is growing. If a platform you care about is not in our archive, email contact@conductatlas.com; we prioritize additions based on demand.
AI-assisted classification has error rates. Edge cases, novel provision types, and documents with unusual structure can produce classification errors. We welcome corrections.
Jurisdictional coverage is currently US-primary, with growing coverage of EU, UK, and global platforms. Analysis is more reliable for platforms operating under US law.

How we correct errors

If you identify an error, such as a misclassification, an outdated capture, a missing provision, or a factual inaccuracy in our analysis, email contact@conductatlas.com. We investigate reports promptly and, where appropriate, annotate the record with a correction note that preserves both the original and updated state.

Technology and infrastructure

For transparency about the systems that process your data and our archive:

Application and database: Railway (United States)
Document storage: Cloudflare R2 (global, US-primary)
AI analysis: Anthropic Claude models (United States)
Email delivery: Resend (United States)
Payment processing: Stripe (United States)
Analytics: Plausible Analytics (European Union, privacy-friendly, no cookies)

Full details of how we handle your personal data are in our Privacy Policy.

Independence

ConductAtlas is independent. We are not affiliated with, endorsed by, or sponsored by any of the platforms we track. Platform names and marks are property of their respective owners. We do not receive compensation from platforms for coverage, and no platform has editorial control over our analysis.

Our business model is subscription-based: free for individuals, paid tiers for professionals and institutions. We do not monetize through advertising, data brokerage, or platform partnerships.

Research access

Academic researchers, journalists, nonprofit advocacy organizations, and public-interest projects can request free or discounted access to ConductAtlas. Email contact@conductatlas.com with a brief description of your use case.

Contact and corrections

For methodology questions, corrections, coverage requests, or research partnerships: contact@conductatlas.com.

This methodology document is itself versioned. As our processes evolve, we update this page and record the update date. Suggestions for improvement are welcome.
