AI/ML Model Training Data Use

High severity Unique · 0 of 343 platforms
Recent governance activity: GitHub recorded 2 documented changes in the last 30 days.

This analysis describes what GitHub's agreement states, permits, or reserves. It does not constitute a legal determination about enforceability. Regulatory applicability and practical outcomes may vary by jurisdiction, enforcement context, and individual circumstances.

ConductAtlas Analysis

Why it matters (compliance & governance perspective)

The clause establishes the operational scope of GitHub's data use for model development and product improvement, while providing a mechanism for users to restrict participation in AI training activities through account settings.

Recent Activity

This document changed recently

High Apr 28, 2026

The updated terms now explicitly authorize GitHub to collect AI outputs generated within the platform alongside user-provided code and content, and to share personal data with Microsoft and other GitHub affiliates for purposes including training and improving artificial intelligence and machine learning technologies. The privacy statement indicates that aggregate and de-identified data will be used where feasible, but the updated language establishes broader authority for affiliate data sharing and AI model development than the previous version did. The revised terms also remove the specific disclosure of the conditions under which GitHub personnel may access private repositories, replacing that detail with a cross-reference to the Terms of Service. As a result, the scope of internal GitHub access to private repositories is now defined in a separate contract document rather than in the privacy statement itself.

Clause Stability: Stable
Changes: 0
Months Monitored: 3
First Seen: May 7, 2026
Last Seen: May 7, 2026
This clause type exists across 261 other provisions on other platforms.

Consumer impact (what this means for users)

Under this provision, personal data is used to train machine learning models unless the user affirmatively opts out through their settings. Because the mechanism is an opt-out rather than an opt-in, use of data for AI training is the default practice and applies whenever the user takes no action.

How other platforms handle this

DeepL Medium

To improve the quality of our services, we analyse texts submitted for translation. We ensure that this analysis cannot be traced back to individual users by anonymising the data before analysis. DeepL Pro subscribers' texts are not used to train our machine translation systems.

Roblox Medium

We are simplifying our Terms of Use, including clarifications around the use of AI tools, and their data use. We have moved the terms that describe AI Features, which were previously written for a Creator audience and located under the AI-Based Tools Supplemental Terms and Disclaimer, into the User ...

Mistral AI Medium

Data publicly available on the Internet. Our artificial intelligence models are trained on data that is publicly available on the Internet by third parties, which may contain personal data, even if we use good practices to filter out such personal data. [...] Training Datasets. In some cases, we acc...

Original Clause Language

"We may use the personal data we collect to improve our Services, develop new Services, and conduct research. This includes using the data to train and improve AI and machine learning models for features like GitHub Copilot. You can opt out of your personal data being used to train these models by adjusting your settings."

Excerpt from the GitHub Privacy Statement

Applicable regulations

EU AI Act (European Union)
Colorado AI Act (US-CO)
GDPR (European Union)
Texas AI Act (Texas, USA)
UK GDPR (United Kingdom)

Provision details

Document information
Document: GitHub Privacy Statement
Entity: GitHub
Document last updated: May 5, 2026

Tracking information
First tracked: May 10, 2026
Last verified: May 12, 2026
Record ID: CA-P-005601
Document ID: CA-D-00254

Evidence Provenance
Content hash (SHA-256): d21b58443ca0b4402240dbd06996ada072c72ed842fcccc6b13acab2d7bc6c4d
Analysis generated: May 10, 2026 09:46 UTC
Evidence: ✓ Snapshot stored, ✓ Hash verified

Citation Record
Entity: GitHub
Document: GitHub Privacy Statement
Record ID: CA-P-005601
Captured: 2026-05-10 09:46:36 UTC
SHA-256: d21b58443ca0b440…
URL: https://conductatlas.com/platform/github/github-privacy-statement/aiml-model-training-data-use/
Accessed: June 29, 2026
Permanent archival reference. Stable identifier suitable for legal filings, compliance documentation, and research citation.
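
A reader who holds a copy of the archived snapshot could check it against the content hash recorded above. The following is a minimal sketch only: the record does not say which bytes the hash was computed over (raw HTML, extracted text, or a normalized form), so the assumption here is that it covers the snapshot file's raw bytes. The function names are hypothetical and are not part of any ConductAtlas tooling.

```python
import hashlib

# Full digest as published in the Evidence Provenance record above.
RECORDED_SHA256 = "d21b58443ca0b4402240dbd06996ada072c72ed842fcccc6b13acab2d7bc6c4d"


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hexadecimal SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_hash(snapshot: bytes, recorded_hex: str = RECORDED_SHA256) -> bool:
    """Compare the snapshot's digest with a recorded digest, ignoring case and padding.

    Assumption: the recorded hash was computed over the snapshot's raw bytes.
    """
    return sha256_hex(snapshot) == recorded_hex.strip().lower()


def matches_truncated_hash(snapshot: bytes, prefix: str) -> bool:
    """Check a shortened digest, such as the 16-character form in the citation record."""
    return sha256_hex(snapshot).startswith(prefix.strip().lower())
```

In use, the snapshot bytes would be read from a locally saved copy (for example with `pathlib.Path(...).read_bytes()`) and passed to `matches_recorded_hash`. A prefix match via `matches_truncated_hash` is much weaker evidence than a full-digest match and is best treated as a quick sanity check.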
Classification
Severity: High

Frequently Asked Questions

Is ConductAtlas affiliated with GitHub?

No. ConductAtlas is an independent monitoring service. We are not affiliated with, endorsed by, or sponsored by GitHub.