What is the severity of this clause?

ConductAtlas classifies this AI/ML Model Training Data Use clause as high severity. Severity reflects the magnitude of rights affected, the breadth of users impacted, and the degree of discretion the platform retains.

AI/ML Model Training Data Use — GitHub


Recent governance activity

GitHub recorded 2 documented changes in the last 30 days.


This analysis describes what GitHub's agreement states, permits, or reserves. It does not constitute a legal determination about enforceability. Regulatory applicability and practical outcomes may vary by jurisdiction, enforcement context, and individual circumstances.

ConductAtlas Analysis

Why it matters (compliance & governance perspective)

The clause establishes the operational scope of GitHub's data use for model development and product improvement, while providing a mechanism for users to restrict participation in AI training activities through account settings.

Recent Activity

This document changed recently

High Apr 28, 2026

The updated terms now explicitly authorize GitHub to collect AI outputs generated within the platform alongside user-provided code and content, and to share personal data with Microsoft and other GitHub affiliates for purposes including training and improving artificial intelligence and machine learning technologies. The privacy statement indicates that aggregate and de-identified data will be used where feasible, but the updated language establishes broader authority for affiliate data sharing and AI model development than the previous version stated. The revised terms also remove the specific disclosure of the conditions under which GitHub personnel may access private repositories and replace it with a cross-reference to the Terms of Service. As a result, the scope of internal GitHub access to private repositories is now defined in a separate contract document rather than in the privacy statement itself.


Clause Stability Stable

Changes

Months Monitored

First Seen

May 7, 2026

Last Seen

May 7, 2026

This clause type exists across 261 other provisions on other platforms.

Consumer impact (what this means for users)

Under this provision, personal data is used to train machine learning models unless the user affirmatively opts out through their settings. Because the mechanism is an opt-out, data use for AI training is the default: a user who takes no action remains subject to it, and only a user who changes the setting is excluded.

How other platforms handle this

DeepL Medium

To improve the quality of our services, we analyse texts submitted for translation. We ensure that this analysis cannot be traced back to individual users by anonymising the data before analysis. DeepL Pro subscribers' texts are not used to train our machine translation systems.

Roblox Medium

We are simplifying our Terms of Use, including clarifications around the use of AI tools, and their data use. We have moved the terms that describe AI Features, which were previously written for a Creator audience and located under the AI-Based Tools Supplemental Terms and Disclaimer, into the User ...

Mistral AI Medium

Data publicly available on the Internet. Our artificial intelligence models are trained on data that is publicly available on the Internet by third parties, which may contain personal data, even if we use good practices to filter out such personal data. [...] Training Datasets. In some cases, we acc...



Original Clause Language

"We may use the personal data we collect to improve our Services, develop new Services, and conduct research. This includes using the data to train and improve AI and machine learning models for features like GitHub Copilot. You can opt out of your personal data being used to train these models by adjusting your settings."

Excerpt from the GitHub Privacy Statement

Applicable regulations

Provision details

Document information

Document

GitHub Privacy Statement

Entity

GitHub

Document last updated

May 5, 2026

Tracking information

First tracked

May 10, 2026

Last verified

May 12, 2026

Record ID

CA-P-005601

Document ID

CA-D-00254

Evidence Provenance

Source URL

https://docs.github.com/en/site-policy/privacy-policies/github-general-privacy-statement


Content hash (SHA-256)

d21b58443ca0b4402240dbd06996ada072c72ed842fcccc6b13acab2d7bc6c4d

Analysis generated

May 10, 2026 09:46 UTC

Methodology

summarize_document-v8

Evidence

✓ Snapshot stored ✓ Hash verified
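A reader holding their own copy of the snapshot could check it against the recorded content hash. The following is a minimal Python sketch, not ConductAtlas's verification procedure: it assumes the hash was computed over the raw snapshot bytes (the record does not say how the content was normalized before hashing), and the helper names are hypothetical.

```python
import hashlib
import hmac

# Full SHA-256 value as listed under "Content hash (SHA-256)" in this record.
RECORDED_SHA256 = "d21b58443ca0b4402240dbd06996ada072c72ed842fcccc6b13acab2d7bc6c4d"


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_hash(data: bytes, recorded: str = RECORDED_SHA256) -> bool:
    """Compare a snapshot's digest with the recorded one.

    Assumption: the recorded hash covers the raw bytes exactly as stored.
    """
    return hmac.compare_digest(sha256_hex(data), recorded.lower())


def matches_truncated(full_hex: str, truncated: str) -> bool:
    """Check a shortened hash (as shown in a citation record) against a full one."""
    prefix = truncated.rstrip("…").rstrip(".")
    return full_hex.lower().startswith(prefix.lower())
```

A mismatch would not by itself show that the snapshot was altered, since a different normalization step (whitespace, encoding, stripped markup) would also change the digest.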

Citation Record

Entity: GitHub
Document: GitHub Privacy Statement
Record ID: CA-P-005601
Captured: 2026-05-10 09:46:36 UTC
SHA-256: d21b58443ca0b440…
URL: https://conductatlas.com/platform/github/github-privacy-statement/aiml-model-training-data-use/
Accessed: June 29, 2026

Permanent archival reference. Stable identifier suitable for legal filings, compliance documentation, and research citation.

Classification

Severity

High

Other risks in this policy

Data Collection Scope medium
Affiliate Data Sharing with Microsoft high
AI Product Improvement and Copilot Data Use high
Public Repository Content Visibility medium
International Data Transfers and Standard Contractual Clauses high
User Rights: Access, Deletion, Portability, and Objection medium

Related Analysis

Three AI Governance Restructuring Patterns ConductAtlas Detected in May 2026
How Meta, TikTok, and Supabase restructured governance language across documents, jurisdictions, and consent frameworks through incremental document updates.
AI Training Data Provisions Across Major Platforms: A Provision-Level Comparison
How 10 AI platforms describe the use of user data for model training, improvement, and development, based on archived governance provisions.


Frequently Asked Questions

What does GitHub's AI/ML Model Training Data Use clause do?

How does this clause affect you?

Is ConductAtlas affiliated with GitHub?

No. ConductAtlas is an independent monitoring service. We are not affiliated with, endorsed by, or sponsored by GitHub.