The model card metadata schema includes a structured evaluation results section that allows model publishers to report benchmark performance metrics linked to specific tasks, datasets, and configuration parameters. These structured results are parsed by the Hub and used to populate model comparison and leaderboard features.
This analysis describes what Hugging Face's agreement states, permits, or reserves. It does not constitute a legal determination about enforceability. Regulatory applicability and practical outcomes may vary by jurisdiction, enforcement context, and individual circumstances.
This provision establishes the structured format through which model performance claims are disclosed and indexed on the Hub, making the accuracy and completeness of evaluation result fields relevant to how users and automated systems assess and compare model capabilities.
Interpretive note: The document describes the evaluation results schema but does not specify verification standards or accuracy obligations for reported metric values.
Severity downgraded from medium to low, and guidance shifted from general description to specific YAML field structure (model-index) with detailed subfield requirements.
Under this framework, structured evaluation results in model card metadata are surfaced in Hub search and comparison features, meaning users may rely on these fields when selecting models for specific tasks. The document does not state that Hugging Face independently verifies or audits the accuracy of reported evaluation metrics.
"model-index contains results which is a list of evaluation results. Each result includes: task, dataset, and metrics fields. The metrics field contains a list of metric results. Each metric result includes: type, value, name, and config fields.— Excerpt from Hugging Face's Hugging Face Model Card Guidelines
(1) REGULATORY LANDSCAPE: Accuracy of evaluation result claims in model card metadata may engage FTC guidance on truthful representation of AI system performance, particularly where metric values are used in commercial contexts to represent model capabilities. EU AI Act provisions on technical documentation for high-risk AI systems may also require verified performance documentation beyond self-reported Hub metadata.
(2) GOVERNANCE EXPOSURE: Medium. Self-reported evaluation metrics that are inaccurate, selectively reported, or based on non-standard configurations could mislead downstream users conducting model selection due diligence, creating potential misrepresentation exposure for model publishers.
(3) JURISDICTION FLAGS: Commercial AI deployments in EU/EEA jurisdictions where model performance claims influence purchasing or deployment decisions may face scrutiny under consumer protection and AI transparency regulations if evaluation metrics are inaccurate or incomplete.
(4) CONTRACT AND VENDOR IMPLICATIONS: Enterprise teams should treat Hub evaluation results metadata as a starting reference rather than independently verified performance data; third-party evaluation or internal validation testing should be conducted for models used in production or regulated applications.
(5) COMPLIANCE CONSIDERATIONS: Organizations should document their own evaluation methodology and results for AI models used in regulated applications, supplementing any Hub model card evaluation data with internally verified benchmarks appropriate to their specific use case and risk profile.
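As one illustration of the internal validation described in items (4) and (5), a team might compare the metric values reported in a model card against values it measured itself. The Python sketch below is a hypothetical helper, not part of any Hugging Face tooling. The function name `compare_metrics`, the sample numbers, and the default tolerance of 0.01 are all assumptions chosen for the example; an appropriate tolerance depends on the metric and the use case.

```python
def compare_metrics(
    reported: dict[str, float],
    measured: dict[str, float],
    tolerance: float = 0.01,  # arbitrary illustrative threshold
) -> dict[str, str]:
    """Label each reported metric as 'match', 'mismatch', or 'unverified'."""
    outcome: dict[str, str] = {}
    for name, claimed in reported.items():
        if name not in measured:
            # No internal measurement exists for this claim yet.
            outcome[name] = "unverified"
        elif abs(claimed - measured[name]) <= tolerance:
            outcome[name] = "match"
        else:
            outcome[name] = "mismatch"
    return outcome


# Hypothetical values: "reported" stands in for self-reported model card
# metrics, "measured" for results of an internal evaluation run.
reported = {"accuracy": 0.91, "f1": 0.88, "recall": 0.80}
measured = {"accuracy": 0.905, "f1": 0.80}

print(compare_metrics(reported, measured))
# {'accuracy': 'match', 'f1': 'mismatch', 'recall': 'unverified'}
```

Keeping the "unverified" label separate from "mismatch" makes gaps in internal testing visible instead of silently treating untested claims as confirmed.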
ConductAtlas is an independent monitoring service. We are not affiliated with, endorsed by, or sponsored by Hugging Face.