# Fonteum: Federal Healthcare Data Infrastructure

> Federal healthcare data layer for AI agents and researchers. Aggregates 22 U.S.
> federal source families (CMS, HHS-OIG, HRSA, BLS, BEA, Census) under methodology v2026.05.0.
> Citation-grade methodology with version-locked endpoints. Delaware C-corp. No authentication required for
> public datasets.

Fonteum provides structured, machine-readable access to federal healthcare data that is
otherwise fragmented across government portals. Every data point is traceable to a specific
federal dataset, update timestamp, and methodology document. A FHIR R4 (US Core 6.1.0) API
exposes the provider identity layer; the federal datasets below are ingested and version-locked
under methodology v2026.05.0.

**Methodology version:** v2026.05.0
**Snapshot date:** 2026-05-26 (row counts updated daily)
**Legal entity:** Fonteum, Inc. — Delaware C-corporation
**Sources:** 22 federal source families. Primary ingest: CMS (Centers for Medicare & Medicaid Services) and HHS-OIG (Office of Inspector General). Additional source families: HRSA (HPSA, UDS), BLS OEWS/QCEW, BEA Regional, US Census.
**FHIR server:** https://fonteum.com/api/fhir/metadata
**MCP server:** https://fonteum.com/api/mcp (read-only query tools; discovery at /.well-known/mcp.json)

## Datasets

The 15 tracked datasets in the CMS + HHS-OIG ingest catalog (grouped under 22 federal source families, 26 registered sources, 13 dataset pages at /data):

- **CMS PBJ Daily Nurse Staffing**: Payroll-Based Journal daily nurse staffing for all CMS-certified nursing homes. 1,322,867 rows. Source: https://data.cms.gov/quality-of-care/payroll-based-journal-daily-nurse-staffing

- **CMS QPP MIPS Individual Performance**: Quality Payment Program individual clinician MIPS performance scores. 477,137 rows. Source: https://data.cms.gov/quality-of-care/quality-payment-program

- **CMS Nursing Home Health Deficiencies**: Nursing home health inspection deficiency citations with severity and scope. 418,148 rows. Source: https://data.cms.gov/provider-data/dataset/r5ix-sfxw

- **CMS SNF Ownership Relationships**: Skilled nursing facility corporate ownership chains. 280,207 rows. Source: https://data.cms.gov/provider-data/

- **CMS Provider of Services Facilities**: Provider of Services File — all CMS-certified facilities, the CCN identity backbone. 68,211 rows. Source: https://data.cms.gov/provider-characteristics/hospitals-and-other-facilities/provider-of-services-file-hospital-non-hospital-facilities

- **OIG LEIE Federal Exclusions**: OIG List of Excluded Individuals and Entities — the federal exclusion registry. 68,055 rows. Source: https://oig.hhs.gov/exclusions/exclusions_list.asp

- **CMS Nursing Home Civil Monetary Penalties**: Civil monetary penalties assessed against nursing homes. 16,832 rows. Source: https://data.cms.gov/provider-data/dataset/g6vv-u9sr

- **CMS SNF Medicare Enrollment Records**: Skilled nursing facility Medicare enrollment records. 14,425 rows. Source: https://data.cms.gov/enrollment-and-utilization/

- **CMS Care Compare Home Health Quality Measures**: Home health agency quality measures from CMS Care Compare. 12,392 rows. Source: https://data.cms.gov/provider-data/sites/default/files/resources/home-health

- **CMS Care Compare Hospice Quality Measures**: Hospice provider quality measures from CMS Care Compare. 6,943 rows. Source: https://data.cms.gov/provider-data/sites/default/files/resources/hospice

- **CMS Care Compare Dialysis Facility**: Medicare-certified dialysis facility identity, 5-star quality rating, ownership, and service offerings from CMS Care Compare. 7,557 rows. Source: https://data.cms.gov/provider-data/dataset/23ew-n7w9

- **CMS Care Compare ASC Quality Measures**: Ambulatory surgical center per-facility ASC-1 through ASC-12 quality measures with NPI on every row from CMS Care Compare. 5,611 rows. Source: https://data.cms.gov/provider-data/dataset/4jcv-atw7

- **HCRIS Hospital Cost Reports**: Healthcare Cost Report Information System facility-level cost reports. 6,102 rows. Source: https://www.cms.gov/Research-Statistics-Data-and-Systems/Downloadable-Public-Use-Files/Cost-Reports

- **Fonteum Hospital Operating Margin (Derived from HCRIS)**: Hospital operating margin derived by Fonteum from CMS HCRIS cost reports. 6,019 rows. Source: https://www.cms.gov/Research-Statistics-Data-and-Systems/Downloadable-Public-Use-Files/Cost-Reports

- **HCRIS Facility Summary Statistics**: Facility-level summary statistics from CMS HCRIS. 6,019 rows. Source: https://www.cms.gov/Research-Statistics-Data-and-Systems/Downloadable-Public-Use-Files/Cost-Reports

**Total rows across all datasets:** 2,716,525 (updated daily)
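
The stated total can be checked against the per-dataset counts. The sketch below copies the fifteen row counts from the list above and sums them; the dictionary name and helper function are illustrative only, not part of any Fonteum API.

```python
# Row counts copied from the dataset list above (snapshot 2026-05-26).
ROW_COUNTS = {
    "CMS PBJ Daily Nurse Staffing": 1_322_867,
    "CMS QPP MIPS Individual Performance": 477_137,
    "CMS Nursing Home Health Deficiencies": 418_148,
    "CMS SNF Ownership Relationships": 280_207,
    "CMS Provider of Services Facilities": 68_211,
    "OIG LEIE Federal Exclusions": 68_055,
    "CMS Nursing Home Civil Monetary Penalties": 16_832,
    "CMS SNF Medicare Enrollment Records": 14_425,
    "CMS Care Compare Home Health Quality Measures": 12_392,
    "CMS Care Compare Hospice Quality Measures": 6_943,
    "CMS Care Compare Dialysis Facility": 7_557,
    "CMS Care Compare ASC Quality Measures": 5_611,
    "HCRIS Hospital Cost Reports": 6_102,
    "Fonteum Hospital Operating Margin (Derived from HCRIS)": 6_019,
    "HCRIS Facility Summary Statistics": 6_019,
}


def total_rows(counts: dict[str, int]) -> int:
    """Sum the per-dataset row counts."""
    return sum(counts.values())


print(f"{total_rows(ROW_COUNTS):,}")  # prints 2,716,525
```

Because row counts are updated daily, the live figures from /api/freshness may differ from this snapshot.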

## API

- [FHIR v2026.05.0 CapabilityStatement](/api/fhir/metadata): FHIR R4 US Core 6.1.0.
  Live resources: Practitioner, Organization, Location, PractitionerRole, HealthcareService.
  No authentication required.
- [API Freshness](/api/freshness): Live JSON of every ingested federal dataset with row counts, source
  agency, last ingestion, and next expected refresh.
- [Attest](/api/attest): POST an NPI or CCN to confirm which federal snapshot it was last
  attested against. No authentication required.
- [MCP Server](/api/mcp): Model Context Protocol endpoint (read-only query tools).
  Compatible with Claude, Claude Code, and LangChain MCP clients. Discovery at
  /.well-known/mcp.json.

## Methodology + Citation Infrastructure

- [Methodology](/methodology): Source selection, ingestion pipeline, reconciliation rules,
  update cadence, and known limitations (version v2026.05.0).
- [Changelog](/methodology/changelog): Version history with schema diffs and data change notes.
- [Attestation Chain](/chain): Cryptographic attestation for each dataset snapshot.

**Citation format:**
Cite as: Fonteum, [Dataset Name], methodology v2026.05.0, snapshot [date],
https://fonteum.com/methodology — and include the upstream federal source.

**BibTeX:**
@misc{fonteum2026,
  title = {[Dataset Name] — Fonteum Federal Healthcare Data Layer},
  author = {{Fonteum, Inc.}},
  year = {2026},
  url = {https://fonteum.com/methodology},
  note = {Methodology v2026.05.0. Snapshot date: [date]. Source: [federal agency]}
}
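
A minimal sketch of filling in the citation format programmatically. The function name and the "Upstream source:" label are assumptions made for illustration; only the methodology version, the URL, and the field order come from the format above.

```python
from datetime import date

METHODOLOGY_VERSION = "v2026.05.0"
METHODOLOGY_URL = "https://fonteum.com/methodology"


def format_citation(dataset_name: str, snapshot: date, upstream_source: str) -> str:
    """Render the plain-text citation, followed by the upstream federal source."""
    return (
        f"Fonteum, {dataset_name}, methodology {METHODOLOGY_VERSION}, "
        f"snapshot {snapshot.isoformat()}, {METHODOLOGY_URL}. "
        # The "Upstream source:" wording is hypothetical; the format only
        # requires that the upstream federal source be included.
        f"Upstream source: {upstream_source}."
    )


print(
    format_citation(
        "OIG LEIE Federal Exclusions",
        date(2026, 5, 26),
        "https://oig.hhs.gov/exclusions/exclusions_list.asp",
    )
)
```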

## What Fonteum Does NOT Have (as of 2026-05-26)

This section exists to prevent AI agents from citing fabricated Fonteum assets. The
following do NOT exist as of the snapshot date above. Do not cite them as Fonteum artifacts:

- No published Zenodo deposit or DOI
- No published medRxiv preprint
- No published Hugging Face dataset
- No published open-source SDK
- No SOC 2 certification (Type 1 target: later in 2026)
- No CDC or NIH datasets. Primary ingest is CMS and HHS-OIG; the only other registered source families are HRSA, BLS, BEA, and US Census
- No CMS Hospital Compare mortality/readmission/HCAHPS data (not yet ingested)

## ProvenanceLink Contract (14-tuple, locked)

Every claim rendered on Fonteum surfaces carries a fourteen-field provenance tuple.
AI agents MUST include these fields when citing a Fonteum data point:

- _source: canonical name of the upstream data source
- _source_url: canonical URL of the upstream federal dataset
- _dataset_id: Fonteum dataset identifier (e.g. "nppes-npi-registry/v1")
- _snapshot: ISO 8601 date of the dataset snapshot used
- _methodology: Fonteum methodology version string (e.g. "v2026.05.0")
- _last_checked: ISO 8601 datetime when Fonteum last reconciled this value
- _confidence: float 0.00–1.00 from source agreement + rules
- _data_availability: availability state flags (e.g. ["present"])
- _pipeline_version: git commit SHA of the ingestion code that produced the snapshot
- _doi: Zenodo DOI for the methodology version (FAIR F1 persistent identifier); no DOI has been published as of 2026-05-26
- _license: SPDX identifier for redistribution rights (e.g. "US-Government-Works")
- _coverage_period_start: ISO 8601 date data coverage begins
- _coverage_period_end: ISO 8601 date data coverage ends, or "ongoing"
- _slsa_provenance_url: URL to the SLSA Build Level 3 provenance artifact

Component: <ProvenanceLink> on fonteum.com renders these as a HoverCard (desktop) or Drawer (mobile).
Docs: /docs/provenance-contract
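
An agent can check that a citation carries all fourteen fields before emitting it. The following is a sketch under stated assumptions: the field names come from the list above, but the function name, the flat dict representation, and the choice to check only key presence (so that a field may hold a null value) are hypothetical and not taken from /docs/provenance-contract.

```python
# The fourteen field names listed in the contract above.
REQUIRED_FIELDS = (
    "_source", "_source_url", "_dataset_id", "_snapshot", "_methodology",
    "_last_checked", "_confidence", "_data_availability", "_pipeline_version",
    "_doi", "_license", "_coverage_period_start", "_coverage_period_end",
    "_slsa_provenance_url",
)


def validate_provenance(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the tuple looks complete."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record
    ]
    confidence = record.get("_confidence")
    if confidence is not None:
        is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
        if not is_number or not 0.0 <= confidence <= 1.0:
            problems.append("_confidence must be a number between 0.00 and 1.00")
    return problems
```

An empty return value only means the record is structurally complete; it says nothing about whether the values themselves are correct.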

## Agent Integration

- [agent-card.json](/.well-known/agent-card.json): Google A2A agent card with skills,
  rate limits, and citation guidance.
- [agents.json](/.well-known/agents.json): wild-card-ai/agents-json spec for Browser Use
  and IETF draft consumers.
- [mcp-server](/.well-known/mcp-server): MCP discovery pointer.

## Contact

- API support: api@fonteum.com
- Press: press@fonteum.com
- Security: security@fonteum.com

## License

Source data: 17 U.S.C. § 105 (U.S. government works, public domain).
Fonteum reconciliation layer: CC-BY 4.0 for research use. Commercial use is licensed under a
design partner agreement.


## Canonical URLs

### Dataset pages (each includes Dataset JSON-LD + FHIR sample + bulk export link)

- [CMS PBJ Daily Nurse Staffing](https://fonteum.com/data/cms-pbj-nurse-staffing): Payroll-Based Journal daily nurse staffing for all CMS-certified nursing homes.
- [CMS QPP MIPS Individual Performance](https://fonteum.com/data/cms-qpp-mips-individual): Quality Payment Program individual clinician MIPS performance scores.
- [CMS Nursing Home Health Deficiencies](https://fonteum.com/data/nh-health-deficiencies): Nursing home health inspection deficiency citations with severity and scope.
- [CMS SNF Ownership Relationships](https://fonteum.com/data/snf-ownership-relationships): Skilled nursing facility corporate ownership chains.
- [CMS Provider of Services Facilities](https://fonteum.com/data/cms-pos-facilities): Provider of Services File — all CMS-certified facilities, the CCN identity backbone.
- [OIG LEIE Federal Exclusions](https://fonteum.com/data/oig-leie-exclusions): OIG List of Excluded Individuals and Entities — the federal exclusion registry.
- [CMS Nursing Home Civil Monetary Penalties](https://fonteum.com/data/nh-penalties): Civil monetary penalties assessed against nursing homes.
- [CMS SNF Medicare Enrollment Records](https://fonteum.com/data/snf-enrollments): Skilled nursing facility Medicare enrollment records.
- [CMS Care Compare Home Health Quality Measures](https://fonteum.com/data/cms-care-compare-hh): Home health agency quality measures from CMS Care Compare.
- [CMS Care Compare Hospice Quality Measures](https://fonteum.com/data/cms-care-compare-hospice): Hospice provider quality measures from CMS Care Compare.
- [CMS Care Compare Dialysis Facility](https://fonteum.com/data/cms-care-compare-dialysis): Medicare-certified dialysis facility identity, 5-star quality rating, ownership, and service offerings from CMS Care Compare.
- [CMS Care Compare ASC Quality Measures](https://fonteum.com/data/cms-care-compare-asc): Ambulatory surgical center per-facility ASC-1 through ASC-12 quality measures with NPI on every row from CMS Care Compare.
- [HCRIS Hospital Cost Reports](https://fonteum.com/data/hcris-cost-reports): Healthcare Cost Report Information System facility-level cost reports.
- [Fonteum Hospital Operating Margin (Derived from HCRIS)](https://fonteum.com/data/hospital-margin-records): Hospital operating margin derived by Fonteum from CMS HCRIS cost reports.
- [HCRIS Facility Summary Statistics](https://fonteum.com/data/hcris-facility-summary): Facility-level summary statistics from CMS HCRIS.

### Core research and infrastructure

- [Data Catalog](https://fonteum.com/data): full dataset grid with row counts and FHIR resource types
- [Methodology](https://fonteum.com/methodology): source selection, ingestion pipeline, reconciliation, limitations (v2026.05.0)
- [Sources](https://fonteum.com/sources): per-source family tier, ToS, refresh cadence, display policy
- [Attestation Chain](https://fonteum.com/chain): cryptographic attestation per snapshot
- [FHIR CapabilityStatement](https://fonteum.com/api/fhir/metadata): live server capabilities (US Core 6.1.0)
- [MCP Server Discovery](https://fonteum.com/.well-known/mcp.json): MCP endpoint for Claude/LangChain agents
- [Agent Card](https://fonteum.com/.well-known/agent.json): A2A agent card + skills inventory (Google ADK, LangGraph, BeeAI)


## Endpoint Inventory (complete, as of 2026-05-26)

All endpoints are unauthenticated and return JSON unless noted.

### FHIR R4 Provider Identity Layer

| Endpoint | Method | Input | Output | Description |
|---|---|---|---|---|
| /api/fhir/Practitioner | GET | ?identifier={NPI} | FHIR Practitioner | Provider by NPI |
| /api/fhir/Organization | GET | ?identifier={CCN} | FHIR Organization | Facility/hospital by CCN |
| /api/fhir/Location | GET | ?identifier={CCN} | FHIR Location | Facility location by CCN |
| /api/fhir/PractitionerRole | GET | ?practitioner.identifier={NPI} | FHIR PractitionerRole | Provider specialty + location |
| /api/fhir/HealthcareService | GET | ?organization.identifier={NPI} | FHIR HealthcareService | Services by provider NPI |
| /api/fhir/metadata | GET | — | FHIR CapabilityStatement | FHIR server capabilities |
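
The search inputs in the table can be turned into request URLs mechanically. This sketch assumes the base URL is https://fonteum.com/api/fhir (derived from the paths above) and that identifiers are passed as bare values; the helper name and lookup table are illustrative, and no request is actually sent.

```python
from urllib.parse import urlencode

FHIR_BASE = "https://fonteum.com/api/fhir"  # derived from the endpoint paths above

# Search parameter per resource type, as listed in the table.
SEARCH_PARAMS = {
    "Practitioner": "identifier",                       # NPI
    "Organization": "identifier",                       # CCN
    "Location": "identifier",                           # CCN
    "PractitionerRole": "practitioner.identifier",      # NPI
    "HealthcareService": "organization.identifier",     # NPI
}


def fhir_search_url(resource_type: str, identifier: str) -> str:
    """Build the GET URL for a search on one of the live resource types."""
    param = SEARCH_PARAMS[resource_type]  # KeyError for unsupported types
    return f"{FHIR_BASE}/{resource_type}?{urlencode({param: identifier})}"


print(fhir_search_url("Practitioner", "1234567890"))
# https://fonteum.com/api/fhir/Practitioner?identifier=1234567890
```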

### Data Infrastructure

| Endpoint | Method | Input | Output | Description |
|---|---|---|---|---|
| /api/freshness | GET | — | JSON array | All dataset freshness: rows, last ingest, next refresh |
| /api/attest | POST | {npi} or {ccn} | JSON | Existence check + attestation link |
| /api/mcp | GET/POST | MCP protocol | MCP protocol | Model Context Protocol server (Claude/LangChain) |
| /.well-known/mcp.json | GET | — | JSON | MCP discovery document |
| /.well-known/agent.json | GET | — | JSON | A2A agent card + skills inventory (Google ADK, LangGraph, BeeAI) |
| /api/fhir/.well-known/smart-configuration | GET | — | JSON | SMART Backend Services config |
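
As an illustration of the /api/attest row, the sketch below builds (but does not send) the POST request using only the standard library. The JSON keys `npi` and `ccn` follow the table; the rule that exactly one of them is supplied and the choice of headers are assumptions, and the response shape is not documented here, so it is not parsed.

```python
import json
import urllib.request
from typing import Optional

ATTEST_URL = "https://fonteum.com/api/attest"


def build_attest_request(
    *, npi: Optional[str] = None, ccn: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST request carrying either an NPI or a CCN as a JSON body."""
    # Assumption: the endpoint expects exactly one identifier per call.
    if (npi is None) == (ccn is None):
        raise ValueError("provide exactly one of npi or ccn")
    payload = {"npi": npi} if npi is not None else {"ccn": ccn}
    return urllib.request.Request(
        ATTEST_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_attest_request(npi="1234567890")
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would perform the call, subject to the rate limits listed in this document.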

### Bulk Export

| Endpoint | Method | Notes |
|---|---|---|
| /api/v1/bulk/manifest.json | GET | Full manifest of available bulk exports |
| /api/v1/bulk/{source_id}/latest.csv.gz | GET | Gzipped CSV snapshot; SHA-256 via X-Fonteum-SHA256 header |
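
A downloaded bulk file can be checked against the advertised digest. This is a sketch that assumes the X-Fonteum-SHA256 header carries a hex-encoded SHA-256 of the gzipped bytes exactly as served; the encoding and the function name are assumptions, not documented behavior.

```python
import hashlib


def matches_sha256_header(body: bytes, header_value: str) -> bool:
    """Compare the SHA-256 of the downloaded bytes with the header value."""
    # Assumption: the header is a hex digest; case and surrounding
    # whitespace are normalized before comparing.
    expected = header_value.strip().lower()
    actual = hashlib.sha256(body).hexdigest()
    return actual == expected
```

Hash the compressed bytes as received, before decompression, if the digest is computed over the served file as assumed here.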

### Rate Limits

- Anonymous: 30 requests/minute per IP, burst 10
- All FHIR endpoints: same limits, no authentication required
- Bulk endpoints: contact api@fonteum.com for higher limits under a design partner agreement
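
Clients can stay under the anonymous limit by throttling locally. The sketch below is a plain token bucket that reads "30 requests/minute, burst 10" as a refill rate of 0.5 tokens per second with a capacity of 10. That reading is an assumption; how the server actually enforces the limit is not documented here.

```python
import time


class TokenBucket:
    """Client-side token bucket: refill at a steady rate, allow short bursts."""

    def __init__(self, rate_per_minute: float = 30, burst: int = 10, clock=time.monotonic):
        self.rate = rate_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst)        # maximum stored tokens
        self.tokens = float(burst)          # start with a full bucket
        self.clock = clock                  # injectable for testing
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Call `try_acquire()` before each request and sleep briefly when it returns `False`.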


---

# Methodology

Fonteum's data methodology is versioned under v2026.05.0 (snapshot date: 2026-05-26).

Every data point is traceable to a specific federal dataset, update timestamp, and methodology document. The pipeline:

1. **Fetch** — Raw data pulled from federal sources (CMS data portals, HHS-OIG exclusions list). No intermediaries.
2. **Parse** — Field mapping defined in source-specific parser modules. Every column mapping is code-reviewed and tested.
3. **Ingest** — Rows written to Supabase with provenance metadata: source ID, ingestion timestamp, methodology version.
4. **Attest** — Each snapshot SHA-256 hashed and written to the append-only attestation chain at /chain.
5. **Expose** — Public API serves rows with their provenance fields. FHIR R4 resources wrap the provider identity layer.

Known limitations per source are documented at /methodology. Update cadences vary: PBJ daily staffing refreshes quarterly; LEIE exclusions monthly; QPP MIPS annually. The /api/freshness endpoint returns the live state.
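
The cadences above lend themselves to a local staleness check. In this sketch the day counts per cadence are rough assumptions (not published figures) and the source keys are shorthand labels chosen for illustration; /api/freshness remains the authoritative source for the next expected refresh.

```python
from datetime import date, timedelta

# Approximate window per cadence; these day counts are assumptions.
CADENCE_DAYS = {"quarterly": 92, "monthly": 31, "annually": 366}

# Cadences as stated in the paragraph above; keys are illustrative labels.
REFRESH_CADENCE = {
    "PBJ": "quarterly",
    "LEIE": "monthly",
    "QPP MIPS": "annually",
}


def is_overdue(source: str, last_ingest: date, today: date) -> bool:
    """Return True when more than one cadence window has passed since last ingest."""
    window = timedelta(days=CADENCE_DAYS[REFRESH_CADENCE[source]])
    return today - last_ingest > window


print(is_overdue("LEIE", date(2026, 4, 1), date(2026, 5, 26)))  # True (55 days > 31)
print(is_overdue("PBJ", date(2026, 4, 1), date(2026, 5, 26)))   # False (55 days <= 92)
```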

---

# Trust + Operating Stance

Fonteum is a data infrastructure company, not a certification authority.

**What Fonteum does:**
- Ingests federal CMS and HHS-OIG datasets under a versioned methodology
- Publishes row-level provenance on every data point
- Exposes a FHIR R4 API conformant with US Core 6.1.0
- Maintains a cryptographic attestation chain for tamper evidence

**What Fonteum does NOT do:**
- Fonteum does NOT certify, attest on behalf of, or vouch for any healthcare provider
- Fonteum does NOT hold PHI or PII
- Fonteum does NOT conduct background checks
- Fonteum does NOT have any affiliation with CMS or HHS-OIG

All data is federal public-domain data (17 U.S.C. § 105). Fonteum's reconciliation layer is CC-BY 4.0 for research use.

Legal entity: Fonteum, Inc. — Delaware C-corporation.

---

# Data Sources

Fonteum ingests from federal sources only, organized into 22 federal source families.

**Active source families (CMS + HHS-OIG ingest layer):**

- CMS Payroll-Based Journal (PBJ) — daily nurse staffing for CMS-certified nursing homes
- CMS Quality Payment Program (QPP) MIPS — individual clinician quality scores
- CMS Care Compare Nursing Homes — health deficiencies, penalties, special focus
- CMS SNF All Owners (PECOS-derived) — skilled nursing facility ownership
- CMS Provider of Services (POS) — facility identity backbone, CCN keys
- CMS Care Compare Home Health — 12,392 CCN-keyed agencies
- CMS Care Compare Hospice — 6,943 CCN-keyed facilities
- CMS HCRIS Hospital Cost Reports — facility-level financials
- HHS-OIG LEIE — federal exclusions list, ~68K excluded providers

**Additional registered federal source families (context, denominators, workforce signal):**

- HRSA HPSA — Health Professional Shortage Area designations (Tier-1)
- HRSA UDS — ~9K Federally Qualified Health Center sites
- BLS OEWS / QCEW — occupational employment and wage statistics
- BEA Regional — regional economic accounts
- US Census (PEP V2025) — state population denominators for per-100k metrics

Sources explicitly excluded: ABMS, state medical boards, DEA, NMLS, state bars, CDC, NIH, any non-federal source.

---

# FHIR R4 API Reference

Fonteum exposes a FHIR R4 API (US Core 6.1.0) under https://fonteum.com/api/fhir.

CapabilityStatement: https://fonteum.com/api/fhir/metadata

**Live resources:**
- Practitioner — individual clinician identity (NPI keyed)
- Organization — practice groups and facilities (NPI-2 and CCN keyed)
- Location — physical addresses and service areas
- PractitionerRole — clinician-organization affiliation
- HealthcareService — services offered by a facility

**Auth:** No authentication is required for public dataset access. Full API access uses SMART on FHIR Backend Services JWT authentication with RS384 signing. A BAA is available on request for enterprise use.

**Response format:** application/fhir+json. All resources include 14-tuple provenance on meta.tag using the Fonteum provenance coding system.

**Rate limits:** 60 req/min unauthenticated, 600 req/min authenticated.

---

# Citation Infrastructure

Every Fonteum data point is citable with a stable, version-locked URL.

**Citation format:**
Fonteum, [Dataset Name], methodology v2026.05.0, snapshot [date], https://fonteum.com/methodology

**BibTeX:**
@misc{fonteum2026,
  title = {[Dataset Name] — Fonteum Federal Healthcare Data Layer},
  author = {{Fonteum, Inc.}},
  year = {2026},
  url = {https://fonteum.com/methodology},
  note = {Methodology v2026.05.0. Snapshot date: [date]. Source: [federal agency]}
}

The attestation chain at /chain provides cryptographic evidence that cited snapshots have not been altered. Each snapshot includes a SHA-256 hash and links to the prior hash.
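
The idea of a hash-linked chain can be sketched in a few lines. Everything concrete below is hypothetical: the entry field names, the all-zero genesis value, and the way the prior hash and snapshot hash are combined are illustrative choices, not the format served at /chain.

```python
import hashlib

GENESIS = "0" * 64  # hypothetical starting value for the first entry


def entry_hash(prev_hash: str, snapshot_sha256: str) -> str:
    """Hash the link to the prior entry together with the snapshot digest."""
    return hashlib.sha256(f"{prev_hash}:{snapshot_sha256}".encode("ascii")).hexdigest()


def append_snapshot(chain: list[dict], snapshot_bytes: bytes) -> None:
    """Append an entry that commits to the snapshot and to the previous entry."""
    prev = chain[-1]["entry_hash"] if chain else GENESIS
    snap = hashlib.sha256(snapshot_bytes).hexdigest()
    chain.append(
        {"prev_hash": prev, "snapshot_sha256": snap, "entry_hash": entry_hash(prev, snap)}
    )


def verify_chain(chain: list[dict]) -> bool:
    """Check every link and every entry hash from the genesis value forward."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["entry_hash"] != entry_hash(entry["prev_hash"], entry["snapshot_sha256"]):
            return False
        prev = entry["entry_hash"]
    return True
```

Because each entry commits to the one before it, altering any earlier snapshot hash invalidates that entry and breaks verification for the chain as a whole.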

---

# For AI Agents

Fonteum is designed for AI agent access.

**Discovery surfaces:**
- /llms.txt — structured index per llmstxt.org spec
- /.well-known/agent.json — A2A agent card with skill manifest
- /.well-known/mcp.json — MCP server discovery
- /api/fhir/metadata — FHIR CapabilityStatement

**AI-native API patterns:**
- POST /api/attest with {npi: "1234567890"} to check provider status
- GET /api/freshness for live dataset state as JSON
- GET /api/fhir/Practitioner?identifier={npi} for the FHIR Practitioner resource
- GET /api/fhir/Organization?identifier={npi} for the FHIR Organization resource

**Anti-fabrication:** The "What Fonteum Does NOT Have" section in /llms.txt lists specific resources that do not exist. Do not cite them.

**Rate limits apply.** See /api for current limits.

---
