
Due 2/18: Document API Success Rate Over Time (OIG Q4b) and the API error rate threshold methodology (OIG Q5) #5710

@SueValente

Description

User Story

In order to respond to OIG follow-up question #4b, the Data.gov team wants to compile and document the API success rate over time using available monitoring data, and publish this on resources.data.gov.

In order to respond to OIG follow-up question #5, the Data.gov team wants to document how it determines what constitutes a reasonable API error rate and compare with industry standards.

In order to respond to OIG follow-up question #6, the Data.gov team wants to identify and document federal guidance or industry standards for acceptable API error rates or success rates, and publish findings on resources.data.gov.

Background

Source: OIG follow-up question #4b, received after 2/6/26 GSA response.
OIG wants to understand how Data.gov can demonstrate API reliability over time beyond a single snapshot.
Potential approaches:

  • Export historical API metrics from monitoring tools
  • Generate time-series charts/reports from logging infrastructure
  • Compile uptime monitoring history with error rate calculations (a minimal sketch follows this list)
  • Automate recurring report generation
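
A minimal sketch of one such approach, assuming the monitoring tool can export a CSV of requests with `timestamp` and `http_status` columns (both column names and the file name are placeholders for whatever the real export provides):

```python
from collections import defaultdict
import csv
from datetime import datetime

def monthly_success_rate(log_csv_path: str) -> dict:
    """Aggregate an exported request log (timestamp, http_status) into a
    per-month success rate, counting any non-5xx response as a success."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    with open(log_csv_path, newline="") as f:
        for row in csv.DictReader(f):
            month = datetime.fromisoformat(row["timestamp"]).strftime("%Y-%m")
            totals[month] += 1
            if int(row["http_status"]) < 500:
                successes[month] += 1
    return {m: successes[m] / totals[m] for m in sorted(totals)}

if __name__ == "__main__":
    # "api_requests_export.csv" is a placeholder file name.
    for month, rate in monthly_success_rate("api_requests_export.csv").items():
        print(f"{month}: {rate:.4%}")
```

Running the same aggregation on a schedule would also cover the automated reporting bullet above.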

Source: OIG follow-up question #5, received after 2/6/26 GSA response.
This question asks Data.gov to articulate its standards for API reliability. If no formal threshold exists today, the response should acknowledge that and describe what the team uses in practice (e.g., monitoring alerts, SLA targets, comparison to industry norms).
Considerations:

  • Does the Data.gov ATO or SSP define acceptable error rate thresholds?
  • Are there SLA commitments in the DevOps support contract?
  • Does cloud.gov provide baseline reliability metrics?
  • What industry standards exist (e.g., 99.9% availability targets, illustrated in the worked example after this list)?
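
If a formal threshold ends up being defined relative to an availability or success-rate target, the arithmetic is straightforward. A worked example, where the 5,000,000 requests/month figure is purely illustrative:

```python
def error_budget(success_target: float, monthly_requests: int) -> dict:
    """Translate a success-rate target (e.g. 0.999 for 99.9%) into the
    maximum number of failed requests tolerated per month."""
    allowed_failure_rate = 1.0 - success_target
    return {
        "target": f"{success_target:.1%}",
        "allowed_failure_rate": f"{allowed_failure_rate:.1%}",
        "allowed_failed_requests": int(monthly_requests * allowed_failure_rate),
    }

# A 99.9% target over an assumed 5,000,000 requests/month permits ~5,000 failed requests.
print(error_budget(0.999, 5_000_000))
```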

Source: OIG follow-up question #6, received after 2/6/26 GSA response.
OIG correctly identified that the Uptime.com reference in the 2/6/26 response addresses system uptime (is the server responding?) rather than API error rates (are API responses correct and complete?). These are distinct metrics.
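
To make that distinction concrete, a hedged sketch of the two kinds of checks (the URLs and the "results" key in the payload are assumptions for illustration, not Data.gov's actual endpoints or response shape):

```python
import json
import urllib.request

def uptime_probe(health_url: str) -> bool:
    """Uptime-style check: is the server responding at all?"""
    try:
        with urllib.request.urlopen(health_url, timeout=10) as resp:
            return resp.status == 200
    except OSError:
        return False

def api_response_is_correct(api_url: str) -> bool:
    """Error-rate-style check: is the API response correct and complete?
    A 200 that returns an empty or malformed payload still counts as a failure."""
    try:
        with urllib.request.urlopen(api_url, timeout=10) as resp:
            if resp.status != 200:
                return False
            payload = json.load(resp)
            return bool(payload.get("results"))  # assumed response shape
    except (OSError, ValueError):
        return False
```

An API can pass the first check continuously while still failing the second for a meaningful share of real requests, which is why the two metrics need separate evidence.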
Areas to research:

  • NIST SP 800-53 (SA-4, SI-2, SI-5 controls related to system monitoring)
  • FedRAMP continuous monitoring requirements
  • OMB Circular A-130 or M-23-22 guidance
  • Industry standards: Google SRE handbook targets, AWS/Azure SLA benchmarks
  • cloud.gov platform SLA commitments
  • GSA IT standards or TTS operational guidelines

OIG requirement: explain the methodology and provide documentation.

Give to Sue by 2/18
