// Features

SLO Tracking

Two ways to spell a latency contract. Declare an SLO target through the CRUD API and Xerotier tracks compliance over time, or pass a budget header per request to hint the router on the spot.

Overview

SLO tracking ships two surfaces. They share a vocabulary but answer different questions, and they do not share state.

// path 1

SLO Management API

Project-scoped CRUD. Declares a target, a metric, a window, and a comparison. Compliance is computed on a schedule and queryable via history and summary endpoints. State persists.

Path: /v1/slos
Cardinality: Up to 50 per project
Read-back: History, summary, latest compliance

// path 2

Per-request SLO headers

Inference-path hint. Names a TTFT or TPOT budget for a single chat completion. The router prefers backends predicted to meet the budget; no record of the request is kept under /v1/slos.

Path: /v1/chat/completions
Headers: X-SLO-TTFT-Ms, X-SLO-TPOT-Ms
Behavior: Hint only, never enforced

SLO Management API

The SLO management routes are project-scoped and live at the project root. They do not include an endpoint slug segment:

https://api.xerotier.ai/proj_ABC123/v1/slos

Per-request SLO headers (see SLO Request Headers) ride on the endpoint-slug-scoped inference path https://api.xerotier.ai/proj_ABC123/ENDPOINT_SLUG/v1/chat/completions.
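
The difference between the two URL shapes can be sketched as a pair of small helpers. This is an illustrative sketch only: the function names management_url and inference_url are hypothetical, and the host and IDs are taken from the examples on this page.

```python
from urllib.parse import quote

API_ROOT = "https://api.xerotier.ai"  # host as it appears in this page's examples


def management_url(project_id: str, path: str = "slos") -> str:
    """Project-scoped management URL: no endpoint slug segment."""
    return f"{API_ROOT}/{quote(project_id, safe='')}/v1/{path.lstrip('/')}"


def inference_url(project_id: str, endpoint_slug: str,
                  path: str = "chat/completions") -> str:
    """Endpoint-scoped inference URL: the slug sits between the project and /v1."""
    return (
        f"{API_ROOT}/{quote(project_id, safe='')}/"
        f"{quote(endpoint_slug, safe='')}/v1/{path.lstrip('/')}"
    )


print(management_url("proj_ABC123"))
print(inference_url("proj_ABC123", "my-endpoint"))
```

Keeping the two builders separate makes it harder to send a management call to a slug-scoped path by accident.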

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/slos | Create an SLO definition |
| GET | /v1/slos | List SLO definitions |
| GET | /v1/slos/{slo_id} | Get an SLO definition |
| PUT | /v1/slos/{slo_id} | Update an SLO definition |
| DELETE | /v1/slos/{slo_id} | Delete an SLO definition |
| GET | /v1/slos/{slo_id}/history | Get compliance history |
| POST | /v1/slos/{slo_id}/calculate | Trigger manual calculation |
| GET | /v1/slos/summary | Get project SLO summary |
| GET | /v1/endpoints/{endpoint_id}/slos | List SLOs scoped to an endpoint (by UUID); also includes project-wide SLOs (those with no endpoint_id) |
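
The scoping rule of the last route (endpoint-scoped SLOs plus project-wide ones) can be mirrored client-side. The sketch below is not the server's implementation; the helper name visible_for_endpoint and the short placeholder IDs are hypothetical.

```python
from typing import Iterable


def visible_for_endpoint(slos: Iterable[dict], endpoint_id: str) -> list[dict]:
    """Keep SLOs scoped to endpoint_id plus project-wide ones (endpoint_id is None)."""
    return [s for s in slos if s.get("endpoint_id") in (endpoint_id, None)]


# Placeholder data: real endpoint IDs are UUIDs.
all_slos = [
    {"name": "Project availability", "endpoint_id": None},
    {"name": "Endpoint A latency", "endpoint_id": "ep-a"},
    {"name": "Endpoint B latency", "endpoint_id": "ep-b"},
]

for slo in visible_for_endpoint(all_slos, "ep-a"):
    print(slo["name"])
```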

Create SLO

POST /v1/slos

| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| name | required | string | Human-readable name (1-128 ASCII printable characters). |
| description | optional | string | Optional description (max 512 characters). |
| metric | required | string | Metric to track. See Supported Metrics. |
| target | required | number | Target threshold value (must be positive). |
| comparison | required | string | Comparison operator. See Comparison Operators. |
| window_days | required | integer | Rolling evaluation window in days (1-90). |
| endpoint_id | optional | string | Endpoint UUID to scope the SLO. Omit for project-wide SLOs. |

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/slos \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "API Availability",
    "description": "99.9% availability for production API",
    "metric": "availability",
    "target": 99.9,
    "comparison": "greater_than_or_equal",
    "window_days": 30
  }'

Response

{
  "id": "00000000-1111-0000-1111-000000000000",
  "object": "slo",
  "name": "API Availability",
  "description": "99.9% availability for production API",
  "metric": "availability",
  "target": 99.9,
  "comparison": "greater_than_or_equal",
  "window_days": 30,
  "endpoint_id": null,
  "is_active": true,
  "latest_compliance": null,
  "created_at": 1709000000,
  "updated_at": 1709000000
}

List SLOs

GET /v1/slos

Returns a paginated list of SLO definitions. Use limit and after for pagination.

curl
curl "https://api.xerotier.ai/proj_ABC123/v1/slos?limit=10" \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Get SLO

GET /v1/slos/{slo_id}

Returns the SLO definition with its latest compliance data.

curl
curl https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000 \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Response (with compliance data)

{
  "id": "00000000-1111-0000-1111-000000000000",
  "object": "slo",
  "name": "API Availability",
  "metric": "availability",
  "target": 99.9,
  "comparison": "greater_than_or_equal",
  "window_days": 30,
  "is_active": true,
  "latest_compliance": {
    "measured_value": 99.95,
    "compliance_percentage": 100.0,
    "is_met": true,
    "total_requests": 150000,
    "conforming_requests": 150000,
    "calculated_at": 1709050000
  },
  "created_at": 1709000000,
  "updated_at": 1709000000
}
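
Because latest_compliance is null until the first calculation completes (as in the Create SLO response), client code should handle both shapes. A minimal sketch, assuming the field names shown in the responses above; describe_compliance is a hypothetical helper name.

```python
def describe_compliance(slo: dict) -> str:
    """Summarize an SLO object, tolerating a null latest_compliance."""
    latest = slo.get("latest_compliance")
    if latest is None:
        return f"{slo['name']}: not evaluated yet"
    verdict = "met" if latest["is_met"] else "not met"
    return (
        f"{slo['name']}: {verdict} (measured {latest['measured_value']}, "
        f"{latest['conforming_requests']}/{latest['total_requests']} conforming)"
    )


new_slo = {"name": "API Availability", "latest_compliance": None}
evaluated_slo = {
    "name": "API Availability",
    "latest_compliance": {
        "measured_value": 99.95,
        "is_met": True,
        "total_requests": 150000,
        "conforming_requests": 150000,
    },
}

print(describe_compliance(new_slo))
print(describe_compliance(evaluated_slo))
```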

Update SLO

PUT /v1/slos/{slo_id}

All fields are optional. Only provided fields are updated.

| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| name | optional | string | Updated name. |
| description | optional | string | Updated description. Omit the field to leave the existing value unchanged; explicit null is currently treated the same as omitting the field. |
| target | optional | number | Updated target value. |
| comparison | optional | string | Updated comparison operator. |
| window_days | optional | integer | Updated window (1-90 days). |
| is_active | optional | boolean | Enable or disable the SLO. |
| endpoint_id | optional | string | Updated endpoint scope as an endpoint UUID. Omit the field to leave the existing value unchanged; explicit null is currently treated the same as omitting the field. |

curl
curl -X PUT https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000 \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "target": 99.95,
    "window_days": 7
  }'
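
Since an explicit null is currently treated the same as an omitted field, one way to keep update calls unambiguous is to drop unset fields before sending. The sketch below assumes the field list from the table above; build_update_payload is a hypothetical helper, not part of any Xerotier SDK.

```python
UPDATABLE_FIELDS = {
    "name", "description", "target", "comparison",
    "window_days", "is_active", "endpoint_id",
}


def build_update_payload(**fields) -> dict:
    """Return a PUT body containing only the fields that were actually set."""
    unknown = set(fields) - UPDATABLE_FIELDS
    if unknown:
        raise ValueError(f"not updatable: {sorted(unknown)}")
    # None means "leave unchanged"; False (for is_active) is a real value and is kept.
    return {key: value for key, value in fields.items() if value is not None}


print(build_update_payload(target=99.95, window_days=7, description=None))
print(build_update_payload(is_active=False))
```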

Delete SLO

DELETE /v1/slos/{slo_id}

Permanently deletes the SLO definition and all associated history.

curl
curl -X DELETE https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000 \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Get Compliance History

GET /v1/slos/{slo_id}/history

Returns a paginated list of compliance calculation entries for the SLO.

| Query Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| limit | optional | integer | Maximum number of entries to return. |
| after | optional | string | Pagination cursor; pass the previous page's last_id. |
| start | optional | string | Lower bound for period_start; Unix seconds or ISO 8601 timestamp. |
| end | optional | string | Upper bound for period_end; Unix seconds or ISO 8601 timestamp. |

curl
curl "https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000/history?limit=10" \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Response

{
  "object": "list",
  "data": [
    {
      "id": "00000000-2222-0000-2222-000000000000",
      "object": "slo.history",
      "slo_id": "00000000-1111-0000-1111-000000000000",
      "period_start": 1708900000,
      "period_end": 1709000000,
      "measured_value": 99.95,
      "total_requests": 150000,
      "conforming_requests": 150000,
      "compliance_percentage": 100.0,
      "is_met": true,
      "calculated_at": 1709000100
    }
  ],
  "first_id": "00000000-2222-0000-2222-000000000000",
  "last_id": "00000000-2222-0000-2222-000000000000",
  "has_more": false
}
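
Cursor pagination over the history list can be sketched as a generator that passes each page's last_id as the next after value until has_more is false. The page fetcher is injected so the loop can be exercised without network access; iter_history and fetch_page are hypothetical names, and the short entry IDs are placeholders.

```python
from typing import Callable, Iterator, Optional


def iter_history(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every history entry, following last_id while has_more is true."""
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        yield from page.get("data", [])
        # Stop on the last page, or if the cursor is missing (avoids looping forever).
        if not page.get("has_more") or not page.get("last_id"):
            break
        after = page["last_id"]


# Stand-in for two pages of API responses, keyed by the `after` cursor.
fake_pages = {
    None: {"data": [{"id": "h1"}, {"id": "h2"}], "last_id": "h2", "has_more": True},
    "h2": {"data": [{"id": "h3"}], "last_id": "h3", "has_more": False},
}

entry_ids = [entry["id"] for entry in iter_history(lambda after: fake_pages[after])]
print(entry_ids)
```

In real use, fetch_page would wrap a GET to the history route and pass after (when not None) and limit as query parameters.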

Trigger Calculation

POST /v1/slos/{slo_id}/calculate

Triggers an on-demand compliance calculation for the SLO and returns the result.

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000/calculate \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Project Summary

GET /v1/slos/summary

Returns an overview of all active SLOs for the project with their current compliance status.

curl
curl https://api.xerotier.ai/proj_ABC123/v1/slos/summary \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Response

{
  "object": "slo.summary",
  "total_active": 3,
  "total_met": 2,
  "total_not_met": 1,
  "total_unevaluated": 0,
  "slos": [
    {
      "id": "00000000-1111-0000-1111-000000000000",
      "name": "API Availability",
      "metric": "availability",
      "target": 99.9,
      "status": "met",
      "compliance_percentage": 99.98,
      "measured_value": 99.98,
      "last_calculated_at": 1709050000
    },
    {
      "id": "00000000-1111-0000-1111-000000000001",
      "name": "P95 Latency",
      "metric": "total_latency_ms",
      "target": 500,
      "status": "not_met",
      "compliance_percentage": 94.2,
      "measured_value": 520.0,
      "last_calculated_at": 1709050000
    }
  ]
}

Limits

  • Max 50 SLO definitions per project.
  • Window range: 1-90 days (window_days is a required integer in that range).
  • Name length: 1-128 ASCII printable characters.
  • Description length: Max 512 characters.
  • Metric values: 10 accepted identifiers (see Supported Metrics).
  • Comparison values: 4 accepted operators (see Comparison Operators).
  • Header value range: X-SLO-TTFT-Ms and X-SLO-TPOT-Ms accept non-negative integer milliseconds.
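
These limits can be checked client-side before calling POST /v1/slos. The following is a minimal sketch, not the server's validation logic: validate_slo is a hypothetical helper, and it assumes that "ASCII printable" means code points 0x20 through 0x7E (including the space character, which the example names on this page use).

```python
METRICS = {
    "ttft_ms", "tpot_ms", "total_latency_ms", "availability", "error_rate",
    "throughput_rps", "exec_availability", "exec_duration_ms",
    "exec_error_rate", "exec_approval_latency_ms",
}
COMPARISONS = {
    "less_than", "less_than_or_equal", "greater_than", "greater_than_or_equal",
}
# Assumption: printable ASCII is 0x20-0x7E, space included.
PRINTABLE_ASCII = {chr(code) for code in range(0x20, 0x7F)}


def validate_slo(payload: dict) -> list[str]:
    """Return a list of problems with a create payload; an empty list means it looks valid."""
    errors = []
    name = payload.get("name")
    if (not isinstance(name, str) or not 1 <= len(name) <= 128
            or not set(name) <= PRINTABLE_ASCII):
        errors.append("name must be 1-128 printable ASCII characters")
    description = payload.get("description")
    if description is not None and (
            not isinstance(description, str) or len(description) > 512):
        errors.append("description must be a string of at most 512 characters")
    metric = payload.get("metric")
    if not isinstance(metric, str) or metric not in METRICS:
        errors.append("metric must be one of the 10 supported identifiers")
    comparison = payload.get("comparison")
    if not isinstance(comparison, str) or comparison not in COMPARISONS:
        errors.append("comparison must be one of the 4 supported operators")
    target = payload.get("target")
    if isinstance(target, bool) or not isinstance(target, (int, float)) or target <= 0:
        errors.append("target must be a positive number")
    window = payload.get("window_days")
    if isinstance(window, bool) or not isinstance(window, int) or not 1 <= window <= 90:
        errors.append("window_days must be an integer from 1 to 90")
    return errors


candidate = {
    "name": "API Availability",
    "metric": "availability",
    "target": 99.9,
    "comparison": "greater_than_or_equal",
    "window_days": 30,
}
print(validate_slo(candidate))
print(validate_slo({**candidate, "window_days": 91}))
```

The 50-definitions-per-project cap is not checked here because it depends on server-side state.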

SLO Request Headers

Headers hint; they do not enforce. SLO request headers ride only on POST /v1/chat/completions and are ignored everywhere else, including /v1/responses. A missed target does not fail the request, raise an alert, or write to /v1/slos history. If you need recorded compliance, use the management API.

Include SLO targets as request headers on POST /v1/chat/completions requests. When present, the router prefers backends predicted to meet the budget. When absent, the router uses its default routing strategy without SLO-aware adjustments.

| Header | Type | Description |
| --- | --- | --- |
| X-SLO-TTFT-Ms | integer | Target time-to-first-token in milliseconds (e.g., 500 for 500ms TTFT target). |
| X-SLO-TPOT-Ms | integer | Target time-per-output-token in milliseconds (e.g., 30 for 30ms per token). |

You can send one or both headers. If no backend is predicted to meet the target, the router still selects the best available backend; the request continues as it would without the header.
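
A small helper can build the header pair and reject values outside the documented range (non-negative integer milliseconds). This is a sketch; slo_headers is a hypothetical name, and only the two header names come from this page.

```python
from typing import Optional


def slo_headers(ttft_ms: Optional[int] = None,
                tpot_ms: Optional[int] = None) -> dict[str, str]:
    """Build per-request SLO hint headers; omitted budgets produce no header."""
    headers: dict[str, str] = {}
    for name, value in (("X-SLO-TTFT-Ms", ttft_ms), ("X-SLO-TPOT-Ms", tpot_ms)):
        if value is None:
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
        headers[name] = str(value)
    return headers


print(slo_headers(ttft_ms=500, tpot_ms=30))
print(slo_headers(ttft_ms=500))
```

The result can be merged into the Authorization and Content-Type headers of a chat completion request.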

Supported Metrics

Ten identifiers across two families. Inference metrics measure the chat-completions pipeline. Execution metrics measure the XEM tool-execution lease. Identifiers are case-sensitive and must match one of the values below exactly.

// inference Request-path metrics

| Metric | Description | Unit |
| --- | --- | --- |
| ttft_ms | Time to first token | milliseconds |
| tpot_ms | Time per output token | milliseconds |
| total_latency_ms | Total end-to-end request latency | milliseconds |
| availability | Percentage of successful requests | percent (0-100) |
| error_rate | Percentage of failed requests | percent (0-100) |
| throughput_rps | Average requests per second | requests/second |

// xem execution Tool-execution metrics

| Metric | Description | Unit |
| --- | --- | --- |
| exec_availability | XEM execution agent uptime (lease-renewal based) | percent (0-100) |
| exec_duration_ms | Per-call XEM tool execution duration | milliseconds |
| exec_error_rate | Percentage of XEM executions that failed, out of total executions | percent (0-100) |
| exec_approval_latency_ms | Time from exec.approval_requested to the terminal exec.approved or exec.rejected transition | milliseconds |

For percentage metrics (availability, error_rate, exec_availability, exec_error_rate), both the target and the returned compliance_percentage are expressed on a 0-100 scale (e.g., 99.9 for 99.9%).

Comparison Operators

| Comparison | Description | Typical Use |
| --- | --- | --- |
| less_than | Metric must be less than target | Latency, error rate |
| less_than_or_equal | Metric must be at most target | Latency, error rate |
| greater_than | Metric must exceed target | Throughput |
| greater_than_or_equal | Metric must be at least target | Availability |
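
The four operators map directly onto ordinary numeric comparisons. A sketch of that mapping, useful for reproducing a verdict locally; satisfies is a hypothetical helper name, not an API call.

```python
import operator

COMPARATORS = {
    "less_than": operator.lt,
    "less_than_or_equal": operator.le,
    "greater_than": operator.gt,
    "greater_than_or_equal": operator.ge,
}


def satisfies(measured: float, comparison: str, target: float) -> bool:
    """Return True when `measured <comparison> target` holds."""
    try:
        compare = COMPARATORS[comparison]
    except KeyError:
        raise ValueError(f"unknown comparison: {comparison!r}") from None
    return compare(measured, target)


# Availability of 99.95 against a 99.9 floor.
print(satisfies(99.95, "greater_than_or_equal", 99.9))
# Latency of 520 ms against a 500 ms ceiling.
print(satisfies(520.0, "less_than_or_equal", 500))
```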

Compliance Calculation

SLO compliance is measured as the percentage of requests within the evaluation window that meet the defined target:

compliance_percentage = (conforming_requests / total_requests) * 100

A request is "conforming" if its measured metric value satisfies the comparison operator against the target. The SLO itself is met when the value measured over the window satisfies the same comparison: for example, with metric=availability, target=99.9, comparison=greater_than_or_equal, the SLO is met when the measured availability is 99.9% or higher.

For percentage metrics such as availability and error_rate, the measured value is reduced from per-request boolean samples (success vs. error) over the window before the comparison is applied, and compliance_percentage is reported on a 0-100 scale.
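
The formula above can be reproduced from the total_requests and conforming_requests fields of a history entry. This sketch assumes that a window with no requests has no defined percentage and returns None in that case; how the service reports an empty window is not stated here.

```python
from typing import Optional


def compliance_percentage(conforming_requests: int,
                          total_requests: int) -> Optional[float]:
    """compliance_percentage = (conforming_requests / total_requests) * 100."""
    if total_requests < 0 or not 0 <= conforming_requests <= total_requests:
        raise ValueError("need 0 <= conforming_requests <= total_requests")
    if total_requests == 0:
        return None  # assumption: an empty window has no defined percentage
    return conforming_requests / total_requests * 100


print(compliance_percentage(150000, 150000))  # 100.0, as in the history example
print(compliance_percentage(3, 4))            # 75.0
```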

Compliance calculations run automatically on a periodic schedule. Use the POST /v1/slos/{slo_id}/calculate endpoint to trigger an on-demand calculation.

Status values

Each per-SLO entry in the GET /v1/slos/summary response reports a status string drawn from three values:

  • met: Latest compliance evaluation passed the target.
  • not_met: Latest compliance evaluation fell short of the target.
  • unevaluated: No completed evaluation yet; window is still warming up or the scheduler has not run.
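
A summary response can be reduced to per-status counts and a list of SLOs worth a closer look. This is a sketch over the response shape shown under Project Summary; status_counts and needs_attention are hypothetical helpers, and treating unevaluated the same as not_met is a design choice of this example rather than documented behavior.

```python
from collections import Counter


def status_counts(summary: dict) -> Counter:
    """Count per-SLO entries by their status string."""
    return Counter(entry["status"] for entry in summary.get("slos", []))


def needs_attention(summary: dict) -> list[str]:
    """Names of SLOs whose status is anything other than met."""
    return [e["name"] for e in summary.get("slos", []) if e["status"] != "met"]


sample_summary = {
    "object": "slo.summary",
    "slos": [
        {"name": "API Availability", "status": "met"},
        {"name": "P95 Latency", "status": "not_met"},
    ],
}

print(status_counts(sample_summary))
print(needs_attention(sample_summary))
```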

Examples

Create an Availability SLO

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/slos \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production Availability",
    "metric": "availability",
    "target": 99.9,
    "comparison": "greater_than_or_equal",
    "window_days": 30
  }'

Create a Latency SLO

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/slos \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "P95 TTFT Target",
    "metric": "ttft_ms",
    "target": 500,
    "comparison": "less_than_or_equal",
    "window_days": 7
  }'

Per-Request SLO Headers

curl
curl https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -H "X-SLO-TTFT-Ms: 500" \
  -H "X-SLO-TPOT-Ms: 30" \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Python

Python
import requests

# SLO management API is project-scoped (no endpoint slug)
mgmt_base = "https://api.xerotier.ai/proj_ABC123/v1"
# Inference path includes the endpoint slug
inference_base = "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1"
headers = {"Authorization": "Bearer xero_myproject_your_api_key"}

# Create an SLO
slo = requests.post(f"{mgmt_base}/slos", headers=headers, json={
    "name": "Chat Latency",
    "metric": "total_latency_ms",
    "target": 1000,
    "comparison": "less_than",
    "window_days": 7
}).json()
print(f"Created SLO: {slo['id']}")

# Check compliance summary
summary = requests.get(f"{mgmt_base}/slos/summary", headers=headers).json()
for entry in summary["slos"]:
    print(f"{entry['name']}: {entry['status']} ({entry['compliance_percentage']}%)")

# Send a request with SLO headers
response = requests.post(
    f"{inference_base}/chat/completions",
    headers={
        **headers,
        "Content-Type": "application/json",
        "X-SLO-TTFT-Ms": "500",
        "X-SLO-TPOT-Ms": "30",
    },
    json={
        "model": "llama-3.1-8b",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)

Node.js

Node.js
// Uses top-level await and the built-in fetch: run as an ES module on Node.js 18 or later.

// SLO management API is project-scoped (no endpoint slug)
const mgmtBase = "https://api.xerotier.ai/proj_ABC123/v1";
// Inference path includes the endpoint slug
const inferenceBase = "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1";
const headers = {
  "Authorization": "Bearer xero_myproject_your_api_key",
  "Content-Type": "application/json"
};

// Create an SLO
const sloResponse = await fetch(`${mgmtBase}/slos`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    name: "Chat Latency",
    metric: "total_latency_ms",
    target: 1000,
    comparison: "less_than",
    window_days: 7
  })
});
const slo = await sloResponse.json();
console.log(`Created SLO: ${slo.id}`);

// Check compliance summary
const summaryResponse = await fetch(`${mgmtBase}/slos/summary`, { headers });
const summary = await summaryResponse.json();
for (const entry of summary.slos) {
  console.log(`${entry.name}: ${entry.status} (${entry.compliance_percentage}%)`);
}

// Send a request with SLO headers
const inferenceResponse = await fetch(`${inferenceBase}/chat/completions`, {
  method: "POST",
  headers: {
    ...headers,
    "X-SLO-TTFT-Ms": "500",
    "X-SLO-TPOT-Ms": "30"
  },
  body: JSON.stringify({
    model: "llama-3.1-8b",
    messages: [{ role: "user", content: "Hello!" }]
  })
});

See Also

  • The Router: the routing engine that consumes SLO targets and header hints when scoring backends.
  • Service Tiers: the tier you select shapes the latency baseline an SLO target sits against.
  • Usage Tracking & Billing: the consumption surface where you confirm an SLO target was met across the window you care about.
  • Chat Completions: the inference path that parses X-SLO-TTFT-Ms and X-SLO-TPOT-Ms.
  • Error Handling: error responses returned by the SLO management API when validation fails.