SLO Tracking - Xerotier

Overview

SLO tracking ships two surfaces. They share a vocabulary but answer different questions, and they do not share state.

// path 1

SLO Management API

Project-scoped CRUD. Declares a target, a metric, a window, and a comparison. Compliance is computed on a schedule and queryable via history and summary endpoints. State persists.

Path: /v1/slos
Cardinality: Up to 50 per project
Read-back: History, summary, latest compliance


// path 2

Per-request SLO headers

Inference-path hint. Names a TTFT or TPOT budget for a single chat completion. The router prefers backends predicted to meet the budget; no record of the request is kept under /v1/slos.

Path: /v1/chat/completions
Headers: X-SLO-TTFT-Ms, X-SLO-TPOT-Ms
Behavior: Hint only, never enforced


SLO Management API

The SLO management routes are project-scoped and live at the project root. They do not include an endpoint slug segment:

https://api.xerotier.ai/proj_ABC123/v1/slos

Per-request SLO headers (see SLO Request Headers) ride on the endpoint-slug-scoped inference path https://api.xerotier.ai/proj_ABC123/ENDPOINT_SLUG/v1/chat/completions.
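
The two URL shapes can be captured in a pair of small helpers. This is a minimal sketch: the function names are hypothetical, and proj_ABC123 and my-endpoint are the placeholder values used in the examples on this page.

```python
BASE_URL = "https://api.xerotier.ai"


def management_url(project_id: str, path: str = "slos") -> str:
    """Project-scoped SLO management URL (no endpoint slug segment)."""
    return f"{BASE_URL}/{project_id}/v1/{path.lstrip('/')}"


def inference_url(project_id: str, endpoint_slug: str) -> str:
    """Endpoint-slug-scoped chat completions URL, where SLO headers apply."""
    return f"{BASE_URL}/{project_id}/{endpoint_slug}/v1/chat/completions"


print(management_url("proj_ABC123"))
print(inference_url("proj_ABC123", "my-endpoint"))
```

Keeping the two builders separate makes it harder to send a management call to an endpoint-scoped path by accident.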

Endpoints

Method	Path	Description
POST	/v1/slos	Create an SLO definition
GET	/v1/slos	List SLO definitions
GET	/v1/slos/{id}	Get an SLO definition
PUT	/v1/slos/{id}	Update an SLO definition
DELETE	/v1/slos/{id}	Delete an SLO definition
GET	/v1/slos/{id}/history	Get compliance history
POST	/v1/slos/{id}/calculate	Trigger manual calculation
GET	/v1/slos/summary	Get project SLO summary
GET	/v1/endpoints/{endpoint_id}/slos	List SLOs scoped to an endpoint (by UUID); also includes project-wide SLOs (those with no `endpoint_id`)

Create SLO

POST /v1/slos

Parameter	Type	Description
name (required)	string	Human-readable name (1-128 ASCII printable characters).
description (optional)	string	Free-text description (max 512 characters).
metric (required)	string	Metric to track. See Supported Metrics.
target (required)	number	Target threshold value (must be positive).
comparison (required)	string	Comparison operator. See Comparison Operators.
window_days (required)	integer	Rolling evaluation window in days (1-90).
endpoint_id (optional)	string	Endpoint UUID to scope the SLO. Omit for project-wide SLOs.

curl

curl -X POST https://api.xerotier.ai/proj_ABC123/v1/slos \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "API Availability",
    "description": "99.9% availability for production API",
    "metric": "availability",
    "target": 99.9,
    "comparison": "greater_than_or_equal",
    "window_days": 30
  }'
                

Response

{
  "id": "00000000-1111-0000-1111-000000000000",
  "object": "slo",
  "name": "API Availability",
  "description": "99.9% availability for production API",
  "metric": "availability",
  "target": 99.9,
  "comparison": "greater_than_or_equal",
  "window_days": 30,
  "endpoint_id": null,
  "is_active": true,
  "latest_compliance": null,
  "created_at": 1709000000,
  "updated_at": 1709000000
}
                    

List SLOs

GET /v1/slos

Returns a paginated list of SLO definitions. Use limit and after for pagination.

curl

curl "https://api.xerotier.ai/proj_ABC123/v1/slos?limit=10" \
  -H "Authorization: Bearer xero_myproject_your_api_key"
                

Get SLO

GET /v1/slos/{slo_id}

Returns the SLO definition with its latest compliance data.

curl

curl https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000 \
  -H "Authorization: Bearer xero_myproject_your_api_key"
                

Response (with compliance data)

{
  "id": "00000000-1111-0000-1111-000000000000",
  "object": "slo",
  "name": "API Availability",
  "metric": "availability",
  "target": 99.9,
  "comparison": "greater_than_or_equal",
  "window_days": 30,
  "is_active": true,
  "latest_compliance": {
    "measured_value": 99.95,
    "compliance_percentage": 100.0,
    "is_met": true,
    "total_requests": 150000,
    "conforming_requests": 150000,
    "calculated_at": 1709050000
  },
  "created_at": 1709000000,
  "updated_at": 1709000000
}
                    

Update SLO

PUT /v1/slos/{slo_id}

All fields are optional. Only provided fields are updated.

Parameter	Type	Description
name (optional)	string	Updated name.
description (optional)	string	Updated description. Omit the field to leave the existing value unchanged; explicit `null` is currently treated the same as omitting the field.
target (optional)	number	Updated target value.
comparison (optional)	string	Updated comparison operator.
window_days (optional)	integer	Updated window (1-90 days).
is_active (optional)	boolean	Enable or disable the SLO.
endpoint_id (optional)	string	Updated endpoint scope as an endpoint UUID. Omit the field to leave the existing value unchanged; explicit `null` is currently treated the same as omitting the field.
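
Because an explicit `null` is treated like an omitted field, a client can drop unset values before sending the PUT body. The sketch below shows one way to do that; `build_update_payload` is a hypothetical helper, and it accepts only the fields listed in the table above.

```python
UPDATABLE_FIELDS = {
    "name", "description", "target", "comparison",
    "window_days", "is_active", "endpoint_id",
}


def build_update_payload(**fields):
    """Return only the fields that should be sent in a PUT body.

    Values of None are dropped, since the API is documented to treat an
    explicit null the same as an omitted field.
    """
    unknown = set(fields) - UPDATABLE_FIELDS
    if unknown:
        raise ValueError(f"unsupported update fields: {sorted(unknown)}")
    return {key: value for key, value in fields.items() if value is not None}


payload = build_update_payload(target=99.95, window_days=7, description=None)
print(payload)  # {'target': 99.95, 'window_days': 7}
```

Note that `is_active=False` survives the filter, because only `None` is removed.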

curl

curl -X PUT https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000 \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "target": 99.95,
    "window_days": 7
  }'
                

Delete SLO

DELETE /v1/slos/{slo_id}

Permanently deletes the SLO definition and all associated history.

curl

curl -X DELETE https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000 \
  -H "Authorization: Bearer xero_myproject_your_api_key"
                

Get Compliance History

GET /v1/slos/{slo_id}/history

Returns a paginated list of compliance calculation entries for the SLO.

Query Parameter	Type	Description
limit (optional)	integer	Maximum number of entries to return.
after (optional)	string	Pagination cursor; pass the previous page's `last_id`.
start (optional)	string	Lower bound for `period_start`; Unix seconds or ISO 8601 timestamp.
end (optional)	string	Upper bound for `period_end`; Unix seconds or ISO 8601 timestamp.
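
The query parameters above can be assembled from native Python values. The following is a sketch under stated assumptions: `history_params` is a hypothetical helper, it always sends Unix seconds rather than ISO 8601, and it treats naive datetimes as UTC.

```python
from datetime import datetime, timezone


def history_params(limit=None, after=None, start=None, end=None):
    """Build the query-string dict for GET /v1/slos/{slo_id}/history.

    start and end are datetime objects and are sent as Unix seconds.
    Naive datetimes are assumed to be UTC (an assumption of this sketch).
    """
    def to_unix(value):
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
        return str(int(value.timestamp()))

    params = {}
    if limit is not None:
        params["limit"] = str(limit)
    if after is not None:
        params["after"] = after
    if start is not None:
        params["start"] = to_unix(start)
    if end is not None:
        params["end"] = to_unix(end)
    return params


window_start = datetime.fromtimestamp(1708900000, tz=timezone.utc)
print(history_params(limit=10, start=window_start))
```

The resulting dict can be passed as `params=` to `requests.get`, which handles URL encoding.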

curl

curl "https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000/history?limit=10" \
  -H "Authorization: Bearer xero_myproject_your_api_key"
                

Response

{
  "object": "list",
  "data": [
    {
      "id": "00000000-2222-0000-2222-000000000000",
      "object": "slo.history",
      "slo_id": "00000000-1111-0000-1111-000000000000",
      "period_start": 1708900000,
      "period_end": 1709000000,
      "measured_value": 99.95,
      "total_requests": 150000,
      "conforming_requests": 150000,
      "compliance_percentage": 100.0,
      "is_met": true,
      "calculated_at": 1709000100
    }
  ],
  "first_id": "00000000-2222-0000-2222-000000000000",
  "last_id": "00000000-2222-0000-2222-000000000000",
  "has_more": false
}
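
The `data`, `last_id`, and `has_more` fields in the response above are enough to walk every page. The generator below is a minimal sketch of that loop; `iter_history` and the `fetch_page` callback are hypothetical names, and the HTTP call is left to the caller so the cursor logic stays testable on its own.

```python
from typing import Callable, Iterator, Optional


def iter_history(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every history entry, following the `after` cursor.

    fetch_page(after) must return one parsed list response. It receives
    None for the first page and the previous page's last_id afterwards.
    """
    after = None
    while True:
        page = fetch_page(after)
        yield from page["data"]
        if not page.get("has_more"):
            return
        after = page["last_id"]


# Demo with two canned pages standing in for real HTTP responses.
pages = {
    None: {"data": [{"id": "a"}, {"id": "b"}], "last_id": "b", "has_more": True},
    "b": {"data": [{"id": "c"}], "last_id": "c", "has_more": False},
}
print([entry["id"] for entry in iter_history(pages.__getitem__)])  # ['a', 'b', 'c']
```

In real use, `fetch_page` would issue the GET request with `after` set in the query string and return the decoded JSON body.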
                    

Trigger Calculation

POST /v1/slos/{slo_id}/calculate

Triggers an on-demand compliance calculation for the SLO and returns the result.

curl

curl -X POST https://api.xerotier.ai/proj_ABC123/v1/slos/00000000-1111-0000-1111-000000000000/calculate \
  -H "Authorization: Bearer xero_myproject_your_api_key"
                

Project Summary

GET /v1/slos/summary

Returns an overview of all active SLOs for the project with their current compliance status.

curl

curl https://api.xerotier.ai/proj_ABC123/v1/slos/summary \
  -H "Authorization: Bearer xero_myproject_your_api_key"
                

Response

{
  "object": "slo.summary",
  "total_active": 2,
  "total_met": 1,
  "total_not_met": 1,
  "total_unevaluated": 0,
  "slos": [
    {
      "id": "00000000-1111-0000-1111-000000000000",
      "name": "API Availability",
      "metric": "availability",
      "target": 99.9,
      "status": "met",
      "compliance_percentage": 99.98,
      "measured_value": 99.98,
      "last_calculated_at": 1709050000
    },
    {
      "id": "00000000-1111-0000-1111-000000000001",
      "name": "P95 Latency",
      "metric": "total_latency_ms",
      "target": 500,
      "status": "not_met",
      "compliance_percentage": 94.2,
      "measured_value": 520.0,
      "last_calculated_at": 1709050000
    }
  ]
}
                    

Limits

Max 50 SLO definitions per project.
Window range: 1-90 days (window_days is a required integer in that range).
Name length: 1-128 ASCII printable characters.
Description length: Max 512 characters.
Metric values: 10 accepted identifiers (see Supported Metrics).
Comparison values: 4 accepted operators (see Comparison Operators).
Header value range: X-SLO-TTFT-Ms and X-SLO-TPOT-Ms accept non-negative integer milliseconds.
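
Most of these limits can be checked on the client before a create call. The sketch below is illustrative only: `validate_slo` is a hypothetical helper, "ASCII printable" is assumed to mean code points 0x20 through 0x7E, and the 50-per-project cap is not checked because it can only be enforced server-side.

```python
METRICS = {
    "ttft_ms", "tpot_ms", "total_latency_ms", "availability", "error_rate",
    "throughput_rps", "exec_availability", "exec_duration_ms",
    "exec_error_rate", "exec_approval_latency_ms",
}
COMPARISONS = {
    "less_than", "less_than_or_equal", "greater_than", "greater_than_or_equal",
}


def _is_printable_ascii(text: str) -> bool:
    # Assumption: "ASCII printable" means 0x20 (space) through 0x7E (~).
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


def validate_slo(definition: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    name = definition.get("name")
    if not (isinstance(name, str) and 1 <= len(name) <= 128
            and _is_printable_ascii(name)):
        problems.append("name must be 1-128 ASCII printable characters")
    description = definition.get("description")
    if description is not None and not (
            isinstance(description, str) and len(description) <= 512):
        problems.append("description must be a string of at most 512 characters")
    if definition.get("metric") not in METRICS:
        problems.append("metric is not a supported identifier")
    if definition.get("comparison") not in COMPARISONS:
        problems.append("comparison is not a supported operator")
    target = definition.get("target")
    if isinstance(target, bool) or not isinstance(target, (int, float)) or target <= 0:
        problems.append("target must be a positive number")
    window = definition.get("window_days")
    if isinstance(window, bool) or not isinstance(window, int) or not 1 <= window <= 90:
        problems.append("window_days must be an integer from 1 to 90")
    return problems


print(validate_slo({
    "name": "API Availability", "metric": "availability", "target": 99.9,
    "comparison": "greater_than_or_equal", "window_days": 30,
}))  # []
```

The server remains the authority on validity; a check like this only shortens the feedback loop.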

SLO Request Headers

Headers hint; they do not enforce. SLO request headers ride only on POST /v1/chat/completions and are ignored everywhere else, including /v1/responses. A missed target does not fail the request, raise an alert, or write to /v1/slos history. If you need recorded compliance, use the management API.

Include SLO targets as request headers on POST /v1/chat/completions requests. When present, the router prefers backends predicted to meet the budget. When absent, the router uses its default routing strategy without SLO-aware adjustments.

Header	Type	Description
`X-SLO-TTFT-Ms`	integer	Target time-to-first-token in milliseconds (e.g., 500 for 500ms TTFT target).
`X-SLO-TPOT-Ms`	integer	Target time-per-output-token in milliseconds (e.g., 30 for 30ms per token).

You can send one or both headers. If no backend is predicted to meet the target, the router still selects the best available backend; the request continues as it would without the header.
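
Since either header may be sent alone, a small builder keeps call sites tidy. This is a sketch, not part of any official client: `slo_headers` is a hypothetical helper that applies the non-negative-integer rule from the Limits section.

```python
def slo_headers(ttft_ms=None, tpot_ms=None) -> dict:
    """Build the optional per-request SLO headers for chat completions."""
    headers = {}
    for header, value in (("X-SLO-TTFT-Ms", ttft_ms), ("X-SLO-TPOT-Ms", tpot_ms)):
        if value is None:
            continue  # header omitted: the router uses its default strategy
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{header} must be a non-negative integer (ms)")
        headers[header] = str(value)
    return headers


print(slo_headers(ttft_ms=500))              # {'X-SLO-TTFT-Ms': '500'}
print(slo_headers(ttft_ms=500, tpot_ms=30))  # {'X-SLO-TTFT-Ms': '500', 'X-SLO-TPOT-Ms': '30'}
```

The result can be merged into the authorization headers with `{**auth_headers, **slo_headers(ttft_ms=500)}`.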

Supported Metrics

Ten identifiers across two families. Inference metrics measure the chat-completions pipeline. Execution metrics measure the XEM tool-execution lease. Identifiers are case-sensitive and must match one of the values below exactly.

Inference: request-path metrics

Metric	Description	Unit
`ttft_ms`	Time to first token	milliseconds
`tpot_ms`	Time per output token	milliseconds
`total_latency_ms`	Total end-to-end request latency	milliseconds
`availability`	Percentage of successful requests	percent (0-100)
`error_rate`	Percentage of failed requests	percent (0-100)
`throughput_rps`	Average requests per second	requests/second

XEM execution: tool-execution metrics

Metric	Description	Unit
`exec_availability`	XEM execution agent uptime (lease-renewal based)	percent (0-100)
`exec_duration_ms`	Per-call XEM tool execution duration	milliseconds
`exec_error_rate`	Percentage of XEM executions that failed	percent (0-100)
`exec_approval_latency_ms`	Time from `exec.approval_requested` to the terminal `exec.approved` or `exec.rejected` transition	milliseconds

For percentage metrics (availability, error_rate, exec_availability, exec_error_rate), both the target and the returned compliance_percentage are expressed on a 0-100 scale (e.g., 99.9 for 99.9%).

Comparison Operators

Comparison	Description	Typical Use
`less_than`	Metric must be less than target	Latency, error rate
`less_than_or_equal`	Metric must be at most target	Latency, error rate
`greater_than`	Metric must exceed target	Throughput
`greater_than_or_equal`	Metric must be at least target	Availability
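
The four operators map directly onto Python's standard `operator` module, which is handy for checking a measured value locally. The mapping below is a sketch; `satisfies` is a hypothetical name, and the service's own evaluation may differ in details such as rounding.

```python
import operator

COMPARATORS = {
    "less_than": operator.lt,
    "less_than_or_equal": operator.le,
    "greater_than": operator.gt,
    "greater_than_or_equal": operator.ge,
}


def satisfies(measured: float, comparison: str, target: float) -> bool:
    """Return True when the measured value meets the target under the comparison."""
    return COMPARATORS[comparison](measured, target)


print(satisfies(99.95, "greater_than_or_equal", 99.9))  # True
print(satisfies(520.0, "less_than_or_equal", 500))      # False
```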

Compliance Calculation

SLO compliance is measured as the percentage of requests within the evaluation window that meet the defined target:

compliance_percentage = (conforming_requests / total_requests) * 100

A request is "conforming" if its measured metric value satisfies the comparison operator against the target. The SLO itself is met when the measured value for the window satisfies the same comparison. For example, with metric=availability, target=99.9, comparison=greater_than_or_equal, the SLO is met when the measured availability is 99.9% or higher.

For percentage metrics such as availability and error_rate, the measured value is reduced from per-request boolean samples (success vs. error) over the window before the comparison is applied, and compliance_percentage is reported on a 0-100 scale.
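
The formula and the boolean-sample reduction can be reproduced in a few lines. This is a sketch of the arithmetic only: the function names are hypothetical, and returning `None` for an empty window is an assumption, since the service's behavior in that case is not described here.

```python
def compliance_percentage(conforming_requests: int, total_requests: int):
    """Share of conforming requests on a 0-100 scale.

    Returns None for an empty window (an assumption of this sketch).
    """
    if total_requests == 0:
        return None
    return conforming_requests / total_requests * 100


def availability(successes: list) -> float:
    """Reduce per-request boolean samples (True = success) to a 0-100 value."""
    if not successes:
        raise ValueError("at least one sample is required")
    return sum(successes) / len(successes) * 100


print(compliance_percentage(150000, 150000))    # 100.0
print(availability([True, True, True, False]))  # 75.0
```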

Compliance calculations run automatically on a periodic schedule. Use the POST /v1/slos/{id}/calculate endpoint to trigger an on-demand calculation.

Status values

Both GET /v1/slos/summary and the per-SLO summary entries report a status string drawn from three values:

met: Latest compliance evaluation passed the target.
not_met: Latest compliance evaluation fell short of the target.
unevaluated: No completed evaluation yet; window is still warming up or the scheduler has not run.
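
A client can group summary entries by these three values, for example to list the SLOs that need attention. The sketch below assumes the response shape shown under Project Summary; `tally_statuses` and `unmet_names` are hypothetical helpers, and the sample dict is made-up data.

```python
from collections import Counter


def tally_statuses(summary: dict) -> Counter:
    """Count summary entries by status ("met", "not_met", "unevaluated")."""
    return Counter(entry["status"] for entry in summary["slos"])


def unmet_names(summary: dict) -> list:
    """Names of SLOs whose latest evaluation missed the target."""
    return [e["name"] for e in summary["slos"] if e["status"] == "not_met"]


# Made-up sample in the shape of a GET /v1/slos/summary response.
sample = {
    "slos": [
        {"name": "API Availability", "status": "met"},
        {"name": "P95 Latency", "status": "not_met"},
        {"name": "New SLO", "status": "unevaluated"},
    ]
}
print(tally_statuses(sample)["met"])  # 1
print(unmet_names(sample))            # ['P95 Latency']
```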

Examples

Create an Availability SLO

curl

curl -X POST https://api.xerotier.ai/proj_ABC123/v1/slos \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production Availability",
    "metric": "availability",
    "target": 99.9,
    "comparison": "greater_than_or_equal",
    "window_days": 30
  }'
                

Create a Latency SLO

curl

curl -X POST https://api.xerotier.ai/proj_ABC123/v1/slos \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "P95 TTFT Target",
    "metric": "ttft_ms",
    "target": 500,
    "comparison": "less_than_or_equal",
    "window_days": 7
  }'
                

Per-Request SLO Headers

curl

curl https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -H "X-SLO-TTFT-Ms: 500" \
  -H "X-SLO-TPOT-Ms: 30" \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
                

Python

import requests

# SLO management API is project-scoped (no endpoint slug)
mgmt_base = "https://api.xerotier.ai/proj_ABC123/v1"
# Inference path includes the endpoint slug
inference_base = "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1"
headers = {"Authorization": "Bearer xero_myproject_your_api_key"}

# Create an SLO
slo = requests.post(f"{mgmt_base}/slos", headers=headers, json={
    "name": "Chat Latency",
    "metric": "total_latency_ms",
    "target": 1000,
    "comparison": "less_than",
    "window_days": 7
}).json()
print(f"Created SLO: {slo['id']}")

# Check compliance summary
summary = requests.get(f"{mgmt_base}/slos/summary", headers=headers).json()
for entry in summary["slos"]:
    print(f"{entry['name']}: {entry['status']} ({entry['compliance_percentage']}%)")

# Send a request with SLO headers
response = requests.post(
    f"{inference_base}/chat/completions",
    headers={
        **headers,
        "Content-Type": "application/json",
        "X-SLO-TTFT-Ms": "500",
        "X-SLO-TPOT-Ms": "30",
    },
    json={
        "model": "llama-3.1-8b",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
                

Node.js

// SLO management API is project-scoped (no endpoint slug)
const mgmtBase = "https://api.xerotier.ai/proj_ABC123/v1";
// Inference path includes the endpoint slug
const inferenceBase = "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1";
const headers = {
    "Authorization": "Bearer xero_myproject_your_api_key",
    "Content-Type": "application/json"
};

// Create an SLO
const sloResponse = await fetch(`${mgmtBase}/slos`, {
    method: "POST",
    headers,
    body: JSON.stringify({
        name: "Chat Latency",
        metric: "total_latency_ms",
        target: 1000,
        comparison: "less_than",
        window_days: 7
    })
});
const slo = await sloResponse.json();
console.log(`Created SLO: ${slo.id}`);

// Check compliance summary
const summaryResponse = await fetch(`${mgmtBase}/slos/summary`, { headers });
const summary = await summaryResponse.json();
for (const entry of summary.slos) {
    console.log(`${entry.name}: ${entry.status} (${entry.compliance_percentage}%)`);
}

// Send a request with SLO headers
const inferenceResponse = await fetch(`${inferenceBase}/chat/completions`, {
    method: "POST",
    headers: {
        ...headers,
        "X-SLO-TTFT-Ms": "500",
        "X-SLO-TPOT-Ms": "30"
    },
    body: JSON.stringify({
        model: "llama-3.1-8b",
        messages: [{ role: "user", content: "Hello!" }]
    })
});
                
