Usage Tracking & Billing
Xerotier.ai tracks every inference request and provides detailed usage analytics, dual billing models, and exportable reports for cost management.
Overview
Every inference request processed through Xerotier.ai generates a usage record. These records power the usage dashboard, billing calculations, and export functionality. Usage data is retained indefinitely and remains available even after endpoints are deleted.
Xerotier.ai supports two billing models that can operate simultaneously within a single project:
- Per-token billing for shared (platform-managed) agents
- Hourly billing for XIM (private) nodes
Your total cost is the sum of token costs from shared agents plus hourly costs from XIM nodes.
What Is Tracked
Each inference request records the following metrics:
| Field | Type | Description |
|---|---|---|
| inputTokens | integer | Number of input (prompt) tokens consumed |
| outputTokens | integer | Number of output (completion) tokens generated |
| cachedTokens | integer | Input tokens served from the prefix cache (reduces latency) |
| cost | number | Token cost in dollars (shared agents only; 0 for XIM) |
| ttftMs | integer | Time to first token in milliseconds |
| totalLatencyMs | integer | Total request latency in milliseconds |
| statusCode | integer | HTTP status code of the response |
| isXim | boolean | Whether the request was served by a XIM node |
| endpointSlug | string | Endpoint identifier (preserved after endpoint deletion) |
| modelName | string | Model used for inference (preserved after endpoint deletion) |
Deleted Endpoints: When an endpoint is deleted, its usage history is preserved. The endpoint slug and model name are retained on each usage record, so historical data remains accessible, and the endpoint appears with a "Deleted" status in the usage table.
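The fields above map naturally onto a typed record. The following is a minimal sketch, assuming usage records arrive as JSON objects keyed by the camelCase names in the table. The class name `UsageRecord` and the helper `from_json` are hypothetical and are not part of any Xerotier SDK.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class UsageRecord:
    """Illustrative container; attribute names mirror the documented fields."""
    inputTokens: int
    outputTokens: int
    cachedTokens: int
    cost: float
    ttftMs: int
    totalLatencyMs: int
    statusCode: int
    isXim: bool
    endpointSlug: str
    modelName: str

    @classmethod
    def from_json(cls, payload: dict) -> "UsageRecord":
        # Keep only the documented fields; extras such as id or timestamp are ignored.
        names = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in payload.items() if key in names})


record = UsageRecord.from_json({
    "id": "evt_abc123",
    "endpointSlug": "my-endpoint",
    "modelName": "llama-3.1-8b-instruct",
    "inputTokens": 512,
    "outputTokens": 128,
    "cachedTokens": 384,
    "cost": 0.0012,
    "ttftMs": 45,
    "totalLatencyMs": 890,
    "statusCode": 200,
    "isXim": False,
})
```

A missing field raises `TypeError` here, which makes schema drift visible early.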
Billing Models
The two billing models are described below. For subscription management, credit purchases, and invoice details, see Billing & Subscriptions.
Per-Token Billing (Shared Agents)
Requests served by platform-managed shared agents are billed per token. The cost is calculated at request time based on the model's token pricing and recorded in the cost field of each usage record.
- Cost is calculated per request based on input and output token counts
- Pricing varies by model and service tier
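As a rough illustration of the per-request calculation, the sketch below multiplies token counts by per-million-token prices. The two price constants are hypothetical placeholders, not Xerotier rates; actual pricing depends on the model and service tier.

```python
from decimal import Decimal

# Hypothetical prices in dollars per one million tokens (assumptions for illustration).
INPUT_PRICE_PER_MILLION = Decimal("0.50")
OUTPUT_PRICE_PER_MILLION = Decimal("1.50")


def token_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the cost of one request in dollars under the assumed prices."""
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_MILLION / million
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_MILLION / million
    return input_cost + output_cost
```

`Decimal` is used instead of `float` so that many small per-request costs can be summed without binary rounding drift.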
Hourly Billing (XIM Nodes)
XIM nodes are billed based on connected uptime rather than token usage. The cost field for these requests is always 0 -- billing is calculated separately from uptime tracking.
- Billing is based on the time your agent is connected to the platform
- Rates are determined by the agent's service tier
- Only connected time is billed -- disconnected periods are not charged
- Billing periods are 730-hour intervals anchored to your project creation date
Billing Period: Each billing period is exactly 730 hours (approximately one month). The first period starts when your project is created, and each subsequent period begins where the previous one ends. The current period's start and end dates are shown on the usage dashboard.
Usage Dashboard
The usage dashboard at /usage provides a comprehensive view of your project's consumption. It includes:
Summary Cards
- Token usage: Total input and output tokens, segmented by shared vs XIM
- Estimated cost: Combined token cost (shared) and hourly cost (XIM)
- Cache performance: Cache hit rate and total cached tokens
- Credits remaining: Current project credit balance (see Credits)
Charts
- 7-day usage chart: Daily token usage segmented by deployment type
- 7-day cost chart: Daily cost segmented by billing model
- Cache hit rate chart: Daily prefix cache hit rate trend
- Uptime charts: Daily connected/disconnected hours and weekly trend (XIM only)
Endpoint Usage Table
A paginated table showing per-endpoint usage breakdown including:
- Request count, input/output/cached tokens, and cache hit rate
- Token cost and hourly cost (where applicable)
- Endpoint status (Active, Deleted, Disconnected, Failed, or provisioning states)
- Connected hours and uptime percentage (for hourly-billed endpoints)
Agent Uptime Table
For projects with XIM nodes, the dashboard shows agent-level uptime data:
- Agent name and tier
- Hourly rate, connected hours, and uptime percentage
- Hourly cost and online/offline status
Usage APIs
All usage APIs require project authentication.
List Usage Events
GET /usage/events
Retrieve cursor-paginated usage events.
| Parameter | Type | Description |
|---|---|---|
| limit (optional) | integer | Number of events to return (max 100) |
| cursor (optional) | string | Base64-encoded cursor for pagination (format: timestamp:id) |
Response
{
  "items": [
    {
      "id": "evt_abc123",
      "timestamp": "2025-06-15T14:30:00Z",
      "endpointSlug": "my-endpoint",
      "modelName": "llama-3.1-8b-instruct",
      "inputTokens": 512,
      "outputTokens": 128,
      "cachedTokens": 384,
      "cacheHitRate": 0.75,
      "cost": 0.0012,
      "ttftMs": 45,
      "totalLatencyMs": 890,
      "statusCode": 200,
      "isXim": false
    }
  ],
  "next_cursor": "base64_encoded_cursor",
  "has_more": true
}
# Python: fetch one page of usage events and print per-request token counts
import requests

headers = {"Authorization": "Bearer xero_my-project_abc123"}
response = requests.get(
    "https://xerotier.ai/usage/events",
    headers=headers,
    params={"limit": 50},
)
response.raise_for_status()  # stop early on 4xx/5xx responses
data = response.json()
for event in data["items"]:
    print(f"{event['modelName']}: {event['inputTokens']}in/{event['outputTokens']}out")
// JavaScript: the same request using fetch
const response = await fetch(
  "https://xerotier.ai/usage/events?limit=50",
  {
    headers: { "Authorization": "Bearer xero_my-project_abc123" }
  }
);
const data = await response.json();
data.items.forEach(event => {
  console.log(`${event.modelName}: ${event.inputTokens}in/${event.outputTokens}out`);
});
Each item in the response includes cachedTokens (integer count of input tokens served from the prefix cache) and cacheHitRate (ratio of cached tokens to total input tokens, 0.0-1.0). These fields are present on every usage event regardless of whether caching was active for that request.
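To read more than one page, the cursor can be followed in a loop. The sketch below assumes that the `next_cursor` value from one response is passed back unchanged as the `cursor` parameter of the next request, which the parameter table suggests but does not state outright. The function names are hypothetical, and the base URL is copied from the request example above.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional

BASE_URL = "https://xerotier.ai"  # host taken from the example above; adjust if yours differs


def fetch_events_page(token: str, cursor: Optional[str], limit: int = 100) -> dict:
    """Fetch one page of GET /usage/events using only the standard library."""
    params = {"limit": limit}
    if cursor:
        params["cursor"] = cursor
    url = f"{BASE_URL}/usage/events?{urllib.parse.urlencode(params)}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_events(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every event, following next_cursor until has_more is false."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        # Stop when the server reports no more data or gives no cursor to continue from.
        if not page.get("has_more") or not cursor:
            break
```

A caller would wire the two together with something like `iter_events(lambda c: fetch_events_page("xero_my-project_abc123", c))`. Keeping the fetcher injectable means the loop can be tested without network access.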
List Request Logs
GET /usage/logs
Retrieve cursor-paginated request logs with filtering. Supports bidirectional pagination for infinite scroll.
| Parameter | Type | Description |
|---|---|---|
| limit (optional) | integer | Number of log entries to return |
| cursor (optional) | string | Pagination cursor |
| direction (optional) | string | forward or backward for bidirectional pagination |
| endpoint (optional) | string | Filter by endpoint slug |
| status (optional) | string | Filter by status code range (2xx, 3xx, 4xx, 5xx) or specific code |
| from (optional) | string | Start date filter (ISO 8601 format) |
| to (optional) | string | End date filter (ISO 8601 format) |
Response
{
  "items": [ ... ],
  "next_cursor": "cursor_string",
  "prev_cursor": "cursor_string",
  "has_more": true
}
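The filters combine as ordinary query parameters. The sketch below builds the request path from optional arguments. The helper name `build_logs_path` is hypothetical, and because `from` is a reserved word in Python, the date filters are exposed as `start` and `end` and mapped to `from` and `to` when the query string is built.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode


def _iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_logs_path(
    endpoint: Optional[str] = None,
    status: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    cursor: Optional[str] = None,
    direction: Optional[str] = None,
    limit: Optional[int] = None,
) -> str:
    """Return the /usage/logs path with only the supplied filters applied."""
    if direction not in (None, "forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    candidates = {
        "endpoint": endpoint,
        "status": status,
        "from": _iso_utc(start) if start else None,
        "to": _iso_utc(end) if end else None,
        "cursor": cursor,
        "direction": direction,
        "limit": limit,
    }
    # Drop unset filters so the server applies its own defaults.
    params = {key: value for key, value in candidates.items() if value is not None}
    return "/usage/logs" + (f"?{urlencode(params)}" if params else "")
```

For infinite scroll, one plausible pairing is `next_cursor` with `direction="forward"` and `prev_cursor` with `direction="backward"`; this pairing is an assumption and should be checked against actual responses.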
List Endpoint Usage
GET /usage/endpoints
Retrieve paginated endpoint usage summaries with optional search.
| Parameter | Type | Description |
|---|---|---|
| offset (optional) | integer | Offset for pagination (default: 0) |
| limit (optional) | integer | Number of results to return (max 50) |
| search (optional) | string | Filter by endpoint slug |
Response
{
  "items": [
    {
      "slug": "my-endpoint",
      "requests": 1500,
      "inputTokens": 450000,
      "outputTokens": 120000,
      "cachedTokens": 85000,
      "cacheHitRate": 0.189,
      "cost": 12.50,
      "hourlyCost": 0.0,
      "statusLabel": "Active"
    }
  ],
  "total": 5,
  "offset": 0,
  "limit": 50,
  "hasMore": false
}
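Because this endpoint uses offset pagination, a client can walk all pages and add up the two cost fields to approximate the project total (token cost plus hourly cost). The sketch below assumes a caller-supplied `fetch_page(offset, limit)` callable that returns the response shape shown above; both the callable and the function name `project_cost_totals` are hypothetical.

```python
from decimal import Decimal
from typing import Callable


def project_cost_totals(fetch_page: Callable[[int, int], dict], page_size: int = 50) -> dict:
    """Sum cost and hourlyCost across every endpoint, page by page."""
    token_cost = Decimal("0")
    hourly_cost = Decimal("0")
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        items = page["items"]
        for item in items:
            # Convert via str() so JSON floats become exact decimal values.
            token_cost += Decimal(str(item["cost"]))
            hourly_cost += Decimal(str(item["hourlyCost"]))
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("hasMore") or not items:
            break
        offset += len(items)
    return {"token": token_cost, "hourly": hourly_cost, "total": token_cost + hourly_cost}
```

Advancing the offset by the number of items actually returned, rather than by `page_size`, keeps the walk correct if the server returns a short page.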
Export Uptime CSV
GET /usage/export/uptime
Download a CSV file of uptime billing data for XIM nodes. Defaults to the current 730-hour billing period if no date range is specified.
| Parameter | Type | Description |
|---|---|---|
| from (optional) | string | Start date (ISO 8601). Defaults to current billing period start. |
| to (optional) | string | End date (ISO 8601). Defaults to current billing period end. |
CSV Columns
Resource Type,Resource Name,Resource ID,Tier,Hourly Rate,Connected At,Disconnected At,Connected Hours,Cost
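With these column names, the export can be summarized using the standard `csv` module. The sketch below totals connected hours and cost per resource. It assumes the `Connected Hours` and `Cost` columns contain plain decimal numbers with no currency symbol, which the column list alone does not confirm; the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def summarize_uptime_csv(csv_text: str) -> dict:
    """Return {resource name: {"hours": Decimal, "cost": Decimal}} from the export."""
    totals = defaultdict(lambda: {"hours": Decimal("0"), "cost": Decimal("0")})
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["Resource Name"]]
        entry["hours"] += Decimal(row["Connected Hours"])
        entry["cost"] += Decimal(row["Cost"])
    return dict(totals)
```

To summarize a downloaded file, read it with `open(path, newline="")` and pass the text to `summarize_uptime_csv`.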
Uptime Billing
XIM nodes use an uptime-based billing model where you are charged for the time your agent is connected to the platform.
How It Works
- Your XIM node connects to the Xerotier.ai control plane
- Connection and disconnection timestamps are recorded automatically
- Only connected time is billed -- gaps between connections are free
- Costs are calculated using the hourly rate from your agent's service tier
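The steps above amount to summing connected time and multiplying by the tier's hourly rate. The sketch below is one way to reproduce that arithmetic locally, assuming sessions are clipped to the billing period and that a still-open session counts up to the current time. The rounding rules Xerotier applies are not documented here, so the result is an estimate, and all names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal
from typing import Iterable, Optional, Tuple

Session = Tuple[datetime, Optional[datetime]]  # (connected_at, disconnected_at or None)


def connected_hours(
    sessions: Iterable[Session],
    period_start: datetime,
    period_end: datetime,
    now: Optional[datetime] = None,
) -> Decimal:
    """Total connected hours inside the period; gaps between sessions add nothing."""
    now = now or datetime.now(timezone.utc)
    total = timedelta()
    for connected_at, disconnected_at in sessions:
        start = max(connected_at, period_start)
        end = min(disconnected_at or now, period_end)
        if end > start:
            total += end - start
    # Whole seconds converted to hours as an exact decimal.
    return Decimal(total // timedelta(seconds=1)) / Decimal(3600)


def uptime_cost(hours: Decimal, hourly_rate: Decimal) -> Decimal:
    """Estimated charge: connected hours multiplied by the tier's hourly rate."""
    return hours * hourly_rate
```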
Billing Periods
Billing periods are 730-hour intervals (30 days and 10 hours, or approximately 30.4 days) anchored to your project creation date. For example, if your project was created on January 1st at 00:00 UTC in a non-leap year:
- Period 1: Jan 1 00:00 - Jan 31 10:00 UTC (730 hours)
- Period 2: Jan 31 10:00 - Mar 2 20:00 UTC (730 hours)
- And so on...
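The period boundaries follow from simple date arithmetic. The sketch below finds the period that contains a given moment, assuming each period is a half-open interval (start included, end excluded); the function name `billing_period` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

PERIOD = timedelta(hours=730)


def billing_period(created_at: datetime, at: datetime) -> Tuple[int, datetime, datetime]:
    """Return (period number starting at 1, period start, period end) for `at`."""
    if at < created_at:
        raise ValueError("moment precedes project creation")
    index = (at - created_at) // PERIOD  # number of complete periods elapsed
    start = created_at + index * PERIOD
    return index + 1, start, start + PERIOD


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
number, start, end = billing_period(created, datetime(2025, 2, 15, tzinfo=timezone.utc))
print(number, start.isoformat(), end.isoformat())
# 2 2025-01-31T10:00:00+00:00 2025-03-02T20:00:00+00:00
```

Working in UTC with timezone-aware datetimes avoids daylight-saving shifts moving the boundaries.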
Tier Hourly Rates
Each service tier defines an hourly rate for XIM nodes. The rate is displayed on the usage dashboard. See Service Tiers for current pricing.
CSV Export
Use the uptime CSV export to download billing data for reconciliation or record-keeping. The export includes one row per connection session for each agent, with calculated costs.
Example
curl -H "Authorization: Bearer xero_my-project_abc123" \
  "https://api.xerotier.ai/usage/export/uptime?from=2025-01-01T00:00:00Z&to=2025-02-01T00:00:00Z" \
  -o uptime-report.csv
The CSV file is suitable for import into spreadsheet applications or billing systems.
Prefix Cache Impact
When prefix caching is enabled on your endpoint, some input tokens may be served from cache rather than being recomputed. These are tracked as cachedTokens in usage records.
- Cache hit rate is displayed on the usage dashboard as a daily trend chart
- Per-endpoint cache hit rate is shown in the endpoint usage table
- Cached tokens reduce latency (especially TTFT) but are still counted as input tokens for billing
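A cache hit rate over many requests can be derived from the same fields. The sketch below computes a token-weighted rate (total cached tokens divided by total input tokens). Whether the dashboard aggregates in exactly this way is an assumption, and the function name is hypothetical.

```python
from typing import Iterable, Mapping


def aggregate_cache_hit_rate(events: Iterable[Mapping[str, int]]) -> float:
    """Token-weighted cache hit rate across usage events, from 0.0 to 1.0."""
    input_tokens = 0
    cached_tokens = 0
    for event in events:
        # cachedTokens is a subset of inputTokens, so billing still uses inputTokens.
        input_tokens += event["inputTokens"]
        cached_tokens += event["cachedTokens"]
    # Avoid dividing by zero when no input tokens were recorded.
    return cached_tokens / input_tokens if input_tokens else 0.0
```

Weighting by tokens rather than averaging per-request rates prevents short prompts from skewing the result.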
See Prefix Caching for details on how to enable and optimize caching.
Frequently Asked Questions
What happens to usage data when I delete an endpoint?
Usage data is preserved. Each usage record retains the endpoint slug and model name, so historical data remains accessible. Deleted endpoints appear with a "Deleted" status label in the usage table.
How are billing periods calculated?
Billing periods are 730-hour intervals starting from your project creation date. The current period's start and end dates are displayed on the usage dashboard.
Can I have both shared and XIM nodes in the same project?
Yes. Shared agents are billed per token and XIM nodes are billed per hour of connected uptime. The usage dashboard segments these separately so you can see costs from each billing model.
Do cached tokens cost money?
Yes, cached tokens are still counted as input tokens for billing purposes. However, they significantly reduce latency by avoiding recomputation of the KV cache for previously seen prompt prefixes.
What if my XIM node disconnects temporarily?
Only connected time is billed. Disconnection gaps are not charged. Each connection and disconnection event is recorded to calculate your actual connected hours.
How do credits and subscriptions work?
Credits are used for per-token inference billing on shared agents. For details on purchasing credits, managing subscriptions, and handling delinquent accounts, see Billing & Subscriptions.