Storage - Xerotier

All storage types share a unified billing pool per project. There is no separate metering for each content type; your total storage usage across all types determines your billable amount.

Tier Windows

Each service tier has two retention windows. The hot window is the in-memory TTL; the cold window is the durable retention period before automatic expiration. Every section below references this matrix rather than restating it.

Service tier hot and cold retention windows
Service Tier	Hot (Cache)	Cold (Object)
Free	1 hour	1 day
CPU	5 hours	7 days
GPU Shared	7 hours	7 days
GPU Dedicated	24 hours	14 days
Self-Hosted	48 hours	14 days
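
As a rough illustration of how the matrix maps to expiry times, the sketch below hard-codes the table and computes when an item written at a given moment would leave each tier. `TIER_WINDOWS` and `expiry_times` are hypothetical names used only for this example; they are not part of any Xerotier SDK. The sketch also ignores the hot-tier re-population that happens on cold reads.

```python
from datetime import datetime, timedelta, timezone

# (hot TTL, cold retention) per service tier, copied from the table above.
TIER_WINDOWS = {
    "Free": (timedelta(hours=1), timedelta(days=1)),
    "CPU": (timedelta(hours=5), timedelta(days=7)),
    "GPU Shared": (timedelta(hours=7), timedelta(days=7)),
    "GPU Dedicated": (timedelta(hours=24), timedelta(days=14)),
    "Self-Hosted": (timedelta(hours=48), timedelta(days=14)),
}


def expiry_times(tier: str, created_at: datetime) -> tuple[datetime, datetime]:
    """Return (hot_expiry, cold_expiry) for an item created at created_at."""
    hot_ttl, cold_retention = TIER_WINDOWS[tier]
    return created_at + hot_ttl, created_at + cold_retention


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
hot_until, cold_until = expiry_times("CPU", created)
print(hot_until.isoformat())   # 2025-01-01T05:00:00+00:00
print(cold_until.isoformat())  # 2025-01-08T00:00:00+00:00
```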

Storage Types

The following content types are stored by the platform:

Stored content types and their payloads
Type	Description	What Is Stored
Models	Uploaded model weights	GGUF/safetensors files
Completions	Chat completion results	Request/response pairs
Responses	Responses API results	Input/output content
Conversations	Conversation items	Message content
Batch Files	Batch API input/output	JSONL files
Uploads	Multi-part uploads	Assembled files

Hot Tier (Cache)

In-memory layer for recently created or recently accessed content. Reads are typically sub-millisecond. Items are written here on creation and re-populated automatically on cold tier hits.

TTL: Per service tier, see Tier Windows.
Behavior: Auto-populated on write, evicted after TTL.
Availability: Best-effort; cache misses fall through to cold.
Encryption: Same AES-256-GCM payload as cold.

Cold Tier (Object Storage)

Durable layer. All content is written here on creation (dual-write with the hot tier) and persists for a tier-dependent window before automatic expiration. Each project's objects live in their own container with per-project encryption keys.

Retention: Per service tier, see Tier Windows.
Encryption: AES-256-GCM at rest.
Isolation: Per-project containers, per-project keys. Shared object-store infrastructure, isolated at the container and key-prefix level.

Retrieval

When stored content is requested, the system uses an automatic waterfall retrieval strategy. This is transparent to API consumers: the same endpoint serves content regardless of which tier it resides in.

Hot tier check: The in-memory cache is checked first for sub-millisecond access.
Cold tier fallback: If the item is not in the hot tier (e.g., TTL expired), it is loaded from cold storage and decrypted.
Cache re-population: On a cold tier hit, the item is automatically promoted back to the hot tier so subsequent reads are fast.

Note: If content has expired from both tiers (beyond the cold storage retention period and evicted from cache), requests return a 404. Database rows for stored items persist by default, though some cleanup paths (notably Files and Uploads) may soft-delete the metadata row; see each type's documentation for specifics.
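
The waterfall can be sketched with a toy in-process model. This is a simplified illustration under stated assumptions, not the platform's implementation: `TieredStore` is a hypothetical class, plain dictionaries stand in for the cache and the object store, encryption is omitted, and cold-tier expiration is not modeled (a missing key simply raises `KeyError`, which the API would surface as a 404).

```python
import time


class TieredStore:
    """Toy two-tier store: a TTL-bound hot dict in front of a cold dict."""

    def __init__(self, hot_ttl_seconds: float):
        self.hot_ttl = hot_ttl_seconds
        self.hot = {}   # key -> (value, expires_at)
        self.cold = {}  # key -> value (stands in for object storage)

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        # Dual-write: the item lands in both tiers on creation.
        self.hot[key] = (value, now + self.hot_ttl)
        self.cold[key] = value

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        # Step 1: hot tier check.
        entry = self.hot.get(key)
        if entry is not None and entry[1] > now:
            return entry[0], "hot"
        self.hot.pop(key, None)  # drop an expired entry, if any
        # Step 2: cold tier fallback.
        if key not in self.cold:
            raise KeyError(key)  # gone from both tiers
        value = self.cold[key]
        # Step 3: cache re-population with a fresh TTL.
        self.hot[key] = (value, now + self.hot_ttl)
        return value, "cold"


store = TieredStore(hot_ttl_seconds=3600)
store.put("resp_1", {"output": "hello"}, now=0)
print(store.get("resp_1", now=10)[1])    # hot
print(store.get("resp_1", now=4000)[1])  # cold
print(store.get("resp_1", now=4001)[1])  # hot
```

The second read falls through to the cold tier because the one-hour TTL has elapsed; the third is served from the hot tier again because the cold hit re-populated it.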

Encryption

All stored content is encrypted at rest using AES-256-GCM in both tiers. Encryption and decryption are handled transparently: content is encrypted before it is written to either tier and decrypted automatically on retrieval.

Algorithm: AES-256-GCM.
Scope: Per-project encryption keys.
Integrity: Checksum verification on every read.
Key rotation: Supported with zero downtime. Existing ciphertext decrypts against retained prior key versions.

Storage Billing

All storage types contribute to a single billable total per project. There is no per-type billing; combined usage across every storage type determines the bill.

Storage billing pool, formula, and monitoring
Property	Details
Pool	Shared: all storage types contribute to a single billable total per project
Minimum charge (display)	1 GB minimum is applied to the Usage dashboard rollup as soon as any storage is used
Formula (display)	Displayed billable GB = max(1 GB, actual total usage) when usage > 0; 0 when nothing is stored. The underlying time-weighted credit deduction is computed from actual bytes; the 1 GB floor is a dashboard rounding rule, not a ledger floor.
Rate	Per-GB monthly rate from your service tier (visible on Usage dashboard)
Cost	`billable_gb * rate_per_gb`
Monitoring	Storage breakdown by type visible on the Usage dashboard
Quota	Projects may have storage quotas; usage visible in project settings
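
The display formula and the cost row can be written out as a short sketch. It models only the dashboard figures described in the table; the time-weighted ledger deduction is not shown because the source does not specify it. The function names and the `RATE` value are hypothetical, chosen for illustration.

```python
def displayed_billable_gb(actual_gb: float) -> float:
    """Dashboard rule: 0 when nothing is stored, otherwise at least 1 GB."""
    if actual_gb <= 0:
        return 0.0
    return max(1.0, actual_gb)


def displayed_monthly_cost(actual_gb: float, rate_per_gb: float) -> float:
    """Cost = billable_gb * rate_per_gb, using the displayed billable figure."""
    return displayed_billable_gb(actual_gb) * rate_per_gb


# Hypothetical per-GB monthly rate; the real rate depends on the service tier.
RATE = 0.10

for usage_gb in (0.0, 0.25, 3.5):
    billable = displayed_billable_gb(usage_gb)
    cost = displayed_monthly_cost(usage_gb, RATE)
    print(f"usage={usage_gb} GB  billable={billable} GB  cost={cost:.2f}")
```

With these inputs, an empty project shows 0 GB, a project storing 0.25 GB is displayed at the 1 GB floor, and a project storing 3.5 GB is displayed at its actual usage.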

Retention & Lifecycle

Lifecycle by layer: hot, cold, and database metadata
Layer	Duration	Eviction	Notes
Hot (Cache)	See Tier Windows	Auto-eviction after TTL	Non-blocking, best-effort
Cold (Object)	See Tier Windows	Automatic expiration	Encrypted at rest
Database Records	Indefinite	Manual deletion only	Metadata persists after storage expiration. Some cleanup paths (see Files, Uploads) may soft-delete the row.

Per-Type Limits

Per content-type item size and other limits
Type	Max Item Size	Other Limits
Models	Per-tier `max_model_size_gb` (Free 48 GB; CPU, GPU Shared, and GPU Dedicated unbounded; Self-Hosted 512 GB)	Delivered via the Uploads API; see Service Tiers
Completions	Varies by model context	-
Responses	Varies by model context	-
Conversations	1 MB per item	100 items/conversation, 16 metadata keys
Batch Files	500 MB per file	-
Uploads	200 MB per chunk (multi-part)	Single-shot file endpoints capped at 200 MB body; multi-part uploads have no documented total-size limit beyond the project storage quota

See the documentation for each API for the full set of constraints: Stored Completions, Responses API, Conversations, Batch API, Files API, Uploads API.
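
A client can check some of these limits before sending data. The sketch below is a minimal client-side illustration, assuming decimal megabytes (the table does not say whether MB means 10^6 or 2^20 bytes); `plan_chunks` and `conversation_within_limits` are hypothetical helpers, not SDK functions, and the server remains the authority on what is accepted.

```python
MB = 1_000_000  # assumption: decimal megabytes

MAX_CHUNK_BYTES = 200 * MB           # Uploads: 200 MB per chunk
MAX_CONVERSATION_ITEM_BYTES = 1 * MB  # Conversations: 1 MB per item
MAX_CONVERSATION_ITEMS = 100          # Conversations: 100 items/conversation
MAX_METADATA_KEYS = 16                # Conversations: 16 metadata keys


def plan_chunks(total_bytes: int, chunk_bytes: int = MAX_CHUNK_BYTES) -> list[int]:
    """Split total_bytes into part sizes no larger than chunk_bytes."""
    if not 0 < chunk_bytes <= MAX_CHUNK_BYTES:
        raise ValueError("chunk_bytes must be between 1 and the per-chunk limit")
    if total_bytes <= 0:
        return []
    full, remainder = divmod(total_bytes, chunk_bytes)
    return [chunk_bytes] * full + ([remainder] if remainder else [])


def conversation_within_limits(item_sizes: list[int], metadata: dict) -> bool:
    """Check item count, per-item size, and metadata key count."""
    return (
        len(item_sizes) <= MAX_CONVERSATION_ITEMS
        and all(size <= MAX_CONVERSATION_ITEM_BYTES for size in item_sizes)
        and len(metadata) <= MAX_METADATA_KEYS
    )


parts = plan_chunks(450 * MB)
print([size // MB for size in parts])  # [200, 200, 50]
print(conversation_within_limits([512, 2048], {"topic": "demo"}))  # True
```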

Checking Usage

Storage usage is visible on the Usage dashboard: per-type breakdown (GB), combined total, billable amount (the greater of 1 GB or actual usage when usage is non-zero), and estimated monthly cost at your tier's per-GB rate. See Usage Tracking & Billing for the broader billing system.

Endpoint-level request counts and token usage are exposed via the usage API below. Per-type storage byte counts are dashboard-only.


Replace {project_id} with your project's external id (e.g. proj_abc123) and set XEROTIER_API_KEY to a valid project API key before running these examples.

curl

# Retrieve per-endpoint request counts and token usage
curl "https://xerotier.ai/{project_id}/v1/usage/endpoints" \
  -H "Authorization: Bearer $XEROTIER_API_KEY"

Python

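import os
import requests

# Placeholder: substitute your project's external id (e.g. proj_abc123).
project_id = "{project_id}"
headers = {"Authorization": f"Bearer {os.environ['XEROTIER_API_KEY']}"}
response = requests.get(
    f"https://xerotier.ai/{project_id}/v1/usage/endpoints",
    headers=headers,
    timeout=30,  # avoid waiting indefinitely on network problems
)
response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
payload = response.json()
# Print one summary line per endpoint.
for endpoint in payload["data"]:
    print(
        f"{endpoint['endpointSlug']}: "
        f"{endpoint['requestCount']} requests, "
        f"{endpoint['totalInputTokens']} input tokens"
    )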

Node.js

const projectId = "{project_id}";
const response = await fetch(
    `https://xerotier.ai/${projectId}/v1/usage/endpoints`,
    {
        headers: { "Authorization": `Bearer ${process.env.XEROTIER_API_KEY}` }
    }
);
if (!response.ok) {
    throw new Error(`Usage request failed: ${response.status}`);
}
const payload = await response.json();
payload.data.forEach(ep => {
    console.log(`${ep.endpointSlug}: ${ep.requestCount} requests`);
});