Storage

Platform storage architecture, tiers, encryption, retention, billing, and per-type limits. This page is the single source of truth for all storage-related information across the platform.

Overview

Xerotier uses a two-tier storage architecture for all stored content. Every object -- whether it is a chat completion, a response, a conversation item, a batch file, or an uploaded model -- flows through the same hot/cold pipeline with unified encryption, retention, and billing.

All storage types share a unified billing pool per project. There is no separate metering for each content type; your total storage usage across all types determines your billable amount.

Storage Types

The following content types are stored by the platform:

Type           Description              What Is Stored
Models         Uploaded model weights   GGUF/safetensors files
Completions    Chat completion results  Request/response pairs
Responses      Responses API results    Input/output content
Conversations  Conversation items       Message content
Batch Files    Batch API input/output   JSONL files
Uploads        Multi-part uploads       Assembled files

Hot Tier (Cache)

Property      Value
TTL           24 hours (auto-eviction)
Purpose       Fast sub-millisecond access for recent content
Behavior      Automatically populated on write, evicted after TTL
Availability  Best-effort (cache misses fall through to cold tier)

The hot tier provides sub-millisecond reads for recently created or recently accessed content. Items are written to the hot tier on creation and re-populated automatically when fetched from the cold tier.

Cold Tier (Object Storage)

Property    Value
Retention   90 days (auto-expiration)
Purpose     Durable long-term storage
Encryption  Encrypted at rest (AES-256-GCM)
Isolation   Each project has its own isolated storage

The cold tier is the durable layer. All content is written here on creation (dual-write with the hot tier) and persists for 90 days before automatic expiration. Each project has its own isolated storage.

Retrieval

When stored content is requested, the system uses an automatic waterfall retrieval strategy. This is transparent to API consumers -- the same endpoint serves content regardless of which tier it resides in.

  1. Hot tier check -- The in-memory cache is checked first for sub-millisecond access.
  2. Cold tier fallback -- If the item is not in the hot tier (e.g., TTL expired), it is loaded from cold storage and decrypted.
  3. Cache re-population -- On a cold tier hit, the item is automatically promoted back to the hot tier so subsequent reads are fast.

Note: If content has expired from both tiers (beyond 90 days in cold storage and evicted from cache), requests return a 404. Database metadata records persist independently of storage tier expiration.
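The retrieval steps above can be sketched as a small read-through cache. This is a minimal in-memory model, not the platform's implementation: the TieredStore class, the NotFound exception, and the injectable clock are hypothetical names introduced for illustration, and decryption and the 90-day cold expiration are left out to keep the sketch short.

```python
import time
from typing import Callable, Dict, Tuple

HOT_TTL_SECONDS = 24 * 60 * 60  # hot tier TTL stated on this page


class NotFound(Exception):
    """Stands in for the 404 returned when content is gone from both tiers."""


class TieredStore:
    """Hypothetical in-memory model of the hot/cold read path."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # key -> (value, expires_at); models the hot tier cache
        self._hot: Dict[str, Tuple[bytes, float]] = {}
        # key -> value; models the durable cold tier
        self._cold: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Dual-write: content lands in both tiers on creation.
        self._cold[key] = value
        self._hot[key] = (value, self._clock() + HOT_TTL_SECONDS)

    def get(self, key: str) -> Tuple[bytes, str]:
        """Return (value, tier_that_served_the_read)."""
        # 1. Hot tier check.
        entry = self._hot.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0], "hot"
        # 2. Cold tier fallback.
        value = self._cold.get(key)
        if value is None:
            raise NotFound(key)
        # 3. Cache re-population so the next read is served from the hot tier.
        self._hot[key] = (value, self._clock() + HOT_TTL_SECONDS)
        return value, "cold"
```

In this model, a read made 25 hours after the write is served from the cold tier, and the read immediately after it is served from the hot tier again.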

Encryption

All cold tier content is encrypted at rest using AES-256-GCM. Encryption and decryption are handled transparently by the platform -- content is encrypted before writing to storage and decrypted automatically on retrieval.

Property      Value
Algorithm     AES-256-GCM
Scope         Per-project encryption keys
Integrity     Checksum verification on every read
Key rotation  Supported (seamless transition between key versions)

Storage Billing

All storage types contribute to a single billable total per project. There is no per-type billing -- your combined usage across completions, responses, conversations, batch files, models, and uploads determines the bill.

Property        Details
Pool            Shared -- all storage types contribute to a single billable total per project
Minimum charge  1 GB minimum as soon as any storage is used
Formula         Billable GB = max(1 GB, actual total usage) when usage > 0; 0 when nothing stored
Rate            Per-GB monthly rate from your service tier (visible on Usage dashboard)
Cost            Billable GB x rate per GB
Monitoring      Storage breakdown by type visible on the Usage dashboard
Quota           Projects may have storage quotas; usage visible in project settings
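The billing formula can be worked through in a few lines. This is a sketch of the arithmetic only: the function names are hypothetical, and the rate passed in is a placeholder, since the actual per-GB rate comes from your service tier.

```python
from typing import Mapping


def billable_gb(actual_gb: float) -> float:
    """Billable GB = max(1 GB, actual total usage) when usage > 0, else 0."""
    if actual_gb < 0:
        raise ValueError("usage cannot be negative")
    if actual_gb == 0:
        return 0.0
    return max(1.0, actual_gb)


def monthly_storage_cost(usage_by_type_gb: Mapping[str, float],
                         rate_per_gb: float) -> float:
    """Sum usage across all storage types (shared pool), then apply the rate."""
    total_gb = sum(usage_by_type_gb.values())
    return billable_gb(total_gb) * rate_per_gb
```

For example, a project storing 0.25 GB of completions and nothing else is billed for 1 GB, while a project storing 3 GB of models and 2 GB of batch files is billed for 5 GB.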

Retention & Lifecycle

Tier              Duration    Eviction                 Notes
Hot (Cache)       24 hours    Auto-eviction after TTL  Non-blocking, best-effort
Cold (Storage)    90 days     Automatic expiration     Encrypted at rest
Database Records  Indefinite  Manual deletion only     Metadata persists after storage expiration

Note: Database records (metadata, sequence numbers, storage size tracking) persist independently of storage tier expiration. After content expires from both tiers, the database row remains but the content is no longer retrievable.
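To reason about when stored content stops being durable, the two durations above can be turned into timestamps. This is a rough sketch that assumes the 90-day clock starts at creation and that the cutoff is exclusive; the function names are hypothetical. Content may remain readable slightly past the cold expiry if a copy is still in the hot tier.

```python
from datetime import datetime, timedelta, timezone

HOT_TTL = timedelta(hours=24)       # hot tier TTL from the table above
COLD_RETENTION = timedelta(days=90)  # cold tier retention from the table above


def hot_expires_at(cached_at: datetime) -> datetime:
    """When a hot tier entry written (or re-populated) at cached_at is evicted."""
    return cached_at + HOT_TTL


def cold_expires_at(created_at: datetime) -> datetime:
    """When the durable cold tier copy expires (assumes clock starts at creation)."""
    return created_at + COLD_RETENTION


def in_cold_retention(created_at: datetime, now: datetime) -> bool:
    """True while the cold tier copy should still exist."""
    return now < cold_expires_at(created_at)
```

Under these assumptions, content created on 2024-01-01 has a cold tier expiry of 2024-03-31; after that, only the database record is guaranteed to remain.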

Per-Type Limits

Type           Max Item Size            Other Limits
Completions    Varies by model context  -
Responses      Varies by model context  -
Conversations  1 MB per item            100 items/conversation, 16 metadata keys
Batch Files    500 MB per file          -
Uploads        200 MB per chunk         -

See the documentation for each API for the full set of constraints: Stored Completions, Responses API, Conversations, Batch API, Files API, Uploads API.
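A client can check the fixed size limits above before sending data and split large uploads into chunks that fit. The sketch below is illustrative only: the helper names are hypothetical, it assumes 1 MB means 1,000,000 bytes and that the limits are inclusive (this page does not specify either), and it does not call any platform API.

```python
from typing import List, Tuple

MB = 1_000_000  # assumption: decimal megabytes

# Fixed per-item limits from the table above, in bytes.
MAX_BYTES = {
    "conversations": 1 * MB,   # per conversation item
    "batch_files": 500 * MB,   # per file
    "uploads": 200 * MB,       # per chunk
}


def fits_limit(kind: str, size_bytes: int) -> bool:
    """Check a size against the fixed limit for kind.

    Raises KeyError for types without a fixed limit (completions and
    responses vary by model context).
    """
    return size_bytes <= MAX_BYTES[kind]


def plan_upload_chunks(total_bytes: int,
                       chunk_bytes: int = MAX_BYTES["uploads"]
                       ) -> List[Tuple[int, int]]:
    """Split a file into (offset, length) pairs no larger than chunk_bytes."""
    if total_bytes < 0 or chunk_bytes <= 0:
        raise ValueError("sizes must be non-negative and chunk size positive")
    return [
        (offset, min(chunk_bytes, total_bytes - offset))
        for offset in range(0, total_bytes, chunk_bytes)
    ]
```

With these assumptions, a 450 MB file is planned as two 200 MB chunks followed by one 50 MB chunk.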

Checking Storage Usage

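Query your project's storage consumption via the usage API. The examples below call the /usage/endpoints route and print each endpoint's slug and request count.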

curl
curl https://xerotier.ai/usage/endpoints \
  -H "Authorization: Bearer xero_my-project_abc123"
Python
import requests

headers = {"Authorization": "Bearer xero_my-project_abc123"}
response = requests.get(
    "https://xerotier.ai/usage/endpoints",
    headers=headers
)
data = response.json()

for endpoint in data["items"]:
    print(f"{endpoint['slug']}: {endpoint['requests']} requests")
Node.js
const response = await fetch(
  "https://xerotier.ai/usage/endpoints",
  {
    headers: {
      "Authorization": "Bearer xero_my-project_abc123"
    }
  }
);
const data = await response.json();

data.items.forEach(ep => {
  console.log(`${ep.slug}: ${ep.requests} requests`);
});

Usage Dashboard

Navigate to your project's Usage page to see a full storage breakdown:

  • Per-type usage (GB) -- How much storage each content type consumes.
  • Total used -- Combined storage across all types.
  • Billable amount -- The greater of 1 GB or actual usage (when usage > 0).
  • Estimated cost -- Billable GB multiplied by your per-GB rate.

The admin consumption page shows platform-wide storage metrics across all projects. See Usage Tracking & Billing for details on the broader billing system.