Documentation

The Xerotier.ai API provides an OpenAI-compatible interface for accessing AI models. Use your existing OpenAI SDK or HTTP client with minimal changes.

Overview

Xerotier.ai is a multi-tenant inference platform that serves open-source AI models through an OpenAI-compatible API. To use Xerotier.ai hosted models from an existing OpenAI client, change only the base URL and the API key.

Key Features

OpenAI SDK Compatible - Use existing OpenAI Python and Node.js SDKs
Self-Hosted Agents - Run inference on your own infrastructure
Custom Model Upload - Upload and serve your own fine-tuned models
Streaming Support - Real-time token streaming via Server-Sent Events
LMCache Integration - Reduced Time-to-First-Token with KV cache sharing

Quick Example

Python

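from openai import OpenAI

# Point the standard OpenAI client at a Xerotier.ai endpoint.
# The base URL and API key are the only settings that differ from a stock OpenAI setup.
client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key"
)

# Send a single-turn chat completion request.
response = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Print the assistant's reply from the first choice.
print(response.choices[0].message.content)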
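
Streaming

Key Features lists token streaming via Server-Sent Events. The sketch below shows how such a stream could be decoded by hand, assuming Xerotier.ai emits the same wire format as OpenAI's chat completions stream: `data:` lines carrying JSON chunks, ending with a `data: [DONE]` sentinel. The helper name `iter_stream_text` and the sample lines are hypothetical and written for illustration; they are not captured from a real response.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE chat completion lines.

    Hypothetical helper: assumes the OpenAI streaming wire format.
    """
    for raw in lines:
        line = raw.strip()
        # SSE payload lines start with "data:"; skip blanks and other fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # OpenAI-style streams end with a literal [DONE] sentinel.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # A delta may carry only a role, or no content at all.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines for illustration only.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]

print("".join(iter_stream_text(sample)))
```

In practice the OpenAI Python SDK handles this parsing itself: passing `stream=True` to `client.chat.completions.create` returns an iterator of chunks, and each fragment is read from `chunk.choices[0].delta.content`. Manual parsing like the above is mainly useful with a plain HTTP client.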