Documentation

The Xerotier.ai API provides an OpenAI-compatible interface for accessing AI models. Use your existing OpenAI SDK or HTTP client with minimal changes.

Overview

Xerotier.ai is a multi-tenant inference platform that gives you access to open-source AI models through a familiar OpenAI-compatible API. Point your existing client at a Xerotier.ai base URL and API key to start using Xerotier.ai-hosted models.

Key Features

  • OpenAI SDK Compatible - Use existing OpenAI Python and Node.js SDKs
  • Self-Hosted Agents - Run inference on your own infrastructure
  • Custom Model Upload - Upload and serve your own fine-tuned models
  • Streaming Support - Real-time token streaming via Server-Sent Events
  • LMCache Integration - Reduced Time-to-First-Token with KV cache sharing
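
With streaming enabled, the reply arrives as a series of incremental deltas rather than one complete message. As a minimal client-side sketch, assuming chunks follow the OpenAI SDK's delta shape (the helper name accumulate_deltas is illustrative, not part of the API):

```python
# Sketch: join the content fragments from a stream of delta dicts
# into the full assistant reply. Assumes each delta mirrors the
# OpenAI ChatCompletionChunk structure ({"content": "..."} or None).

def accumulate_deltas(deltas):
    """Concatenate non-empty content fields from streamed deltas."""
    parts = []
    for delta in deltas:
        content = delta.get("content")
        if content is not None:
            parts.append(content)
    return "".join(parts)

# With the OpenAI SDK the equivalent loop would be:
#   stream = client.chat.completions.create(..., stream=True)
#   for chunk in stream:
#       print(chunk.choices[0].delta.content or "", end="")
```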

Quick Example

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

response = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
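
Because the API is OpenAI-compatible, any HTTP client works without the SDK. A minimal sketch of the equivalent raw request, assuming the standard /chat/completions route under the endpoint's /v1 base (the URL and key below are the same placeholders as the example above):

```python
import json
import urllib.request

# Placeholder values, as in the Quick Example above.
BASE_URL = "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1"
API_KEY = "xero_myproject_your_api_key"

# Standard OpenAI-style chat completion payload.
payload = {
    "model": "deepseek-r1-distill-llama-70b",
    "messages": [{"role": "user", "content": "Hello!"}],
}

request = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# Sending the request (requires a valid key):
#   with urllib.request.urlopen(request) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```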

Getting Started

New to Xerotier.ai? Start here to learn the basics.

API Reference

Complete reference for all Xerotier.ai API endpoints.

Guides

Learn best practices and advanced usage patterns.

Infrastructure

Deploy and manage your own inference infrastructure.