Getting Started

Get up and running with the Xerotier.ai API. Learn how to authenticate, make your first request, and understand the basics of the platform.

Introduction

Xerotier.ai provides a multi-tenant inference platform that allows you to access open-source AI models through a familiar OpenAI-compatible API. Simply change your base URL and API key to start using Xerotier.ai hosted models.

Base URL

All API requests use path-based URLs:

URL
https://api.xerotier.ai/proj_ABC123/{endpointSlug}/v1
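Concretely, the base URL is just your project ID and endpoint slug interpolated into that template. A minimal sketch, using the placeholder values from this guide:

```python
# Build the path-based base URL from a project ID and endpoint slug.
# proj_ABC123 and my-endpoint are the placeholder values used in this doc.
project_id = "proj_ABC123"
endpoint_slug = "my-endpoint"

base_url = f"https://api.xerotier.ai/{project_id}/{endpoint_slug}/v1"
```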

OpenAI SDK Compatibility

Xerotier.ai is fully compatible with the OpenAI Python and Node.js SDKs. Simply configure the base URL and API key:

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key"
)

response = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",
    messages=[{"role": "user", "content": "Hello!"}]
)

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key"
});

const response = await client.chat.completions.create({
  model: "deepseek-r1-distill-llama-70b",
  messages: [{ role: "user", content: "Hello!" }]
});

curl

curl https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1-distill-llama-70b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Quickstart

Get started with Xerotier.ai in three steps:

1. Create an Account

Sign up at xerotier.ai/auth/register to create your free account. No credit card required. You can register with email or, where GitHub OAuth is enabled, sign in with GitHub.

2. Create an Endpoint

An endpoint is a named inference URL bound to a specific model and service tier. Each endpoint gets its own slug (e.g., my-endpoint) that forms part of the API URL. The service tier determines pricing, rate limits, and timeouts.

From your dashboard, browse available models, select a service tier, and click "Create Endpoint" to generate your unique completion URL.

3. Make Your First Request

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key"
)

response = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    max_tokens=100
)

print(response.choices[0].message.content)

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key"
});

const response = await client.chat.completions.create({
  model: "deepseek-r1-distill-llama-70b",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" }
  ],
  max_tokens: 100
});

console.log(response.choices[0].message.content);

curl

curl https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1-distill-llama-70b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "max_tokens": 100
  }'

Your First Streaming Request

Streaming delivers tokens as they are generated, reducing perceived latency. Add "stream": true to your request to enable streaming:

curl

curl https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1-distill-llama-70b",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "stream": true
  }'

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key"
)

stream = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    stream=True
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key"
});

const stream = await client.chat.completions.create({
  model: "deepseek-r1-distill-llama-70b",
  messages: [{ role: "user", content: "What is the capital of France?" }],
  stream: true
});

for await (const chunk of stream) {
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}

The response is delivered as Server-Sent Events (SSE). Each event is a data: line containing a JSON chunk with incremental content. The stream ends with data: [DONE].

Response

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"The"},"index":0}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" capital"},"index":0}]}

data: [DONE]

For detailed streaming documentation including error handling and client examples, see the Streaming API guide.
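If you are not using an SDK, the data: lines can be parsed by hand. A minimal sketch of the chunk-parsing step (network code omitted; sample_lines stands in for the lines of a response body):

```python
import json

def extract_text(sse_lines):
    """Collect incremental content from data: lines, stopping at [DONE]."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # sentinel marking the end of the stream
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)

# Sample events matching the Response shown above.
sample_lines = [
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"The"},"index":0}]}',
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" capital"},"index":0}]}',
    'data: [DONE]',
]
```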

Authentication

The Xerotier.ai API uses API keys for authentication. Include your API key in the Authorization header of all requests.

HTTP Header
Authorization: Bearer xero_myproject_your_api_key

Creating API Keys

Generate API keys from your API Keys page. You can create multiple keys with different scopes and revoke them at any time.

Security Note: Keep your API keys secure. Do not share them in public repositories or client-side code. Use environment variables to store your keys. The full key value is only shown once at creation time.
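For example, export the key in your shell and read it at startup rather than hard-coding it. The variable name XEROTIER_API_KEY below is just a convention, not something the platform requires:

```python
import os

# XEROTIER_API_KEY is a hypothetical variable name; pick any name you like
# and export it in your shell rather than hard-coding the key in source.
api_key = os.environ.get("XEROTIER_API_KEY", "")

# In real code, fail fast when the key is missing:
# if not api_key:
#     raise SystemExit("Set XEROTIER_API_KEY before running")
```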

API Key Format

Xerotier.ai API keys follow this format:

Format
xero_{project_slug}_{random_characters}

Each key is scoped to a specific project identified by the slug in the key prefix.
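Because the project slug is embedded in the key prefix, you can recover it client-side for logging or sanity checks. A sketch, assuming the slug itself contains no underscores:

```python
def project_slug(key: str) -> str:
    """Extract the project slug from a key like xero_{project_slug}_{random}.

    Assumes the slug contains no underscores, so the first two underscores
    delimit the prefix and the slug.
    """
    prefix, slug, _secret = key.split("_", 2)
    if prefix != "xero":
        raise ValueError("not a Xerotier.ai API key")
    return slug
```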

API Key Scopes

When creating an API key, you select one or more scopes that determine which APIs the key can access:

  • inference - Access to the inference API: chat completions, embeddings, reranking, and model listing. Assigned by default.
  • management - Programmatic access to project management operations: key CRUD, agent CRUD, and join-key management. Required by the xeroctl CLI.

Requests to APIs outside a key's granted scopes receive a 403 Forbidden response.
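The 403 surfaces at the HTTP layer, so scope problems are easy to distinguish from bad credentials (401). A minimal stdlib sketch of making that explicit (the OpenAI SDKs raise their own permission-denied exceptions instead):

```python
import json
import urllib.error
import urllib.request

def post_with_scope_check(url: str, api_key: str, payload: dict) -> dict:
    """POST a JSON payload, turning a 403 into a descriptive scope error."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 403:
            raise PermissionError(
                "API key lacks the scope required for this endpoint"
            ) from err
        raise
```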

Important: The full API key value is only returned once at creation time. Store it securely as it cannot be retrieved again.

Next Steps

Now that you have made your first request, explore these features:

  • API Reference - Full parameter reference for chat completions, including tool calling, response formats, and SLO headers.
  • Streaming API - Deep-dive into SSE streaming, chunk format, error handling, and client examples.
  • Service Tiers - Understand pricing, rate limits, timeouts, and how to choose the right tier for your workload.
  • Prefix Caching - Learn how to structure prompts for automatic KV cache reuse and faster time-to-first-token.
  • Error Handling - Error codes, retry policies, and troubleshooting guidance.
  • Authentication - API key scopes, IP filtering, key rotation with grace period, rate limit headers, and security best practices.
  • Usage Guides - Streaming, rate-limit handling, and error-handling code examples in Python and Node.js.
  • xeroctl CLI - Upload models, manage resources, and test endpoints from your terminal.

You can also pass X-SLO-TTFT-Ms and X-SLO-TPOT-Ms request headers to hint at your latency targets. The router uses these to prefer backends that can meet your performance requirements. See the API Reference for details.
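With the OpenAI Python SDK, per-request headers can be supplied via extra_headers. The millisecond targets below are illustrative values, not recommendations:

```python
# SLO hint headers named in the docs above; the targets are examples only.
slo_headers = {
    "X-SLO-TTFT-Ms": "500",  # desired time to first token, in ms
    "X-SLO-TPOT-Ms": "50",   # desired time per output token, in ms
}

# Passed per request with the OpenAI Python SDK:
# client.chat.completions.create(
#     model="deepseek-r1-distill-llama-70b",
#     messages=[{"role": "user", "content": "Hello!"}],
#     extra_headers=slo_headers,
# )
```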