Deploy AI models behind an OpenAI-compatible API, with GPU and CPU support, developer simplicity, and enterprise reliability.
# OpenAI-compatible API - just change your base URL
$ curl https://api.xerotier.ai/your-project/your-endpoint/v1/chat/completions \
    -H "Authorization: Bearer $XEROTIER_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "your-agent",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
# Response
{
  "choices": [{
    "message": {"role": "assistant", "content": "Hello! How can I help?"}
  }]
}
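The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names `build_chat_request`, `extract_reply`, and `chat` are hypothetical, the base URL reuses the `your-project/your-endpoint` placeholders from the curl example, and the response is assumed to have the shape shown above.

```python
import json
import urllib.request

# Placeholder base URL taken from the curl example; substitute your own
# project and endpoint. Pointing a client at this base URL is the
# "change your base URL" step.
BASE_URL = "https://api.xerotier.ai/your-project/your-endpoint/v1"


def build_chat_request(base_url, api_key, model, user_message):
    """Build (but do not send) a POST request to <base_url>/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from a response shaped like the one above."""
    return response_body["choices"][0]["message"]["content"]


def chat(base_url, api_key, model, user_message):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(base_url, api_key, model, user_message)
    with urllib.request.urlopen(request) as response:
        return extract_reply(json.load(response))
```

With `XEROTIER_API_KEY` set in the environment, calling `chat(BASE_URL, os.environ["XEROTIER_API_KEY"], "your-agent", "Hello!")` (after `import os`) would send the same request as the curl command. Keeping request construction separate from sending makes the payload and headers easy to inspect without network access.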
Use our distributed infrastructure or your own Agents
Enterprise-grade infrastructure with developer-friendly simplicity
Upload safetensors models with hardware-isolated sandboxing.
Push a model, get an endpoint. Automatic scaling from zero.
Runtime, data, and network isolation.
NVIDIA and AMD support for full-featured acceleration at scale.
Drop-in replacement. Change one line of code.
Real-time request tracking and performance metrics.
Pay per token for managed inference, or bring your own hardware with no metering.
Perfect for testing and evaluation
ZenDNN-optimized AMD EPYC inference
Best balance of price and performance with NVIDIA CUDA
Cost-effective GPU inference with AMD ROCm
Bring your own accelerators
Get started in minutes with our free tier. No credit card required.
Start Building