Opinionated Defaults
Security on by default. Sane rate limits. Automatic failover. You can override everything, but you should not have to.
The AI inference company that does not hate you. Working software, priced by what you use, documented to help you succeed.
The AI infrastructure industry treats customers like obstacles between product and profit. Pricing designed to confuse. Documentation written to generate support tickets. "Enterprise" features that should be standard. Security treated as an upsell.
Xerotier.ai was started because we were tired of being on the receiving end of this. We wanted inference routing that worked out of the box, priced by what you use, documented like the engineers writing it actually wanted you to succeed.
Xero (zero) + tier = zero lock-in, zero bullshit. Your hardware, your rules.
A traffic cop, a worker, and a control plane, engineered to run across multi-vendor GPU silicon without lock-in.
The traffic cop. Routes requests by tier, handles auth, manages rate limits, streams responses. Written in Swift because performance matters.
Per-tier fair queuing, OpenAI-compatible request shape end to end, streaming with backpressure and timeout-aware cancellation.
The worker. Runs alongside vLLM on your GPU boxes. ZeroMQ transport, automatic model loading, health reporting. Deploy anywhere.
The dashboard. Model management, usage analytics, billing, team access. GDPR-compliant by design, not by afterthought.
NVIDIA, AMD, Intel: we do not care whose silicon you prefer. Shared GPU tiers from free CPU on up. No vendor lock-in.
We make decisions so you do not have to. Security on by default. Sane rate limits. Automatic failover. You can override everything, but you should not have to.
Security on by default. Sane rate limits. Automatic failover. You can override everything, but you should not have to.
We build for the person who gets paged at 3 AM, not the person who demos at conferences. Clear errors. Actionable logs.
If a feature requires a tutorial, we failed. If configuration requires a consultant, we failed. If debugging requires our help, we failed.
No circus. We charge money for our product and we use that money to make the product better. Revolutionary, we know.
Lover of Open Source, hater of Nonsense.
Kevin Carter (Cloudnull) founded Xerotier.ai after spending over a decade building infrastructure at scale - the kind that runs data centers, deploys clouds, and handles the traffic that keeps the internet functioning. OpenStack, bare metal, hypervisors. The unsexy stuff that everything else depends on.
The move to AI infrastructure was not a pivot, it was a continuation. The same problems that plague traditional infrastructure plague AI inference: unfair scheduling, wasted resources, security as an afterthought, complexity as a business model. The same solutions work too: fair queuing, graceful degradation, secure defaults, radical simplicity.
Most infrastructure fails not because the problem was hard, but because someone optimized for the wrong thing. They optimized for personal preference instead of operator experience. For deployment speed instead of recovery speed. For looking clever instead of being reliable. We optimize for systems that work when no one is watching.
Xerotier.ai is built by people who have been on call. Who have debugged production at 3 AM. Who have inherited codebases from people who optimized for conference talks instead of maintainability. We build the tools we wish existed.
We are not trying to build a unicorn. We are trying to build a company that makes useful software, charges fair prices, and treats customers like adults. If that sounds unremarkable, consider how rare it actually is.
Sign up, deploy a shared endpoint, send a request. No card required. No demo to schedule.