Batch API
Process large sets of inference requests asynchronously. Upload a JSONL file, create a batch, and retrieve results when processing completes.
Overview
The Batch API enables asynchronous processing of large request sets at lower priority than synchronous traffic. Batches use idle worker capacity and have a 24-hour completion deadline.
The workflow is:
- Upload -- Upload a JSONL file containing your requests via the Files API.
- Create -- Create a batch referencing the uploaded file.
- Poll -- Check batch status until it reaches a terminal state.
- Download -- Download the output file containing results.
Batch requests use separate rate limit quotas (2x synchronous values) and are billed at the base per-token rate for your endpoint's tier. Batch input and output files are stored using the platform's two-tier storage architecture. For details on storage tiers, encryption, retention, and billing, see Storage.
Supported Endpoints
Batch processing supports the following target endpoints:
| Endpoint | Description |
|---|---|
| /v1/chat/completions | Batch chat completion requests |
| /v1/embeddings | Batch embedding generation |
| /v1/responses | Batch Responses API processing |
All lines within a single batch must target the same endpoint. The target endpoint
is specified both in the batch creation request and in each JSONL line's
url field.
JSONL Format
Each line in the input file must be a valid JSON object with the following fields:
| Field | Type | Description |
|---|---|---|
| custom_id (required) | string | Unique correlation ID. Must be unique within the file. |
| method (required) | string | Must be "POST". |
| url (required) | string | Target endpoint path (e.g., /v1/chat/completions). |
| body (required) | object | Request body matching the target endpoint format. |
Chat Completions Example
{
"custom_id": "req-1",
"method": "POST",
"url": "/v1/chat/completions",
"body": {
"model": "llama-3.1-70b",
"messages": [{"role": "user", "content": "Hello"}],
"max_tokens": 100
}
}
{
"custom_id": "req-2",
"method": "POST",
"url": "/v1/chat/completions",
"body": {
"model": "llama-3.1-70b",
"messages": [{"role": "user", "content": "What is 2+2?"}],
"max_tokens": 100
}
}
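Input files like the one above can be generated and sanity-checked programmatically before upload. A minimal sketch in Python (the filename and prompts are illustrative, not required names):

```python
import json

# Build one JSONL record per request; custom_id must be unique within the file.
records = [
    {
        "custom_id": f"req-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "llama-3.1-70b",
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 100,
        },
    }
    for i, prompt in enumerate(["Hello", "What is 2+2?"], start=1)
]

# Sanity checks before upload: unique IDs and a single target endpoint.
ids = [r["custom_id"] for r in records]
assert len(ids) == len(set(ids)), "duplicate custom_id"
assert len({r["url"] for r in records}) == 1, "all lines must target the same endpoint"

with open("requests.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```

Catching duplicate `custom_id` values locally avoids a `validation_error` after upload.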
Embeddings Example
{
"custom_id": "emb-1",
"method": "POST",
"url": "/v1/embeddings",
"body": {
"model": "bge-large-en-v1.5",
"input": "First document text"
}
}
{
"custom_id": "emb-2",
"method": "POST",
"url": "/v1/embeddings",
"body": {
"model": "bge-large-en-v1.5",
"input": "Second document text"
}
}
Responses API Example
{
"custom_id": "resp-1",
"method": "POST",
"url": "/v1/responses",
"body": {
"model": "llama-3.1-8b",
"input": "Summarize this document."
}
}
{
"custom_id": "resp-2",
"method": "POST",
"url": "/v1/responses",
"body": {
"model": "llama-3.1-8b",
"input": "Translate to French: Hello world"
}
}
Note: Batch Responses API requests do not support
background: true or stream: true. The
store parameter defaults to false for batch responses.
Create Batch
POST /:project_id/v1/batches
| Parameter | Type | Description |
|---|---|---|
| input_file_id (required) | string | ID of the uploaded JSONL file. |
| endpoint (required) | string | Target endpoint: /v1/chat/completions, /v1/embeddings, or /v1/responses. |
| completion_window (required) | string | Must be "24h". |
| metadata (optional) | object | Custom key-value pairs (max 16 pairs). |
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/batches \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123def456",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"metadata": {"description": "nightly evaluation"}
}'
Response
{
"id": "batch_abc123def456",
"object": "batch",
"endpoint": "/v1/chat/completions",
"input_file_id": "file-abc123def456",
"output_file_id": null,
"error_file_id": null,
"status": "validating",
"completion_window": "24h",
"created_at": 1709000000,
"expires_at": 1709086400,
"completed_at": null,
"cancelled_at": null,
"metadata": {"description": "nightly evaluation"},
"request_counts": {
"total": 0,
"completed": 0,
"failed": 0
}
}
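The timestamp fields (created_at, expires_at, completed_at, cancelled_at) are Unix epoch seconds, and expires_at is created_at plus the 24-hour completion window. A short sketch checking this against the response above (the truncated JSON here keeps only the relevant fields):

```python
import json
from datetime import datetime, timezone

# Subset of the batch creation response shown above.
batch = json.loads("""{
  "id": "batch_abc123def456",
  "status": "validating",
  "created_at": 1709000000,
  "expires_at": 1709086400
}""")

# expires_at - created_at equals the 24-hour window in seconds.
window_seconds = batch["expires_at"] - batch["created_at"]
created = datetime.fromtimestamp(batch["created_at"], tz=timezone.utc)
print(window_seconds)  # 86400
```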
List Batches
GET /:project_id/v1/batches
curl "https://api.xerotier.ai/proj_ABC123/v1/batches?limit=20" \
  -H "Authorization: Bearer xero_myproject_your_api_key"
Get Batch
GET /:project_id/v1/batches/{batch_id}
curl https://api.xerotier.ai/proj_ABC123/v1/batches/batch_abc123def456 \
-H "Authorization: Bearer xero_myproject_your_api_key"
Cancel Batch
POST /:project_id/v1/batches/{batch_id}/cancel
Cancellation stops dispatching new requests. In-flight requests complete normally.
The batch transitions to cancelled once all in-flight requests finish.
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/batches/batch_abc123def456/cancel \
-H "Authorization: Bearer xero_myproject_your_api_key"
Output Format
When a batch completes, the output_file_id references a JSONL file
containing results. Each line corresponds to a request:
Successful Request
{
"id": "batch_req_abc123",
"custom_id": "req-1",
"response": {
"status_code": 200,
"request_id": "req-xyz",
"body": {
"id": "chatcmpl-abc",
"object": "chat.completion",
"choices": [...],
"usage": {...}
}
},
"error": null
}
Failed Request
Failed requests appear in the error_file_id JSONL file:
{
"id": "batch_req_def456",
"custom_id": "req-2",
"response": null,
"error": {
"code": "server_error",
"message": "Model unavailable"
}
}
Download the output file using the Files API:
curl https://api.xerotier.ai/proj_ABC123/v1/files/file-output789/content \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-o results.jsonl
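Because results may arrive in any order, correlate them back to your inputs via custom_id. A minimal sketch that indexes both output and error lines (the helper name and example lines are illustrative, using the record shapes shown above):

```python
import json

def index_results(output_lines, error_lines=()):
    """Map custom_id -> response body for successes; collect errors separately."""
    results, errors = {}, {}
    for raw in output_lines:
        rec = json.loads(raw)
        results[rec["custom_id"]] = rec["response"]["body"]
    for raw in error_lines:
        rec = json.loads(raw)
        errors[rec["custom_id"]] = rec["error"]
    return results, errors

# Example lines matching the output format documented above.
ok_line = ('{"id": "batch_req_abc123", "custom_id": "req-1", '
           '"response": {"status_code": 200, "body": {"id": "chatcmpl-abc"}}, '
           '"error": null}')
err_line = ('{"id": "batch_req_def456", "custom_id": "req-2", '
            '"response": null, '
            '"error": {"code": "server_error", "message": "Model unavailable"}}')
results, errors = index_results([ok_line], [err_line])
```

In practice you would pass each downloaded file's lines (e.g. `open("results.jsonl")`) rather than inline strings.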
Status Lifecycle
| Status | Description |
|---|---|
| validating | Input file is being parsed and validated. |
| in_progress | Requests are being processed. |
| finalizing | Building output and error files. |
| completed | All requests processed. Output files available. |
| failed | Unrecoverable error during processing. |
| expired | 24-hour deadline exceeded. |
| cancelling | Cancel requested, draining in-flight requests. |
| cancelled | Cancellation complete. |
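When polling, stop only at one of the four terminal states from the table above; validating, in_progress, finalizing, and cancelling are all transient. A small helper sketch:

```python
# Terminal states from the lifecycle table; a batch never leaves these.
TERMINAL_STATES = {"completed", "failed", "expired", "cancelled"}

def is_terminal(status: str) -> bool:
    """True if polling can stop for a batch in this state."""
    return status in TERMINAL_STATES
```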
CLI Commands
The xeroctl CLI provides batch management commands:
# Upload input file
xeroctl files upload --purpose batch requests.jsonl
# Create batch
xeroctl batches create \
--input-file file-abc123def456 \
--endpoint /v1/chat/completions \
--endpoint-slug my-endpoint
# Check batch status
xeroctl batches get batch_abc123def456 --endpoint-slug my-endpoint
# List all batches
xeroctl batches list --endpoint-slug my-endpoint
# Cancel a batch
xeroctl batches cancel batch_abc123def456 --endpoint-slug my-endpoint
# Download output when complete
xeroctl files download file-output789 -o results.jsonl
Client Examples
Python (requests)
import requests
headers = {
"Authorization": "Bearer xero_myproject_your_api_key",
"Content-Type": "application/json"
}
base = "https://api.xerotier.ai/proj_ABC123/v1"
# Step 1: Upload a JSONL input file
with open("requests.jsonl", "rb") as f:
upload = requests.post(
f"{base}/files",
headers={"Authorization": headers["Authorization"]},
files={"file": ("requests.jsonl", f)},
data={"purpose": "batch"}
).json()
print(f"Uploaded file: {upload['id']}")
# Step 2: Create a batch
batch = requests.post(f"{base}/batches", headers=headers, json={
"input_file_id": upload["id"],
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"metadata": {"description": "nightly evaluation"}
}).json()
print(f"Batch created: {batch['id']} (status: {batch['status']})")
# Step 3: Poll for completion
import time
while True:
status = requests.get(
f"{base}/batches/{batch['id']}",
headers=headers
).json()
print(f"Status: {status['status']} ({status['request_counts']})")
if status["status"] in ("completed", "failed", "expired", "cancelled"):
break
time.sleep(10)
# Step 4: Download results
if status.get("output_file_id"):
output = requests.get(
f"{base}/files/{status['output_file_id']}/content",
headers={"Authorization": headers["Authorization"]}
)
with open("results.jsonl", "wb") as f:
f.write(output.content)
print("Results saved to results.jsonl")
# List batches
batches = requests.get(f"{base}/batches?limit=20", headers=headers).json()
for b in batches["data"]:
print(f"{b['id']} - {b['status']}")
# Cancel a batch
requests.post(f"{base}/batches/{batch['id']}/cancel", headers=headers)
Node.js (fetch)
import { readFile, writeFile } from "node:fs/promises";
const base = "https://api.xerotier.ai/proj_ABC123/v1";
const headers = {
"Authorization": "Bearer xero_myproject_your_api_key",
"Content-Type": "application/json"
};
// Step 1: Upload a JSONL input file
const fileData = await readFile("requests.jsonl");
const formData = new FormData();
formData.append("file", new Blob([fileData]), "requests.jsonl");
formData.append("purpose", "batch");
const uploadRes = await fetch(`${base}/files`, {
method: "POST",
headers: { "Authorization": headers.Authorization },
body: formData
});
const upload = await uploadRes.json();
console.log(`Uploaded file: ${upload.id}`);
// Step 2: Create a batch
const batchRes = await fetch(`${base}/batches`, {
method: "POST",
headers,
body: JSON.stringify({
input_file_id: upload.id,
endpoint: "/v1/chat/completions",
completion_window: "24h",
metadata: { description: "nightly evaluation" }
})
});
const batch = await batchRes.json();
console.log(`Batch created: ${batch.id} (${batch.status})`);
// Step 3: Poll for completion
let status;
while (true) {
const res = await fetch(`${base}/batches/${batch.id}`, { headers });
status = await res.json();
console.log(`Status: ${status.status}`);
if (["completed", "failed", "expired", "cancelled"].includes(status.status)) break;
await new Promise(r => setTimeout(r, 10000));
}
// Step 4: Download results
if (status.output_file_id) {
const outputRes = await fetch(
`${base}/files/${status.output_file_id}/content`,
{ headers: { "Authorization": headers.Authorization } }
);
const outputData = await outputRes.arrayBuffer();
await writeFile("results.jsonl", Buffer.from(outputData));
console.log("Results saved to results.jsonl");
}
Error Handling
| HTTP Status | Error Code | Description |
|---|---|---|
| 400 | invalid_request | Malformed request or invalid parameters. |
| 400 | validation_error | JSONL validation failed (malformed lines, duplicate custom_ids, etc.). |
| 401 | authentication_error | Invalid or missing API key. |
| 404 | not_found | Batch or input file not found. |
| 429 | rate_limit_exceeded | Too many requests. Check the Retry-After header. |
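On 429 responses, clients should back off for the number of seconds given in the Retry-After header before retrying. A minimal retry sketch (the `send` callable and its `(status, headers, body)` return shape are assumptions for illustration, not part of this API):

```python
import time

def with_retry(send, max_attempts=5):
    """Call send() until it stops returning 429, honoring Retry-After.

    send is assumed to return (status_code, headers_dict, body).
    Falls back to exponential backoff when Retry-After is absent.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        delay = float(headers.get("Retry-After", 2 ** attempt))
        time.sleep(delay)
    return status, body
```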