Batch API
Asynchronous inference on idle worker capacity, with a 24-hour completion deadline. Upload a JSONL file, create a batch, poll, download. Billed per token at the same rate as synchronous traffic.
- Base path: /:project_id/v1/batches
- Content-Type: application/json
- Window: "24h" (string)
- Max input: 100 MB JSONL
The workflow is:
- Upload: Upload a JSONL file containing your requests via the Files API.
- Create: Create a batch referencing the uploaded file.
- Poll: Check batch status until it reaches a terminal state.
- Download: Download the output file containing results.
Batch requests share the same API rate limit quotas as synchronous traffic and are billed at the base per-token rate for your endpoint's tier. Unlike some providers, no batch-tier pricing discount is applied. Batch input and output files are stored using the platform's two-tier storage architecture. For details on storage tiers, encryption, retention, and billing, see Storage.
Input file size limit: Batch input files must not exceed
100 MB. Larger files may upload successfully through the Files API (which
permits up to 500 MB) but will fail batch validation with
status: "failed".
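Because an oversized file is only rejected at batch validation, it can be cheaper to check the size before uploading. The following is a minimal sketch of such a check. The helper name `check_batch_input_size` is hypothetical, and the sketch assumes "100 MB" means 100,000,000 bytes, which is the more conservative reading since the page does not say whether the limit is decimal or binary.

```python
import os

# Assumption: "100 MB" is read as 100 * 1000 * 1000 bytes (the stricter interpretation).
MAX_BATCH_INPUT_BYTES = 100 * 1000 * 1000


def check_batch_input_size(path: str, limit: int = MAX_BATCH_INPUT_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            f"{path} is {size} bytes, which exceeds the batch input limit of {limit} bytes"
        )
    return size
```

Calling `check_batch_input_size("requests.jsonl")` before the Files API upload surfaces the problem locally instead of as a failed batch.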
Supported Endpoints
Batch processing supports the following target endpoints:
| Endpoint | Description |
|---|---|
| /v1/chat/completions | Batch chat completion requests |
| /v1/embeddings | Batch embedding generation |
| /v1/responses | Batch Responses API processing |
All lines within a single batch must target the same endpoint. The target endpoint
is specified both in the batch creation request and in each JSONL line's
url field.
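The single-endpoint rule can be checked locally before a file is uploaded. This sketch assumes every non-blank line is already a well-formed JSON object. The function name `find_endpoint_mismatches` is hypothetical and is not part of any SDK.

```python
import json

SUPPORTED_ENDPOINTS = {"/v1/chat/completions", "/v1/embeddings", "/v1/responses"}


def find_endpoint_mismatches(jsonl_text: str, endpoint: str) -> list:
    """Return 1-based line numbers whose "url" differs from the batch endpoint."""
    if endpoint not in SUPPORTED_ENDPOINTS:
        raise ValueError(f"unsupported batch endpoint: {endpoint}")
    mismatches = []
    for line_no, line in enumerate(jsonl_text.split("\n"), start=1):
        if not line.strip():
            continue  # blank lines are skipped here; the server may treat them differently
        if json.loads(line).get("url") != endpoint:
            mismatches.append(line_no)
    return mismatches
```

An empty result means every line targets the endpoint you plan to pass in the batch creation request.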
JSONL Format
Each line in the input file must be a valid JSON object with the following fields:
| Field | Type | Description |
|---|---|---|
| custom_id (required) | string | Unique correlation ID. Must be unique within the file. |
| method (required) | string | Must be "POST". |
| url (required) | string | Target endpoint path (e.g., /v1/chat/completions). |
| body (required) | object | Request body matching the target endpoint format. |
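One way to produce a file in this shape is to serialize each request with `json.dumps`, which keeps every object on a single line. The sketch below is illustrative only: `make_batch_line` and `write_batch_file` are hypothetical helper names, and the `req-N` ID scheme is an assumption chosen to guarantee uniqueness within the file.

```python
import json


def make_batch_line(custom_id: str, url: str, body: dict) -> str:
    """Serialize one batch request as a single JSONL line (no trailing newline)."""
    record = {"custom_id": custom_id, "method": "POST", "url": url, "body": body}
    # json.dumps without indent escapes newlines inside strings, so the record stays on one line.
    return json.dumps(record)


def write_batch_file(path: str, prompts: list, model: str) -> int:
    """Write one chat completion request per prompt; return the number of lines written."""
    with open(path, "w", encoding="utf-8") as f:
        for i, prompt in enumerate(prompts, start=1):
            body = {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "max_tokens": 100,
            }
            # Sequential IDs keep custom_id unique within the file.
            f.write(make_batch_line(f"req-{i}", "/v1/chat/completions", body) + "\n")
    return len(prompts)
```

For example, `write_batch_file("requests.jsonl", ["Hello", "What is 2+2?"], "llama-3.1-70b")` writes two request lines.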
Chat Completions Example
{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "llama-3.1-70b", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}}
{"custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "llama-3.1-70b", "messages": [{"role": "user", "content": "What is 2+2?"}], "max_tokens": 100}}
Embeddings Example
{"custom_id": "emb-1", "method": "POST", "url": "/v1/embeddings", "body": {"model": "bge-large-en-v1.5", "input": "First document text"}}
{"custom_id": "emb-2", "method": "POST", "url": "/v1/embeddings", "body": {"model": "bge-large-en-v1.5", "input": "Second document text"}}
Responses API Example
{"custom_id": "resp-1", "method": "POST", "url": "/v1/responses", "body": {"model": "llama-3.1-8b", "input": "Summarize this document."}}
{"custom_id": "resp-2", "method": "POST", "url": "/v1/responses", "body": {"model": "llama-3.1-8b", "input": "Translate to French: Hello world"}}
Note: Batch Responses API requests do not support
background: true or stream: true. The
store parameter defaults to false for batch responses.
Line-level errors: Malformed JSONL lines and duplicate
custom_id values do not cause POST /v1/batches to
return 400. The create call succeeds, then the batch advances through
validating and terminates in failed with the
per-line problems surfaced on the batch's top-level errors
field (or, for partial failures during processing, in the
error_file_id JSONL).
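Since these problems only surface after the batch has been created, a local pre-check can save a round trip. The following sketch flags malformed lines, missing required fields, and duplicate `custom_id` values. It approximates the checks described on this page and may not match the server's validation exactly. `prevalidate_jsonl` is a hypothetical name.

```python
import json

REQUIRED_FIELDS = ("custom_id", "method", "url", "body")


def prevalidate_jsonl(jsonl_text: str) -> list:
    """Return (line_number, problem) pairs for lines likely to fail batch validation."""
    problems = []
    seen_ids = {}  # custom_id -> first line number where it appeared
    for line_no, line in enumerate(jsonl_text.split("\n"), start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((line_no, f"malformed JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((line_no, "line is not a JSON object"))
            continue
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            problems.append((line_no, f"missing fields: {', '.join(missing)}"))
        custom_id = record.get("custom_id")
        if isinstance(custom_id, str):
            if custom_id in seen_ids:
                problems.append(
                    (line_no, f"duplicate custom_id (first seen on line {seen_ids[custom_id]})")
                )
            else:
                seen_ids[custom_id] = line_no
    return problems
```

An empty list suggests the file is structurally sound. The batch can still fail for reasons this check cannot see, such as an unavailable model.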
Create Batch
POST /:project_id/v1/batches
| Parameter | Type | Description |
|---|---|---|
| input_file_id (required) | string | ID of the uploaded JSONL file. |
| endpoint (required) | string | Target endpoint: /v1/chat/completions, /v1/embeddings, or /v1/responses. |
| completion_window (required) | string | Must be the string "24h", not the integer 24. |
| metadata (optional) | object | Custom key-value pairs. The serialized JSON object must not exceed 16 KB. |
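The parameter constraints in this table can be enforced client-side before the request is sent. This is a rough sketch: `build_create_batch_payload` is a hypothetical helper, "16 KB" is assumed to mean 16,384 bytes, and the server may serialize metadata differently from `json.dumps`, so treat the size check as approximate.

```python
import json
from typing import Optional

# Assumption: "16 KB" means 16 * 1024 bytes of UTF-8 encoded JSON.
MAX_METADATA_BYTES = 16 * 1024


def build_create_batch_payload(
    input_file_id: str, endpoint: str, metadata: Optional[dict] = None
) -> dict:
    """Build the JSON body for POST /:project_id/v1/batches."""
    payload = {
        "input_file_id": input_file_id,
        "endpoint": endpoint,
        "completion_window": "24h",  # must be the string "24h", not the integer 24
    }
    if metadata is not None:
        size = len(json.dumps(metadata).encode("utf-8"))
        if size > MAX_METADATA_BYTES:
            raise ValueError(f"metadata is {size} bytes; limit is {MAX_METADATA_BYTES}")
        payload["metadata"] = metadata
    return payload
```

The returned dictionary can be passed as the `json=` argument of an HTTP client call.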
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/batches \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123def456",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"metadata": {"description": "nightly evaluation"}
}'
Response
{
"id": "batch_abc123def456",
"object": "batch",
"endpoint": "/v1/chat/completions",
"input_file_id": "file-abc123def456",
"output_file_id": null,
"error_file_id": null,
"status": "validating",
"completion_window": "24h",
"created_at": 1709000000,
"in_progress_at": null,
"expires_at": 1709086400,
"finalizing_at": null,
"completed_at": null,
"failed_at": null,
"expired_at": null,
"cancelling_at": null,
"cancelled_at": null,
"metadata": {"description": "nightly evaluation"},
"request_counts": {
"total": 0,
"completed": 0,
"failed": 0
},
"errors": null,
"usage": null
}
Response Fields
| Field | Type | Description |
|---|---|---|
| id | string | External batch identifier (e.g., batch_abc123). |
| status | string | Current batch status. See Status Lifecycle. |
| input_file_id | string | External ID of the input JSONL file. |
| output_file_id | string | null | External ID of the output JSONL file. Set once the output upload completes, typically while the batch is in the finalizing state and always by the time it reaches completed. |
| error_file_id | string | null | External ID of the error JSONL file. Set if any requests failed. |
| in_progress_at | integer | null | Unix timestamp when processing started. |
| expires_at | integer | null | Unix timestamp when the 24h completion window expires. |
| finalizing_at | integer | null | Unix timestamp when the batch entered the finalizing state. |
| completed_at | integer | null | Unix timestamp when the batch completed successfully. |
| failed_at | integer | null | Unix timestamp when the batch failed. |
| expired_at | integer | null | Unix timestamp when the batch actually expired (distinct from expires_at). |
| cancelling_at | integer | null | Unix timestamp when cancellation was requested. |
| cancelled_at | integer | null | Unix timestamp when cancellation completed. |
| request_counts | object | Processing counts: total, completed, failed. |
| errors | object | null | Top-level error information if the batch itself failed. Contains object: "list" and data array of error records. |
| usage | object | null | Aggregate token usage when requests have completed: total_tokens, prompt_tokens, completion_tokens. |
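The timestamp and count fields are enough to derive a simple progress summary from a batch object. The sketch below assumes that `completed` and `failed` are disjoint counts that together approach `total`. The page does not state this explicitly. `summarize_batch` is a hypothetical helper.

```python
import time
from typing import Optional


def summarize_batch(batch: dict, now: Optional[float] = None) -> dict:
    """Derive progress and time left before the completion deadline from a batch object."""
    counts = batch.get("request_counts") or {}
    total = counts.get("total", 0)
    # Assumption: completed and failed do not overlap.
    done = counts.get("completed", 0) + counts.get("failed", 0)
    expires_at = batch.get("expires_at")
    now = time.time() if now is None else now
    return {
        "status": batch.get("status"),
        "fraction_done": (done / total) if total else 0.0,
        "seconds_until_deadline": (
            None if expires_at is None else max(0, int(expires_at - now))
        ),
    }
```

Passing `now` explicitly makes the helper deterministic in tests; in normal use it defaults to the current time.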
List Batches
GET /:project_id/v1/batches
curl "https://api.xerotier.ai/proj_ABC123/v1/batches?limit=20" \
-H "Authorization: Bearer xero_myproject_your_api_key"
Get Batch
GET /:project_id/v1/batches/{batch_id}
curl https://api.xerotier.ai/proj_ABC123/v1/batches/batch_abc123def456 \
-H "Authorization: Bearer xero_myproject_your_api_key"
Cancel Batch
POST /:project_id/v1/batches/{batch_id}/cancel
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/batches/batch_abc123def456/cancel \
-H "Authorization: Bearer xero_myproject_your_api_key"
Cancellation stops dispatching new requests. In-flight requests complete normally.
The batch transitions to cancelled once all in-flight requests finish.
Output Format
When a batch completes, the output_file_id references a JSONL file
containing results. Each line corresponds to a request:
Successful Request
{
"id": "batch_req_abc123",
"custom_id": "req-1",
"response": {
"status_code": 200,
"request_id": "req-xyz",
"body": {
"id": "chatcmpl-abc",
"object": "chat.completion",
"choices": [...],
"usage": {...}
}
},
"error": null
}
Failed Request
Failed requests appear in the error_file_id JSONL file:
{
"id": "batch_req_def456",
"custom_id": "req-2",
"response": null,
"error": {
"code": "model_not_found",
"message": "Requested model is not available on this endpoint",
"param": "model",
"line": 42
}
}
| Field | Type | Description |
|---|---|---|
| code | string | Canonical machine-readable code. See Error Handling. |
| message | string | Human-readable explanation of the failure. |
| param (optional) | string | Offending request field when applicable. |
| line (optional) | integer | 1-based line number in the original input JSONL. |
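When an error file contains many entries, grouping them by `code` helps separate systematic problems from one-off failures. A minimal sketch, assuming the error file has already been downloaded and read into a string (`group_failures_by_code` is a hypothetical helper):

```python
import json
from collections import defaultdict


def group_failures_by_code(error_jsonl: str) -> dict:
    """Map each error code to a list of (custom_id, input_line) pairs."""
    grouped = defaultdict(list)
    for raw in error_jsonl.split("\n"):
        if not raw.strip():
            continue
        record = json.loads(raw)
        error = record.get("error") or {}
        # "line" is optional, so it may be None for some entries.
        grouped[error.get("code", "unknown")].append(
            (record.get("custom_id"), error.get("line"))
        )
    return dict(grouped)
```

A code that covers most of the file usually points at a problem shared by many requests, such as the model name.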
Download the output file using the Files API (see Download File Content):
curl https://api.xerotier.ai/proj_ABC123/v1/files/file-output789/content \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-o results.jsonl
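After downloading, results are typically joined back to the original requests. This page does not say whether output lines preserve input order, so the sketch below matches on `custom_id` rather than position. `index_results` is a hypothetical helper.

```python
import json


def index_results(output_jsonl: str) -> dict:
    """Map each custom_id to its response body (None when the line carries no response)."""
    results = {}
    for line in output_jsonl.split("\n"):
        if not line.strip():
            continue
        record = json.loads(line)
        response = record.get("response") or {}
        results[record["custom_id"]] = response.get("body")
    return results
```

For a chat completion batch, `index_results(text)["req-1"]` would then hold the chat completion object returned for the request with `custom_id` `req-1`.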
Status Lifecycle
| Status | Description |
|---|---|
| validating | Input file is being parsed and validated. |
| in_progress | Requests are being processed. |
| finalizing | Building output and error files. |
| completed | All requests processed. Output files available. |
| failed | Unrecoverable error during processing. |
| expired | 24-hour deadline exceeded. |
| cancelling | Cancel requested, draining in-flight requests. |
| cancelled | Cancellation complete. |
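Of these statuses, `completed`, `failed`, `expired`, and `cancelled` are the ones a poller should stop on. The sketch below polls with exponential backoff instead of a fixed interval. The backoff schedule is an illustrative choice, not a documented recommendation, and `fetch_batch` stands in for whatever function performs the GET request and returns the parsed batch object.

```python
import time
from typing import Callable

TERMINAL_STATUSES = frozenset({"completed", "failed", "expired", "cancelled"})


def wait_for_batch(
    fetch_batch: Callable[[], dict],
    initial_delay: float = 5.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the batch reaches a terminal status, doubling the delay up to max_delay."""
    delay = initial_delay
    while True:
        batch = fetch_batch()
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        sleep(delay)
        delay = min(delay * 2, max_delay)
```

Injecting `sleep` keeps the helper testable without real waiting. Because batches also consume API rate limit quota, a growing delay reduces the number of status calls for long-running jobs.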
CLI Commands
The xeroctl CLI provides batch management commands. The full flag surface lives in xeroctl batches.
# Upload input file
xeroctl files --upload requests.jsonl --purpose batch
# Submit a batch against an endpoint slug
xeroctl batches --endpoint my-endpoint --input requests.jsonl
# Check batch status
xeroctl batches --endpoint my-endpoint --get batch_abc123def456
# List all batches for an endpoint
xeroctl batches --endpoint my-endpoint --list
# Cancel a batch
xeroctl batches --endpoint my-endpoint --cancel batch_abc123def456
# Download output when complete
xeroctl files --download file-output789 --output results.jsonl
CLI endpoint coverage: The xeroctl batches
command currently submits batches against
/v1/chat/completions only. The HTTP API supports
/v1/embeddings and /v1/responses as batch targets
as well (see Supported Endpoints); to
submit those today, call the REST API directly using the curl or client
examples on this page. Confirm the exact flag spelling for your build with
xeroctl batches --help and xeroctl files --help.
Client Examples
Python (requests)
import time

import requests

headers = {
    "Authorization": "Bearer xero_myproject_your_api_key",
    "Content-Type": "application/json"
}
base = "https://api.xerotier.ai/proj_ABC123/v1"

# Step 1: Upload a JSONL input file
# (requests sets the multipart Content-Type itself, so only the auth header is passed)
with open("requests.jsonl", "rb") as f:
    upload = requests.post(
        f"{base}/files",
        headers={"Authorization": headers["Authorization"]},
        files={"file": ("requests.jsonl", f)},
        data={"purpose": "batch"}
    ).json()
print(f"Uploaded file: {upload['id']}")

# Step 2: Create a batch
batch = requests.post(f"{base}/batches", headers=headers, json={
    "input_file_id": upload["id"],
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {"description": "nightly evaluation"}
}).json()
print(f"Batch created: {batch['id']} (status: {batch['status']})")

# Step 3: Poll until the batch reaches a terminal state
while True:
    status = requests.get(
        f"{base}/batches/{batch['id']}",
        headers=headers
    ).json()
    print(f"Status: {status['status']} ({status['request_counts']})")
    if status["status"] in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(10)

# Step 4: Download results
if status.get("output_file_id"):
    output = requests.get(
        f"{base}/files/{status['output_file_id']}/content",
        headers={"Authorization": headers["Authorization"]}
    )
    with open("results.jsonl", "wb") as f:
        f.write(output.content)
    print("Results saved to results.jsonl")

# List batches
batches = requests.get(f"{base}/batches?limit=20", headers=headers).json()
for b in batches["data"]:
    print(f"{b['id']} - {b['status']}")

# Cancel a batch
requests.post(f"{base}/batches/{batch['id']}/cancel", headers=headers)
Node.js (fetch)
import { readFile, writeFile } from "node:fs/promises";

const base = "https://api.xerotier.ai/proj_ABC123/v1";
const headers = {
  "Authorization": "Bearer xero_myproject_your_api_key",
  "Content-Type": "application/json"
};

// Step 1: Upload a JSONL input file
const fileData = await readFile("requests.jsonl");
const formData = new FormData();
formData.append("file", new Blob([fileData]), "requests.jsonl");
formData.append("purpose", "batch");
const uploadRes = await fetch(`${base}/files`, {
  method: "POST",
  headers: { "Authorization": headers.Authorization },
  body: formData
});
const upload = await uploadRes.json();
console.log(`Uploaded file: ${upload.id}`);

// Step 2: Create a batch
const batchRes = await fetch(`${base}/batches`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    input_file_id: upload.id,
    endpoint: "/v1/chat/completions",
    completion_window: "24h",
    metadata: { description: "nightly evaluation" }
  })
});
const batch = await batchRes.json();
console.log(`Batch created: ${batch.id} (${batch.status})`);

// Step 3: Poll for completion
let status;
while (true) {
  const res = await fetch(`${base}/batches/${batch.id}`, { headers });
  status = await res.json();
  console.log(`Status: ${status.status}`);
  if (["completed", "failed", "expired", "cancelled"].includes(status.status)) break;
  await new Promise(r => setTimeout(r, 10000));
}

// Step 4: Download results
if (status.output_file_id) {
  const outputRes = await fetch(
    `${base}/files/${status.output_file_id}/content`,
    { headers: { "Authorization": headers.Authorization } }
  );
  const outputData = await outputRes.arrayBuffer();
  await writeFile("results.jsonl", Buffer.from(outputData));
  console.log("Results saved to results.jsonl");
}
Error Handling
| HTTP Status | Error Code | Description |
|---|---|---|
| 400 | invalid_request | Malformed request or invalid parameters. |
| 400 | validation_error | JSONL validation failed (malformed lines, duplicate custom_ids, etc.). |
| 401 | authentication_error | Invalid or missing API key. |
| 404 | not_found | Batch or input file not found. |
| 429 | rate_limit_exceeded | Too many requests. Check the Retry-After header. |
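A 429 response is usually worth retrying after the interval given in Retry-After. The sketch below is one possible retry wrapper, written against an injected `send` callable so it does not depend on a particular HTTP client. It handles only the delay-in-seconds form of Retry-After and falls back to a default for anything else (including the HTTP-date form). The names `call_with_retry` and `parse_retry_after` are hypothetical.

```python
import time
from typing import Callable, Mapping, Optional, Tuple


def parse_retry_after(value: Optional[str], default: float) -> float:
    """Return the Retry-After delay in seconds, or the default if absent or non-numeric."""
    if value is not None and value.strip().isdigit():
        return float(value.strip())
    return default


def call_with_retry(
    send: Callable[[], Tuple[int, Mapping[str, str], dict]],
    max_attempts: int = 5,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            break
        sleep(parse_retry_after(headers.get("Retry-After"), default_delay))
    return status, body
```

Here `send` is expected to return a `(status_code, headers, parsed_body)` tuple. With the `requests` library, for instance, it could wrap a call and return `(r.status_code, r.headers, r.json())`. Other 4xx codes in the table are returned immediately, since retrying them unchanged is unlikely to help.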