// API Reference

Batch API

Asynchronous inference on idle worker capacity, with a 24-hour completion deadline. Upload a JSONL file, create a batch, poll, and download the results. Batches are billed per token at the same rate as synchronous traffic.

Base path: /:project_id/v1/batches
Content-Type: application/json
Window: "24h" (string)
Max input: 100 MB JSONL

The workflow is:

  • Upload: Upload a JSONL file containing your requests via the Files API.
  • Create: Create a batch referencing the uploaded file.
  • Poll: Check batch status until it reaches a terminal state.
  • Download: Download the output file containing results.

Batch requests share the same API rate limit quotas as synchronous traffic and are billed at the base per-token rate for your endpoint's tier. Unlike some providers, no batch-tier pricing discount is applied. Batch input and output files are stored using the platform's two-tier storage architecture. For details on storage tiers, encryption, retention, and billing, see Storage.

Input file size limit: Batch input files must not exceed 100 MB. Larger files may upload successfully through the Files API (which permits up to 500 MB) but will fail batch validation with status: "failed".
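
Because an oversized file only fails at batch validation, a local size check before uploading can save a round trip. The following is a minimal sketch: the helper name is hypothetical, and since this page does not say whether "100 MB" is decimal or binary, it assumes the stricter decimal reading (100,000,000 bytes).

```python
import os

# Assumption: "100 MB" is read conservatively as 100,000,000 bytes.
MAX_BATCH_INPUT_BYTES = 100_000_000


def batch_input_size_ok(path: str, limit: int = MAX_BATCH_INPUT_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

If the check fails, splitting the requests across several input files (one batch per file) keeps each file under the limit.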

Supported Endpoints

Batch processing supports the following target endpoints:

Endpoint Description
/v1/chat/completions Batch chat completion requests
/v1/embeddings Batch embedding generation
/v1/responses Batch Responses API processing

All lines within a single batch must target the same endpoint. The target endpoint is specified both in the batch creation request and in each JSONL line's url field.

JSONL Format

Each line in the input file must be a valid JSON object with the following fields:

Field                 Type    Description
custom_id (required)  string  Unique correlation ID. Must be unique within the file.
method (required)     string  Must be "POST".
url (required)        string  Target endpoint path (e.g., /v1/chat/completions).
body (required)       object  Request body matching the target endpoint format.
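
One way to produce lines in this shape is to serialize each request with a JSON library rather than by hand, which guarantees one object per line. This is a sketch only: the helper names and the req-N numbering scheme are illustrative choices, not part of the API.

```python
import json


def make_batch_line(custom_id: str, url: str, body: dict) -> str:
    """Serialize one batch request as a single JSONL line."""
    # json.dumps escapes newlines inside strings, so the result is one line.
    return json.dumps(
        {"custom_id": custom_id, "method": "POST", "url": url, "body": body}
    )


def write_batch_file(path: str, url: str, bodies: list) -> None:
    """Write one request per line, with custom_ids req-1, req-2, ..."""
    with open(path, "w", encoding="utf-8") as f:
        for i, body in enumerate(bodies, start=1):
            f.write(make_batch_line(f"req-{i}", url, body) + "\n")
```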

Chat Completions Example

JSONL
{ "custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": { "model": "llama-3.1-70b", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100 } }
{ "custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": { "model": "llama-3.1-70b", "messages": [{"role": "user", "content": "What is 2+2?"}], "max_tokens": 100 } }

Embeddings Example

JSONL
{ "custom_id": "emb-1", "method": "POST", "url": "/v1/embeddings", "body": { "model": "bge-large-en-v1.5", "input": "First document text" } }
{ "custom_id": "emb-2", "method": "POST", "url": "/v1/embeddings", "body": { "model": "bge-large-en-v1.5", "input": "Second document text" } }

Responses API Example

JSONL
{ "custom_id": "resp-1", "method": "POST", "url": "/v1/responses", "body": { "model": "llama-3.1-8b", "input": "Summarize this document." } }
{ "custom_id": "resp-2", "method": "POST", "url": "/v1/responses", "body": { "model": "llama-3.1-8b", "input": "Translate to French: Hello world" } }

Note: Batch Responses API requests do not support background: true or stream: true. The store parameter defaults to false for batch responses.

Line-level errors: Malformed JSONL lines and duplicate custom_id values do not cause POST /v1/batches to return 400. The create call succeeds, then the batch advances through validating and terminates in failed with the per-line problems surfaced on the batch's top-level errors field (or, for partial failures during processing, in the error_file_id JSONL).
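
Since these problems surface only after the batch has been created, it can be worth checking the file locally first. The sketch below covers the rules stated on this page (valid JSON object, required fields, "POST" method, a single target endpoint, unique custom_id). The function name is hypothetical, and the server may enforce additional rules that are not checked here.

```python
import json

REQUIRED_FIELDS = ("custom_id", "method", "url", "body")


def validate_batch_lines(lines, endpoint):
    """Return (line_number, problem) tuples; an empty list means no problems found."""
    problems = []
    seen_ids = set()
    for number, raw in enumerate(lines, start=1):
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "line is not a JSON object"))
            continue
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            problems.append((number, "missing fields: " + ", ".join(missing)))
            continue
        if record["method"] != "POST":
            problems.append((number, 'method must be "POST"'))
        if record["url"] != endpoint:
            problems.append((number, f"url must be {endpoint}"))
        if not isinstance(record["body"], dict):
            problems.append((number, "body must be an object"))
        custom_id = record["custom_id"]
        if not isinstance(custom_id, str):
            problems.append((number, "custom_id must be a string"))
            continue
        if custom_id in seen_ids:
            problems.append((number, f"duplicate custom_id: {custom_id}"))
        seen_ids.add(custom_id)
    return problems
```

Passing an open file object as `lines` works, since iterating a text file yields one line at a time and line numbers are 1-based like the `line` field in error records.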

Create Batch

POST /:project_id/v1/batches

Parameter                     Type    Description
input_file_id (required)      string  ID of the uploaded JSONL file.
endpoint (required)           string  Target endpoint: /v1/chat/completions, /v1/embeddings, or /v1/responses.
completion_window (required)  string  Must be the string "24h", not the integer 24.
metadata (optional)           object  Custom key-value pairs. The serialized JSON object must not exceed 16 KB.
curl
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/batches \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "file-abc123def456",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {"description": "nightly evaluation"}
  }'
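
The 16 KB metadata limit can be approximated on the client before the request is sent. This is a rough sketch with a hypothetical helper name: it assumes a conservative 16,000-byte reading of "16 KB" and measures Python's default JSON serialization, which may not match how the server serializes the object.

```python
import json

# Assumption: "16 KB" is read conservatively as 16,000 bytes.
MAX_METADATA_BYTES = 16_000


def metadata_fits(metadata: dict, limit: int = MAX_METADATA_BYTES) -> bool:
    """Return True if the JSON-serialized metadata is within `limit` bytes."""
    serialized = json.dumps(metadata)
    return len(serialized.encode("utf-8")) <= limit
```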

Response

{
  "id": "batch_abc123def456",
  "object": "batch",
  "endpoint": "/v1/chat/completions",
  "input_file_id": "file-abc123def456",
  "output_file_id": null,
  "error_file_id": null,
  "status": "validating",
  "completion_window": "24h",
  "created_at": 1709000000,
  "in_progress_at": null,
  "expires_at": 1709086400,
  "finalizing_at": null,
  "completed_at": null,
  "failed_at": null,
  "expired_at": null,
  "cancelling_at": null,
  "cancelled_at": null,
  "metadata": {"description": "nightly evaluation"},
  "request_counts": {
    "total": 0,
    "completed": 0,
    "failed": 0
  },
  "errors": null,
  "usage": null
}

Response Fields

Field Type Description
id string External batch identifier (e.g., batch_abc123).
status string Current batch status. See Status Lifecycle.
input_file_id string External ID of the input JSONL file.
output_file_id string | null External ID of the output JSONL file. Set once the output upload completes, typically while the batch is in the finalizing state and always by the time it reaches completed.
error_file_id string | null External ID of the error JSONL file. Set if any requests failed.
in_progress_at integer | null Unix timestamp when processing started.
expires_at integer | null Unix timestamp when the 24h completion window expires.
finalizing_at integer | null Unix timestamp when the batch entered the finalizing state.
completed_at integer | null Unix timestamp when the batch completed successfully.
failed_at integer | null Unix timestamp when the batch failed.
expired_at integer | null Unix timestamp when the batch actually expired (distinct from expires_at).
cancelling_at integer | null Unix timestamp when cancellation was requested.
cancelled_at integer | null Unix timestamp when cancellation completed.
request_counts object Processing counts: total, completed, failed.
errors object | null Top-level error information if the batch itself failed. Contains object: "list" and data array of error records.
usage object | null Aggregate token usage when requests have completed: total_tokens, prompt_tokens, completion_tokens.
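
As an illustration of reading these fields, the sketch below turns a batch object into a one-line progress summary. The function name is hypothetical, and it assumes that `completed` and `failed` count disjoint sets of requests, which this page does not state explicitly.

```python
def summarize_progress(batch: dict) -> str:
    """Format status and request_counts as a short progress line."""
    counts = batch.get("request_counts") or {}
    total = counts.get("total", 0)
    failed = counts.get("failed", 0)
    # Assumption: completed and failed do not overlap.
    done = counts.get("completed", 0) + failed
    percent = (100 * done / total) if total else 0.0
    return f"{batch['status']}: {done}/{total} processed ({percent:.0f}%), {failed} failed"
```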

List Batches

GET /:project_id/v1/batches

curl
curl "https://api.xerotier.ai/proj_ABC123/v1/batches?limit=20" \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Get Batch

GET /:project_id/v1/batches/{batch_id}

curl
curl https://api.xerotier.ai/proj_ABC123/v1/batches/batch_abc123def456 \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Cancel Batch

POST /:project_id/v1/batches/{batch_id}/cancel

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/batches/batch_abc123def456/cancel \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Cancellation stops dispatching new requests. In-flight requests complete normally. The batch transitions to cancelled once all in-flight requests finish.

Output Format

When a batch completes, the output_file_id references a JSONL file containing results. Each line corresponds to a request:

Successful Request

JSONL
{ "id": "batch_req_abc123", "custom_id": "req-1", "response": { "status_code": 200, "request_id": "req-xyz", "body": { "id": "chatcmpl-abc", "object": "chat.completion", "choices": [...], "usage": {...} } }, "error": null }

Failed Request

Failed requests appear in the error_file_id JSONL file:

JSONL
{ "id": "batch_req_def456", "custom_id": "req-2", "response": null, "error": { "code": "model_not_found", "message": "Requested model is not available on this endpoint", "param": "model", "line": 42 } }

Field             Type     Description
code              string   Canonical machine-readable code. See Error Handling.
message           string   Human-readable explanation of the failure.
param (optional)  string   Offending request field when applicable.
line (optional)   integer  1-based line number in the original input JSONL.

Download the output file using the Files API (see Download File Content):

curl
curl https://api.xerotier.ai/proj_ABC123/v1/files/file-output789/content \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -o results.jsonl
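
Once the output and error files are downloaded, results are usually matched back to requests by custom_id. The following sketch, with a hypothetical function name, builds that mapping from the two record shapes shown above. It makes no assumption that output lines appear in the same order as the input.

```python
import json
from itertools import chain


def index_results(output_lines, error_lines=()):
    """Map custom_id to ("ok", response body) or ("error", error object)."""
    results = {}
    for raw in chain(output_lines, error_lines):
        if not raw.strip():
            continue  # tolerate blank lines such as a trailing newline
        record = json.loads(raw)
        if record.get("error") is not None:
            results[record["custom_id"]] = ("error", record["error"])
        else:
            response = record.get("response") or {}
            results[record["custom_id"]] = ("ok", response.get("body"))
    return results
```

Both arguments can be open file objects, for example `index_results(open("results.jsonl"), open("errors.jsonl"))`, where errors.jsonl is an assumed local name for the downloaded error file.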

Status Lifecycle

Status       Description
validating   Input file is being parsed and validated.
in_progress  Requests are being processed.
finalizing   Building output and error files.
completed    All requests processed. Output files available.
failed       Unrecoverable error during processing.
expired      24-hour deadline exceeded.
cancelling   Cancel requested, draining in-flight requests.
cancelled    Cancellation complete.
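
Of these statuses, completed, failed, expired, and cancelled are terminal; the others mean polling should continue. The sketch below polls with a growing delay instead of a fixed interval, which is a design choice of this example rather than a documented requirement. `fetch_batch` is a hypothetical caller-supplied function that returns the current batch object, for example by calling Get Batch.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(fetch_batch, interval=10.0, max_interval=300.0, sleep=time.sleep):
    """Call fetch_batch() until the batch reaches a terminal status, then return it."""
    delay = interval
    while True:
        batch = fetch_batch()
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        sleep(delay)
        # Double the wait after each poll, up to max_interval seconds.
        delay = min(delay * 2, max_interval)
```

The `sleep` parameter exists so the loop can be exercised in tests without waiting in real time.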

CLI Commands

The xeroctl CLI provides batch management commands. The full flag surface lives in xeroctl batches.

Full batch workflow
# Upload input file
xeroctl files --upload requests.jsonl --purpose batch

# Submit a batch against an endpoint slug
xeroctl batches --endpoint my-endpoint --input requests.jsonl

# Check batch status
xeroctl batches --endpoint my-endpoint --get batch_abc123def456

# List all batches for an endpoint
xeroctl batches --endpoint my-endpoint --list

# Cancel a batch
xeroctl batches --endpoint my-endpoint --cancel batch_abc123def456

# Download output when complete
xeroctl files --download file-output789 --output results.jsonl

CLI endpoint coverage: The xeroctl batches command currently submits batches against /v1/chat/completions only. The HTTP API supports /v1/embeddings and /v1/responses as batch targets as well (see Supported Endpoints); to submit those today, call the REST API directly using the curl or client examples on this page. Confirm the exact flag spelling for your build with xeroctl batches --help and xeroctl files --help.

Client Examples

Python (requests)

Python
import time

import requests

headers = {
    "Authorization": "Bearer xero_myproject_your_api_key",
    "Content-Type": "application/json"
}
base = "https://api.xerotier.ai/proj_ABC123/v1"

# Step 1: Upload a JSONL input file
with open("requests.jsonl", "rb") as f:
    upload = requests.post(
        f"{base}/files",
        headers={"Authorization": headers["Authorization"]},
        files={"file": ("requests.jsonl", f)},
        data={"purpose": "batch"}
    ).json()
print(f"Uploaded file: {upload['id']}")

# Step 2: Create a batch
batch = requests.post(f"{base}/batches", headers=headers, json={
    "input_file_id": upload["id"],
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {"description": "nightly evaluation"}
}).json()
print(f"Batch created: {batch['id']} (status: {batch['status']})")

# Step 3: Poll for completion
while True:
    status = requests.get(
        f"{base}/batches/{batch['id']}",
        headers=headers
    ).json()
    print(f"Status: {status['status']} ({status['request_counts']})")
    if status["status"] in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(10)

# Step 4: Download results
if status.get("output_file_id"):
    output = requests.get(
        f"{base}/files/{status['output_file_id']}/content",
        headers={"Authorization": headers["Authorization"]}
    )
    with open("results.jsonl", "wb") as f:
        f.write(output.content)
    print("Results saved to results.jsonl")

# List batches
batches = requests.get(f"{base}/batches?limit=20", headers=headers).json()
for b in batches["data"]:
    print(f"{b['id']} - {b['status']}")

# Cancel a batch
requests.post(f"{base}/batches/{batch['id']}/cancel", headers=headers)

Node.js (fetch)

JavaScript
import { readFile, writeFile } from "node:fs/promises";

const base = "https://api.xerotier.ai/proj_ABC123/v1";
const headers = {
  "Authorization": "Bearer xero_myproject_your_api_key",
  "Content-Type": "application/json"
};

// Step 1: Upload a JSONL input file
const fileData = await readFile("requests.jsonl");
const formData = new FormData();
formData.append("file", new Blob([fileData]), "requests.jsonl");
formData.append("purpose", "batch");

const uploadRes = await fetch(`${base}/files`, {
  method: "POST",
  headers: { "Authorization": headers.Authorization },
  body: formData
});
const upload = await uploadRes.json();
console.log(`Uploaded file: ${upload.id}`);

// Step 2: Create a batch
const batchRes = await fetch(`${base}/batches`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    input_file_id: upload.id,
    endpoint: "/v1/chat/completions",
    completion_window: "24h",
    metadata: { description: "nightly evaluation" }
  })
});
const batch = await batchRes.json();
console.log(`Batch created: ${batch.id} (${batch.status})`);

// Step 3: Poll for completion
let status;
while (true) {
  const res = await fetch(`${base}/batches/${batch.id}`, { headers });
  status = await res.json();
  console.log(`Status: ${status.status}`);
  if (["completed", "failed", "expired", "cancelled"].includes(status.status)) break;
  await new Promise(r => setTimeout(r, 10000));
}

// Step 4: Download results
if (status.output_file_id) {
  const outputRes = await fetch(
    `${base}/files/${status.output_file_id}/content`,
    { headers: { "Authorization": headers.Authorization } }
  );
  const outputData = await outputRes.arrayBuffer();
  await writeFile("results.jsonl", Buffer.from(outputData));
  console.log("Results saved to results.jsonl");
}

Error Handling

HTTP Status Error Code Description
400 invalid_request Malformed request or invalid parameters.
400 validation_error JSONL validation failed (malformed lines, duplicate custom_ids, etc.).
401 authentication_error Invalid or missing API key.
404 not_found Batch or input file not found.
429 rate_limit_exceeded Too many requests. Check the Retry-After header.
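
For 429 responses, a client can honor the Retry-After header and fall back to exponential backoff when the header is missing or unparseable. The sketch below is one possible policy, not a documented one: the function name, the backoff constants, and the assumption that Retry-After carries a number of seconds are all illustrative. It treats every other status in the table as non-retryable.

```python
def retry_delay(status_code, headers, attempt, base=1.0, cap=60.0):
    """Return seconds to wait before retrying, or None if the error is not retryable."""
    if status_code != 429:
        return None
    value = headers.get("Retry-After")
    if value is not None:
        try:
            # Assumption: the header holds a delay in seconds.
            return max(0.0, float(value))
        except ValueError:
            pass  # e.g. an HTTP-date; fall through to exponential backoff
    # attempt is 0-based: 1s, 2s, 4s, ... up to `cap` seconds.
    return min(cap, base * 2 ** attempt)
```

With the requests library, `response.headers` can be passed directly because its header lookup is case-insensitive; a plain dict needs the exact key "Retry-After".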