// Model Management

Model Upload

Push a complete HuggingFace model directory into your project. Use archive mode for everyday uploads and directory mode for very large models or resumable file-by-file transfers. Both modes use chunked uploads, and both end in the same manifest.

Xerotier.ai supports two methods for uploading complete model directories:

| Method | Best For | Key Features |
|---|---|---|
| Archive Upload | Standard model uploads | Create a .tar.gz archive, upload in chunks, server extracts |
| Directory Upload | Large models, incremental uploads | Upload files individually, fine-grained control |

Both methods use chunked uploads for reliability and support the same model formats. For basic chunked file uploads, see the Model Management API reference.

XIM deployment required. Custom uploaded models can only be deployed on XIM nodes. To use shared infrastructure, an administrator must add the model to the catalog and provision it on shared agents. See the Model Catalog documentation for details.

Limits and Quotas

| Limit | Value |
|---|---|
| chunk_size | 104857600 bytes (100 MiB) |
| Session TTL | 24 h from created_at |
| Max bytes | Remaining project quota at init |

The maximum total upload size is not a flat per-call cap. It is derived from the project's storage quota and account tier at the moment the upload session is initialized. At that point the server checks the requested size (archive_size for an archive upload, or the sum of files[].size for a directory upload) against the remaining quota.

There is no public "max upload bytes" header or static constant integrators can hard-code; sessions that request more bytes than the project has remaining are rejected at initialization with a quota error. To discover the effective ceiling for your project:

  • Inspect the project's quota and current storage usage from the dashboard, or
  • Attempt to initialize the upload and read the quota error envelope if rejected, or
  • Contact your administrator to confirm the tier-level cap.
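
As a rough client-side preflight, the size that the server compares against the quota can be computed before calling init. This is only a sketch: requested_bytes is a hypothetical helper, and the remaining quota itself still has to come from the dashboard or from the quota error, since no endpoint for it is described here.

```python
from typing import List, Optional


def requested_bytes(archive_size: Optional[int] = None,
                    files: Optional[List[dict]] = None) -> int:
    """Return the byte count an init request would ask for.

    Pass archive_size for an archive upload, or the directory manifest
    (a list of {"relative_path", "size"} entries) for a directory upload.
    """
    if (archive_size is None) == (files is None):
        raise ValueError("pass exactly one of archive_size or files")
    if archive_size is not None:
        return archive_size
    # Directory mode: the manifest sizes are summed.
    return sum(entry["size"] for entry in files)


manifest = [
    {"relative_path": "config.json", "size": 1234},
    {"relative_path": "model.safetensors", "size": 1234000000},
    {"relative_path": "tokenizer.json", "size": 5678},
]
print(requested_bytes(files=manifest))
```

Comparing this number with the remaining quota shown in the dashboard avoids a round trip that is certain to be rejected.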

chunk_size in init responses. Archive and directory init responses return a chunk_size field (currently 104857600 bytes, 100 MiB). Use this value as the read-buffer size when streaming parts; it is the per-request body cap enforced by the chunk endpoints. The value is server-controlled and may change; always read it from the init response rather than hard-coding 100 MiB.
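
The relationship between upload size, chunk_size, and the number of parts can be sketched as a ceiling division. This assumes total_chunks is computed that way, which is consistent with the sample init response (1234567890 bytes at 104857600 bytes per chunk gives 12 chunks); expected_chunks is a hypothetical helper, not part of any SDK.

```python
def expected_chunks(total_bytes: int, chunk_size: int) -> int:
    """Number of chunk_size-byte parts needed to cover total_bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Ceiling division without floating point.
    return -(-total_bytes // chunk_size)


# chunk_size should come from the init response, not from a constant.
init = {"bytes": 1234567890, "chunk_size": 104857600}
print(expected_chunks(init["bytes"], init["chunk_size"]))
```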

Sessions expire 24 hours after creation. Plan resumes accordingly; see Status Lifecycle.

Archive Upload

  1. init POST /v1/uploads/archive
  2. parts POST /v1/uploads/{id}/parts
  3. complete POST /v1/uploads/{id}/complete

Upload compressed archives containing HuggingFace model directories. Archives are extracted server-side, preserving the model's file structure for inference engine compatibility.

Best for: Uploading complete HuggingFace model directories. Supports .tar, .tar.gz, and .tar.bz2 formats.

Supported Formats

| Format | Extension | Content-Type |
|---|---|---|
| tar | .tar | application/x-tar |
| tar.gz | .tar.gz, .tgz | application/gzip |
| tar.bz2 | .tar.bz2, .tbz2 | application/x-bzip2 |
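
A client can derive the archive_format value from the filename using the table above. The following is a minimal sketch; archive_format_for and the ARCHIVE_FORMATS mapping are illustrative names, and whether the server also inspects file contents is not covered here.

```python
# Extension -> (archive_format, Content-Type), taken from the table above.
ARCHIVE_FORMATS = {
    ".tar.gz": ("tar.gz", "application/gzip"),
    ".tgz": ("tar.gz", "application/gzip"),
    ".tar.bz2": ("tar.bz2", "application/x-bzip2"),
    ".tbz2": ("tar.bz2", "application/x-bzip2"),
    ".tar": ("tar", "application/x-tar"),
}


def archive_format_for(filename: str):
    """Return (archive_format, content_type) for a supported archive name."""
    lower = filename.lower()
    for extension, info in ARCHIVE_FORMATS.items():
        if lower.endswith(extension):
            return info
    raise ValueError(f"unsupported archive extension: {filename}")


print(archive_format_for("Qwen3-0.6B.tar.gz"))
```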

Creating an Archive

Create a properly formatted archive from your model directory. Files must be at the archive root (not nested in subdirectories).

Basic Command
COPYFILE_DISABLE=1 tar -chzf model-name.tar.gz -C /path/to/model/directory .

Command Options

| Option | Description |
|---|---|
| -c | Create a new archive |
| -h | Follow symlinks (required for HuggingFace cache) |
| -z | Compress with gzip (use -j for bzip2) |
| -f | Specify the output filename |
| -C | Change to directory before adding files |

Important: The -h flag is required when archiving from HuggingFace cache, as it uses symlinks. Without this flag, your archive will contain broken links instead of actual model files.
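
Where tar is not available, a similar archive can be built with Python's standard tarfile module. This is a sketch rather than an official tool: create_model_archive is a hypothetical helper, and dereference=True plays the role of the -h flag by storing the files that symlinks point to. Entries are named relative to the model directory (config.json rather than ./config.json), and the output path should sit outside the model directory.

```python
import tarfile
from pathlib import Path


def create_model_archive(model_dir: str, output_path: str) -> None:
    """Write a .tar.gz of model_dir, following symlinks."""
    root = Path(model_dir)
    # dereference=True stores symlink targets as regular files.
    with tarfile.open(output_path, "w:gz", dereference=True) as tar:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # arcname is relative to root, so paths inside the archive
                # mirror the layout of model_dir itself.
                tar.add(str(path), arcname=path.relative_to(root).as_posix())
```

Usage would look like create_model_archive("/path/to/model/directory", "model-name.tar.gz").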

HuggingFace Cache Example

bash
# Find and archive a model from HuggingFace cache
COPYFILE_DISABLE=1 tar -chzf Qwen3-0.6B.tar.gz -C ~/.cache/huggingface/hub/models--Qwen--Qwen3-0.6B/snapshots/abc123/ .

Verify Archive Structure

Before uploading, verify files are at the root level:

bash
$ tar -tzf model-name.tar.gz | head -5
./config.json
./model.safetensors
./tokenizer.json
./tokenizer_config.json
./generation_config.json
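
The same check can be scripted. The sketch below lists the regular files at the archive root using the standard tarfile module; root_files is a hypothetical helper, and it accepts names both with and without a leading ./ prefix.

```python
import tarfile


def root_files(archive_path: str) -> set:
    """Return the names of regular files stored at the archive root."""
    # "r:*" lets tarfile detect gzip or bzip2 compression, or none.
    with tarfile.open(archive_path, "r:*") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    # tar -C dir . produces names such as ./config.json; strip the prefix.
    cleaned = [n[2:] if n.startswith("./") else n for n in names]
    return {n for n in cleaned if "/" not in n}
```

A caller might then confirm that "config.json" is in root_files("model-name.tar.gz") before starting the upload.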

Initialize Archive Upload

POST /proj_ABC123/v1/uploads/archive

Request Body

| Parameter | Required | Type | Description |
|---|---|---|---|
| model_name | required | string | Display name for the model |
| archive_size | required | integer | Total archive size in bytes |
| archive_format | required | string | Archive format: tar, tar.gz, or tar.bz2 |
| description | optional | string | Description of the model |
| workload_type | optional | string | Workload hint. Recommended values: chat, code, reasoning, embedding, multilingual. Free-form; defaults to chat. Unrecognized values are accepted but may not influence routing. |
| quantization | optional | string | Runtime quantization method override. Accepted values include native, fp8, awq, gptq, int4, int8, bitsandbytes, bitblas, modelopt, torchao, compressed-tensors, quark, and ipex. Defaults to native. |

Returns 201 Created on success.

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/uploads/archive \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "Qwen3-0.6B",
    "archive_size": 1234567890,
    "archive_format": "tar.gz"
  }'

Response

{
  "id": "00000000-1111-0000-1111-000000000000",
  "object": "upload",
  "bytes": 1234567890,
  "created_at": 1706123456,
  "filename": "Qwen3-0.6B",
  "purpose": "model",
  "status": "pending",
  "expires_at": 1706209856,
  "upload_type": "archive",
  "chunk_size": 104857600,
  "total_chunks": 12,
  "uploaded_chunks": 0,
  "progress": 0
}

Upload Archive Chunks

After initialization, upload chunks sequentially. Pass the zero-based chunk index as a query parameter:

POST /proj_ABC123/v1/uploads/{sessionId}/parts?part_number={chunkIndex}

Include an X-Chunk-Checksum header with the SHA256 hex digest of the chunk data.
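
Slicing a file into parts and computing each part's checksum can be sketched as a small generator. iter_parts is a hypothetical helper; chunk_size is expected to come from the init response, and the yielded index is the zero-based value to pass as part_number.

```python
import hashlib
from typing import BinaryIO, Iterator, Tuple


def iter_parts(stream: BinaryIO, chunk_size: int) -> Iterator[Tuple[int, str, bytes]]:
    """Yield (zero-based index, SHA256 hex digest, bytes) for each chunk."""
    index = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # The digest covers this chunk only, not the whole file.
        yield index, hashlib.sha256(chunk).hexdigest(), chunk
        index += 1
```

Each tuple maps directly onto one request: the index goes into the query string, the digest into X-Chunk-Checksum, and the bytes into the body.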

Complete Archive Upload

When all chunks are uploaded, complete the upload to trigger archive extraction and model creation:

POST /proj_ABC123/v1/uploads/{sessionId}/complete

Complete Response

{
  "id": "00000000-1111-0000-1111-000000000000",
  "object": "upload",
  "status": "completed",
  "bytes": 1234567890,
  "created_at": 1706123456,
  "filename": "Qwen3-0.6B",
  "upload_type": "archive",
  "model": {
    "id": "660e8400-e29b-41d4-a716-446655440001",
    "name": "Qwen3-0.6B",
    "format": "safetensors",
    "size_bytes": 1234567890,
    "status": "validating",
    "context_length": null,
    "architecture": null,
    "quantization": "native"
  }
}

Archive Upload Example

Python
import os
import hashlib
import requests

API_KEY = "xero_myproject_your_api_key"
BASE_URL = "https://api.xerotier.ai/proj_ABC123/v1"

def upload_archive(archive_path: str, model_name: str):
    archive_size = os.path.getsize(archive_path)
    auth = {"Authorization": f"Bearer {API_KEY}"}

    # Initialize the session; the response carries the session id and chunk size
    resp = requests.post(
        f"{BASE_URL}/uploads/archive",
        headers=auth,
        json={"model_name": model_name,
              "archive_size": archive_size,
              "archive_format": "tar.gz"},
    )
    resp.raise_for_status()
    init = resp.json()
    session_id = init["id"]
    chunk_size = init["chunk_size"]

    # Upload chunks sequentially, each with its own SHA256 checksum
    with open(archive_path, "rb") as f:
        idx = 0
        while chunk := f.read(chunk_size):
            checksum = hashlib.sha256(chunk).hexdigest()
            requests.post(
                f"{BASE_URL}/uploads/{session_id}/parts?part_number={idx}",
                headers={**auth, "X-Chunk-Checksum": checksum},
                data=chunk,
            ).raise_for_status()
            idx += 1

    # Complete: triggers server-side extraction and model creation
    resp = requests.post(f"{BASE_URL}/uploads/{session_id}/complete", headers=auth)
    resp.raise_for_status()
    return resp.json()

upload_archive("./qwen3-0.6b.tar.gz", "Qwen3-0.6B")

Node.js Archive Upload Example

Node.js
import { readFile, stat } from "node:fs/promises";
import { createHash } from "node:crypto";

const API_KEY = "xero_myproject_your_api_key";
const BASE_URL = "https://api.xerotier.ai/proj_ABC123/v1";

async function uploadArchive(archivePath, modelName) {
  const fileInfo = await stat(archivePath);
  const archiveSize = fileInfo.size;

  // Initialize
  const initResponse = await fetch(`${BASE_URL}/uploads/archive`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      model_name: modelName,
      archive_size: archiveSize,
      archive_format: "tar.gz"
    })
  });
  const init = await initResponse.json();
  const sessionId = init.id;
  const chunkSize = init.chunk_size;

  // Read and upload chunks (readFile loads the whole archive into memory)
  const fileData = await readFile(archivePath);
  for (let idx = 0; idx * chunkSize < fileData.length; idx++) {
    const chunk = fileData.subarray(idx * chunkSize, (idx + 1) * chunkSize);
    const checksum = createHash("sha256").update(chunk).digest("hex");
    await fetch(
      `${BASE_URL}/uploads/${sessionId}/parts?part_number=${idx}`,
      {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${API_KEY}`,
          "X-Chunk-Checksum": checksum
        },
        body: chunk
      }
    );
  }

  // Complete
  const completeResponse = await fetch(
    `${BASE_URL}/uploads/${sessionId}/complete`,
    {
      method: "POST",
      headers: { "Authorization": `Bearer ${API_KEY}` }
    }
  );
  return completeResponse.json();
}

await uploadArchive("./qwen3-0.6b.tar.gz", "Qwen3-0.6B");

Directory Upload

  1. init POST /v1/uploads/directory
  2. files POST /v1/uploads/{id}/files/{path}
  3. complete POST /v1/uploads/{id}/complete

Upload individual files that mirror a local model directory structure. Useful when creating an archive is impractical or for incremental uploads.

Best for: Large models where archive creation is slow, or when you want fine-grained control over individual file uploads.

Initialize Directory Upload

POST /proj_ABC123/v1/uploads/directory

Provide a manifest of all files to upload:

Request Body

| Parameter | Required | Type | Description |
|---|---|---|---|
| model_name | required | string | Display name for the model |
| files | required | array | List of files to upload. Each entry must include relative_path (string) and size (integer bytes). |
| description | optional | string | Description of the model |
| workload_type | optional | string | Workload hint. Recommended values: chat, code, reasoning, embedding, multilingual. Free-form; defaults to chat. Unrecognized values are accepted but may not influence routing. |
| quantization | optional | string | Runtime quantization method override. Accepted values include native, fp8, awq, gptq, int4, int8, bitsandbytes, bitblas, modelopt, torchao, compressed-tensors, quark, and ipex. Defaults to native. |

Returns 201 Created on success.

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/uploads/directory \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "Qwen3-0.6B",
    "files": [
      {"relative_path": "config.json", "size": 1234},
      {"relative_path": "model.safetensors", "size": 1234000000},
      {"relative_path": "tokenizer.json", "size": 5678}
    ]
  }'

Response

{
  "id": "00000000-1111-0000-1111-000000000000",
  "object": "upload",
  "bytes": 1234001234,
  "created_at": 1706123456,
  "filename": "Qwen3-0.6B",
  "purpose": "model",
  "status": "pending",
  "expires_at": 1706209856,
  "upload_type": "directory",
  "chunk_size": 104857600,
  "uploaded_chunks": 0,
  "progress": 0,
  "chunk_upload_url": "v1/uploads/00000000-1111-0000-1111-000000000000/file-chunks",
  "files": [
    {
      "relative_path": "config.json",
      "upload_path": "v1/uploads/00000000.../files/config.json",
      "size": 1234,
      "requires_chunking": false,
      "status": "pending"
    },
    {
      "relative_path": "model.safetensors",
      "upload_path": "v1/uploads/00000000.../files/model.safetensors",
      "size": 1234000000,
      "requires_chunking": true,
      "total_chunks": 12,
      "chunk_url": "v1/uploads/00000000.../file-chunks",
      "status": "pending"
    }
  ]
}

Upload Individual File

POST /proj_ABC123/v1/uploads/{sessionId}/files/{relativePath}

Headers

| Header | Description |
|---|---|
| X-File-Checksum | SHA256 checksum of the file content (hex-encoded). See known issue below. |

Known issue: header-name split-brain. The official CLI (xeroctl) sends X-File-Checksum for this endpoint, and that is the header name documented above. The server-side upload handler currently reads X-Chunk-Checksum on the same route. Until the two are reconciled, integrators following this documentation verbatim may see the server treat the checksum as empty (and, depending on validation mode, either pass data unchecked or surface a checksum-mismatch error). To be forward-compatible, clients SHOULD send both headers with the same value:

curl
CHECKSUM=$(sha256sum config.json | cut -d' ' -f1)
curl -X POST "https://api.xerotier.ai/proj_ABC123/v1/uploads/$SESSION_ID/files/config.json" \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "X-File-Checksum: $CHECKSUM" \
  -H "X-Chunk-Checksum: $CHECKSUM" \
  --data-binary @config.json
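
In client code, the dual-header workaround can be isolated in one place so it is easy to remove once the header names are reconciled. A minimal sketch, where file_upload_headers is a hypothetical helper:

```python
import hashlib


def file_upload_headers(data: bytes, api_key: str) -> dict:
    """Build headers for a single-file upload, sending the checksum twice."""
    digest = hashlib.sha256(data).hexdigest()
    return {
        "Authorization": f"Bearer {api_key}",
        # Documented header name, sent by the official CLI.
        "X-File-Checksum": digest,
        # Header name the server-side handler currently reads.
        "X-Chunk-Checksum": digest,
    }
```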

Response

{
  "relative_path": "config.json",
  "size": 1234,
  "checksum": "a1b2c3d4e5f6...",
  "uploaded_file_count": 1,
  "expected_file_count": 7,
  "progress": 14.28
}

Directory Upload Example

Python
import hashlib
import requests
from pathlib import Path

API_KEY = "xero_myproject_your_api_key"
BASE_URL = "https://api.xerotier.ai/proj_ABC123/v1"

def upload_directory(model_dir: str, model_name: str):
    model_path = Path(model_dir)
    auth = {"Authorization": f"Bearer {API_KEY}"}

    # Build manifest; as_posix() keeps forward slashes on every platform
    files = [{"relative_path": f.relative_to(model_path).as_posix(),
              "size": f.stat().st_size}
             for f in model_path.rglob("*") if f.is_file()]

    # Initialize
    resp = requests.post(
        f"{BASE_URL}/uploads/directory",
        headers=auth,
        json={"model_name": model_name, "files": files},
    )
    resp.raise_for_status()
    session_id = resp.json()["id"]

    # Upload each file in a single request.
    # Files larger than chunk_size (requires_chunking: true in the init
    # response) must use the chunked flow described in
    # Chunked File Upload (Directory Mode) instead.
    for file_info in files:
        file_path = model_path / file_info["relative_path"]
        with open(file_path, "rb") as f:
            data = f.read()
        checksum = hashlib.sha256(data).hexdigest()
        requests.post(
            f"{BASE_URL}/uploads/{session_id}/files/{file_info['relative_path']}",
            headers={**auth,
                     "X-File-Checksum": checksum,
                     "X-Chunk-Checksum": checksum},
            data=data,
        ).raise_for_status()

    # Complete
    resp = requests.post(f"{BASE_URL}/uploads/{session_id}/complete", headers=auth)
    resp.raise_for_status()
    return resp.json()

upload_directory("./models--Qwen--Qwen3-0.6B", "Qwen3-0.6B")

Node.js Directory Upload Example

Node.js
import { readFile, stat, readdir } from "node:fs/promises";
import { createHash } from "node:crypto";
import { join, relative } from "node:path";

const API_KEY = "xero_myproject_your_api_key";
const BASE_URL = "https://api.xerotier.ai/proj_ABC123/v1";

// Recursively collect { relative_path, size } entries for the manifest
async function getFiles(dir, base) {
  const entries = await readdir(dir, { withFileTypes: true });
  const files = [];
  for (const entry of entries) {
    const fullPath = join(dir, entry.name);
    if (entry.isFile()) {
      const info = await stat(fullPath);
      files.push({ relative_path: relative(base, fullPath), size: info.size });
    } else if (entry.isDirectory()) {
      files.push(...await getFiles(fullPath, base));
    }
  }
  return files;
}

async function uploadDirectory(modelDir, modelName) {
  const files = await getFiles(modelDir, modelDir);

  // Initialize
  const initResponse = await fetch(`${BASE_URL}/uploads/directory`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ model_name: modelName, files })
  });
  const init = await initResponse.json();
  const sessionId = init.id;

  // Upload each file in a single request; files larger than chunk_size
  // must use the chunked flow in Chunked File Upload (Directory Mode)
  for (const fileInfo of files) {
    const data = await readFile(join(modelDir, fileInfo.relative_path));
    const checksum = createHash("sha256").update(data).digest("hex");
    await fetch(
      `${BASE_URL}/uploads/${sessionId}/files/${fileInfo.relative_path}`,
      {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${API_KEY}`,
          "X-File-Checksum": checksum,
          "X-Chunk-Checksum": checksum
        },
        body: data
      }
    );
  }

  // Complete
  const completeResponse = await fetch(
    `${BASE_URL}/uploads/${sessionId}/complete`,
    {
      method: "POST",
      headers: { "Authorization": `Bearer ${API_KEY}` }
    }
  );
  return completeResponse.json();
}

await uploadDirectory("./models--Qwen--Qwen3-0.6B", "Qwen3-0.6B");

Complete Directory Upload

After all files are uploaded, call the complete endpoint to finalize the model record and trigger metadata extraction:

POST /proj_ABC123/v1/uploads/{sessionId}/complete

Returns a RouterUploadCompleteResponse with the same shape as the archive complete response above.

Chunked File Upload (Directory Mode)

  1. init POST /v1/uploads/directory
  2. chunks POST /v1/uploads/{id}/file-chunks/{n}
  3. file POST /v1/uploads/{id}/file-complete
  4. session POST /v1/uploads/{id}/complete

Individual files larger than chunk_size (100 MiB by default) cannot be sent in a single POST /v1/uploads/{sessionId}/files/{relativePath} request, because the server enforces the per-request body cap. For these files, the directory init response marks the entry with requires_chunking: true, returns a total_chunks count, and advertises a chunk_url ending in /file-chunks. Use the chunk endpoint below for each chunk, then call file-complete to finalize the file before completing the session.

Upload a File Chunk

POST /proj_ABC123/v1/uploads/{sessionId}/file-chunks/{chunkIndex}?relative_path={relativePath}

Path Parameters

| Parameter | Description |
|---|---|
| sessionId | Upload session ID returned by directory init. |
| chunkIndex | Zero-based chunk index for the file. |

Query Parameters

| Parameter | Required | Description |
|---|---|---|
| relative_path | required | Relative path of the file being chunked, matching an entry in the directory manifest. |

Headers

| Header | Required | Description |
|---|---|---|
| X-Chunk-Checksum | required | SHA256 hex digest of this chunk's bytes. Empty values are rejected with 400. |

Body is the raw chunk bytes; the per-request cap matches the chunk_size returned by directory init.

curl
CHECKSUM=$(sha256sum chunk-0.bin | cut -d' ' -f1)
curl -X POST "https://api.xerotier.ai/proj_ABC123/v1/uploads/$SESSION_ID/file-chunks/0?relative_path=model.safetensors" \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "X-Chunk-Checksum: $CHECKSUM" \
  --data-binary @chunk-0.bin

Complete a Chunked File

After all chunks for a single file have been uploaded, call file-complete to assemble and validate that file. This is the per-file finalization hook; the session-level /complete endpoint is still required afterwards to finalize the model record.

POST /proj_ABC123/v1/uploads/{sessionId}/file-complete?relative_path={relativePath}

curl
curl -X POST "https://api.xerotier.ai/proj_ABC123/v1/uploads/$SESSION_ID/file-complete?relative_path=model.safetensors" \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Chunked File Upload Flow

  1. Initialize a directory upload; inspect each files[] entry for requires_chunking.
  2. For non-chunked files, POST to /files/{relativePath} as shown in Directory Upload.
  3. For chunked files, slice the file into chunk_size-byte parts and POST each one to /file-chunks/{chunkIndex}?relative_path=... with X-Chunk-Checksum.
  4. Call /file-complete?relative_path=... once all chunks for that file are uploaded.
  5. Repeat per file, then call the session-level /complete endpoint.
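
The flow above can be sketched as a planning function that turns a directory init response into the ordered list of request paths. This is an illustration only: plan_requests is a hypothetical helper, the paths are relative to the project prefix, and percent-encoding relative_path is an assumption about what the server accepts rather than documented behavior.

```python
from urllib.parse import quote


def plan_requests(init: dict) -> list:
    """Return the POST paths, in order, needed to finish a directory upload."""
    session = init["id"]
    steps = []
    for entry in init["files"]:
        path = entry["relative_path"]
        if entry.get("requires_chunking"):
            # Assumption: the query value is percent-encoded, slashes included.
            encoded = quote(path, safe="")
            for n in range(entry["total_chunks"]):
                steps.append(
                    f"/v1/uploads/{session}/file-chunks/{n}?relative_path={encoded}")
            # Per-file finalization after the last chunk.
            steps.append(
                f"/v1/uploads/{session}/file-complete?relative_path={encoded}")
        else:
            # Small files go through the single-request endpoint.
            steps.append(f"/v1/uploads/{session}/files/{quote(path)}")
    # Session-level completion comes last.
    steps.append(f"/v1/uploads/{session}/complete")
    return steps
```

Keeping the plan separate from the HTTP calls makes it straightforward to log, retry, or resume individual steps.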

Resuming Interrupted Uploads

If an upload is interrupted, use the resume endpoint to find out which chunks or files still need to be uploaded:

POST /proj_ABC123/v1/uploads/{sessionId}/resume

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/uploads/00000000-1111-0000-1111-000000000000/resume \
  -H "Authorization: Bearer xero_myproject_your_api_key"

Response

{
  "id": "00000000-1111-0000-1111-000000000000",
  "next_chunk_index": 5,
  "uploaded_chunks": 5,
  "missing_chunks": []
}

Resume by uploading starting from next_chunk_index, or re-uploading any missing_chunks. Sessions expire 24 hours after creation (wall-clock from created_at), regardless of in-flight activity. See Status Lifecycle.
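
One way to turn a resume response into a work list is sketched below. It assumes that missing_chunks lists gaps below next_chunk_index and that every index from next_chunk_index up to total_chunks is still outstanding; chunks_to_upload is a hypothetical helper, and total_chunks comes from the original init response.

```python
def chunks_to_upload(resume: dict, total_chunks: int) -> list:
    """Return the sorted chunk indexes that still need to be sent."""
    missing = set(resume.get("missing_chunks", []))
    # Everything from next_chunk_index onward has not been uploaded yet.
    tail = range(resume["next_chunk_index"], total_chunks)
    return sorted(missing.union(tail))


resume = {"next_chunk_index": 5, "uploaded_chunks": 5, "missing_chunks": []}
print(chunks_to_upload(resume, total_chunks=12))
```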

Status Lifecycle

Two independent status fields appear in upload responses: the upload session status (top-level status) and the model record status (model.status in the complete response). They have different value sets and lifecycles.

Upload Session Status

| Value | Meaning |
|---|---|
| pending | Session created; awaiting parts or files. |
| completed | All parts/files received and the session-level complete endpoint succeeded. |
| cancelled | Session was explicitly cancelled (via the cancel endpoint) or expired. |

Sessions expire 24 hours after creation. Plan resumes inside that window; after expiry the session moves to cancelled and chunks must be re-uploaded under a fresh session. See Uploads API status lifecycle for the canonical reference.
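
A client can check how much of the window is left before attempting a resume. The sketch below assumes expires_at is a Unix timestamp in seconds, as in the sample init responses; seconds_until_expiry is a hypothetical helper.

```python
import time
from typing import Optional


def seconds_until_expiry(session: dict, now: Optional[float] = None) -> float:
    """Seconds left before the upload session expires (0.0 if already expired)."""
    if now is None:
        now = time.time()
    return max(0.0, float(session["expires_at"] - now))


session = {"created_at": 1706123456, "expires_at": 1706209856}
# At creation time the full 24-hour window is available.
print(seconds_until_expiry(session, now=session["created_at"]))
```

If the remaining time is shorter than the expected transfer time, starting a fresh session avoids re-uploading chunks after the old one is cancelled.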

Model Record Status

After a successful session complete, the response contains a model object whose status reflects post-upload validation (for example validating, transitioning to a terminal state once metadata extraction and weight verification finish). This is a separate lifecycle from the upload session and continues running after the upload session is completed.

Required Model Files

For inference engine compatibility, models must include:

| File | Required | Description |
|---|---|---|
| config.json | Yes | Model configuration |
| *.safetensors or *.bin | Yes | Model weights (at least one). .safetensors is preferred; .bin (PyTorch pickle) is accepted for compatibility. Other engine-specific formats such as .exl2 (ExLlamaV2) are not supported by the inference engines provisioned in this project. |
| tokenizer.json | No | Tokenizer configuration |
| tokenizer_config.json | No | Tokenizer settings |
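
A manifest or archive listing can be checked against these requirements before uploading. This sketch assumes the required files must sit at the top level, as stated for archives; missing_required is a hypothetical helper, and the server's own validation remains authoritative.

```python
def missing_required(relative_paths: list) -> list:
    """Return descriptions of required files absent from the top level."""
    # Only paths without a directory component count as top-level.
    root = [p for p in relative_paths if "/" not in p]
    problems = []
    if "config.json" not in root:
        problems.append("config.json")
    if not any(p.endswith((".safetensors", ".bin")) for p in root):
        problems.append("*.safetensors or *.bin")
    return problems


print(missing_required(["config.json", "model.safetensors", "tokenizer.json"]))
```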