Model Upload
Push a complete HuggingFace model directory into your project. Archive mode for everyday uploads, directory mode for very large models or resumable file-by-file transfers. Both go through chunked uploads; both end in the same manifest.
Xerotier.ai supports two methods for uploading complete model directories:
| Method | Best For | Key Features |
|---|---|---|
| Archive Upload | Standard model uploads | Create a .tar.gz archive, upload in chunks, server extracts |
| Directory Upload | Large models, incremental uploads | Upload files individually, fine-grained control |
Both methods use chunked uploads for reliability and support the same model formats. For basic chunked file uploads, see the Model Management API reference.
XIM deployment required. Custom uploaded models can only be deployed on XIM nodes. To use shared infrastructure, an administrator must add the model to the catalog and provision it on shared agents. See the Model Catalog documentation for details.
Limits and Quotas
- chunk_size: 104857600 bytes (100 MiB)
- session TTL: 24 h from created_at
- max bytes: remaining project quota at init
The maximum total upload size is not a flat per-call cap. It is derived from the project's storage quota and account tier at the moment the upload session is initialized: the archive archive_size, or the sum of the directory files[].size values, is checked against the remaining quota.
There is no public "max upload bytes" header or static constant integrators can hard-code; sessions that request more bytes than the project has remaining are rejected at initialization with a quota error. To discover the effective ceiling for your project:
- Inspect the project's quota and current storage usage from the dashboard, or
- Attempt to initialize the upload and read the quota error envelope if rejected, or
- Contact your administrator to confirm the tier-level cap.
chunk_size in init responses. Archive and
directory init responses return a chunk_size
field (currently 104857600 bytes, 100 MiB). Use this value
as the read-buffer size when streaming parts; it is the
per-request body cap enforced by the chunk endpoints. The
value is server-controlled and may change; always read it
from the init response rather than hard-coding 100 MiB.
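As a rough illustration of how a client might turn the chunk_size from an init response into a list of parts, the sketch below computes the zero-based index, byte offset, and length of each chunk. The helper name plan_chunks is hypothetical and not part of any Xerotier.ai SDK.

```python
from typing import List, Tuple


def plan_chunks(total_size: int, chunk_size: int) -> List[Tuple[int, int, int]]:
    """Return (index, offset, length) for each chunk of a total_size-byte upload.

    chunk_size should be read from the init response, not hard-coded.
    """
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    plan = []
    offset = 0
    index = 0
    while offset < total_size:
        # The final chunk is usually shorter than chunk_size.
        length = min(chunk_size, total_size - offset)
        plan.append((index, offset, length))
        offset += length
        index += 1
    return plan
```

For a 1234567890-byte archive and a 104857600-byte chunk_size, this plan contains 12 chunks, the last one 81134290 bytes long.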
Sessions expire 24 hours after creation. Plan resumes accordingly; see Status Lifecycle.
Archive Upload
- init: POST /v1/uploads/archive
- parts: POST /uploads/{id}/parts
- complete: POST /uploads/{id}/complete
Upload compressed archives containing HuggingFace model directories. Archives are extracted server-side, preserving the model's file structure for inference engine compatibility.
Best for: Uploading complete HuggingFace model directories. Supports .tar, .tar.gz, and .tar.bz2 formats.
Supported Formats
| Format | Extension | Content-Type |
|---|---|---|
| tar | .tar | application/x-tar |
| tar.gz | .tar.gz, .tgz | application/gzip |
| tar.bz2 | .tar.bz2, .tbz2 | application/x-bzip2 |
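A client may want to derive the archive_format value from the file name instead of hard-coding it. The following is a minimal sketch that uses only the extensions and content types listed in the table above; the function name detect_archive_format is an assumption made for illustration.

```python
# (extension, archive_format, Content-Type), as listed in the Supported Formats table.
# Longer extensions come first so ".tar.gz" is not mistaken for ".tar".
_FORMATS = [
    (".tar.gz", "tar.gz", "application/gzip"),
    (".tgz", "tar.gz", "application/gzip"),
    (".tar.bz2", "tar.bz2", "application/x-bzip2"),
    (".tbz2", "tar.bz2", "application/x-bzip2"),
    (".tar", "tar", "application/x-tar"),
]


def detect_archive_format(filename: str):
    """Return (archive_format, content_type) for a supported archive file name."""
    lower = filename.lower()
    for extension, archive_format, content_type in _FORMATS:
        if lower.endswith(extension):
            return archive_format, content_type
    raise ValueError(f"unsupported archive extension: {filename}")
```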
Creating an Archive
Create a properly formatted archive from your model directory. Files must be at the archive root (not nested in subdirectories).
COPYFILE_DISABLE=1 tar -chzf model-name.tar.gz -C /path/to/model/directory .
Command Options
| Option | Description |
|---|---|
| -c | Create a new archive |
| -h | Follow symlinks (required for HuggingFace cache) |
| -z | Compress with gzip (use -j for bzip2) |
| -f | Specify the output filename |
| -C | Change to directory before adding files |
Important: The -h flag is required when archiving from HuggingFace cache, as it uses symlinks. Without this flag, your archive will contain broken links instead of actual model files.
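Where a tar binary is not available, roughly the same archive can be produced with Python's standard tarfile module. This is a sketch, not the official tooling: create_model_archive is a hypothetical helper, and dereference=True is used as the counterpart of tar's -h flag so that symlink targets are stored instead of the links themselves.

```python
import tarfile
from pathlib import Path


def create_model_archive(model_dir: str, archive_path: str) -> None:
    """Write a .tar.gz whose members sit at the archive root.

    archive_path should live outside model_dir so the archive does not
    try to include itself.
    """
    root = Path(model_dir)
    # dereference=True follows symlinks and stores the file content they point to.
    with tarfile.open(archive_path, "w:gz", dereference=True) as tar:
        for entry in sorted(root.iterdir()):
            # arcname drops the parent directories so each entry lands at the root.
            tar.add(str(entry), arcname=entry.name)
```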
HuggingFace Cache Example
# Find and archive a model from HuggingFace cache
COPYFILE_DISABLE=1 tar -chzf Qwen3-0.6B.tar.gz -C ~/.cache/huggingface/hub/models--Qwen--Qwen3-0.6B/snapshots/abc123/ .
Verify Archive Structure
Before uploading, verify files are at the root level:
$ tar -tzf model-name.tar.gz | head -5
./config.json
./model.safetensors
./tokenizer.json
./tokenizer_config.json
./generation_config.json
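The same check can be scripted before an upload. The sketch below, built on the standard tarfile module, lists the regular files stored directly at the archive root and confirms that config.json is among them; both function names are hypothetical.

```python
import tarfile
from pathlib import PurePosixPath


def root_level_files(archive_path: str) -> set:
    """Names of regular files stored directly at the archive root."""
    names = set()
    # Mode "r" (the default) auto-detects gzip and bzip2 compression.
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            # PurePosixPath drops a leading "./", so "./config.json" has one part.
            parts = PurePosixPath(member.name).parts
            if member.isfile() and len(parts) == 1:
                names.add(parts[0])
    return names


def has_config_at_root(archive_path: str) -> bool:
    """True when config.json is at the root rather than in a subdirectory."""
    return "config.json" in root_level_files(archive_path)
```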
Initialize Archive Upload
POST /proj_ABC123/v1/uploads/archive
Request Body
| Parameter | Type | Description |
|---|---|---|
| model_name (required) | string | Display name for the model |
| archive_size (required) | integer | Total archive size in bytes |
| archive_format (required) | string | Archive format: tar, tar.gz, or tar.bz2 |
| description (optional) | string | Description of the model |
| workload_type (optional) | string | Workload hint. Recommended values: chat, code, reasoning, embedding, multilingual. Free-form; defaults to chat. Unrecognized values are accepted but may not influence routing. |
| quantization (optional) | string | Runtime quantization method override. Accepted values include native, fp8, awq, gptq, int4, int8, bitsandbytes, bitblas, modelopt, torchao, compressed-tensors, quark, and ipex. Defaults to native. |
Returns 201 Created on success.
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/uploads/archive \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"model_name": "Qwen3-0.6B",
"archive_size": 1234567890,
"archive_format": "tar.gz"
}'
Response
{
"id": "00000000-1111-0000-1111-000000000000",
"object": "upload",
"bytes": 1234567890,
"created_at": 1706123456,
"filename": "Qwen3-0.6B",
"purpose": "model",
"status": "pending",
"expires_at": 1706209856,
"upload_type": "archive",
"chunk_size": 104857600,
"total_chunks": 12,
"uploaded_chunks": 0,
"progress": 0
}
Upload Archive Chunks
After initialization, upload chunks sequentially. Pass the zero-based chunk index as a query parameter:
POST /proj_ABC123/v1/uploads/{sessionId}/parts?part_number={chunkIndex}
Include an X-Chunk-Checksum header with the SHA256 hex digest of the chunk data.
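To make the request shape concrete, here is a small sketch that builds the relative URL and checksum header for one chunk. The helper chunk_request is hypothetical; only the part_number query parameter and the X-Chunk-Checksum header come from this page.

```python
import hashlib


def chunk_request(session_id: str, chunk_index: int, chunk: bytes):
    """Build the relative URL and headers for one archive chunk.

    chunk_index is zero-based; the checksum covers exactly the bytes sent.
    """
    url = f"/uploads/{session_id}/parts?part_number={chunk_index}"
    headers = {"X-Chunk-Checksum": hashlib.sha256(chunk).hexdigest()}
    return url, headers
```

The digest is computed per chunk, not over the whole archive, so each request can be verified independently.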
Complete Archive Upload
When all chunks are uploaded, complete the upload to trigger archive extraction and model creation:
POST /proj_ABC123/v1/uploads/{sessionId}/complete
Complete Response
{
"id": "00000000-1111-0000-1111-000000000000",
"object": "upload",
"status": "completed",
"bytes": 1234567890,
"created_at": 1706123456,
"filename": "Qwen3-0.6B",
"upload_type": "archive",
"model": {
"id": "660e8400-e29b-41d4-a716-446655440001",
"name": "Qwen3-0.6B",
"format": "safetensors",
"size_bytes": 1234567890,
"status": "validating",
"context_length": null,
"architecture": null,
"quantization": "native"
}
}
Archive Upload Example
import os
import hashlib
import requests
API_KEY = "xero_myproject_your_api_key"
BASE_URL = "https://api.xerotier.ai/proj_ABC123/v1"
AUTH = {"Authorization": f"Bearer {API_KEY}"}
def upload_archive(archive_path: str, model_name: str):
    archive_size = os.path.getsize(archive_path)
    # Initialize the session; the response carries the session id and chunk size
    resp = requests.post(f"{BASE_URL}/uploads/archive",
        headers=AUTH,
        json={"model_name": model_name, "archive_size": archive_size, "archive_format": "tar.gz"})
    resp.raise_for_status()
    init = resp.json()
    session_id = init["id"]
    chunk_size = init["chunk_size"]
    # Upload chunks sequentially, one chunk_size-byte read per request
    with open(archive_path, "rb") as f:
        idx = 0
        while chunk := f.read(chunk_size):
            checksum = hashlib.sha256(chunk).hexdigest()
            resp = requests.post(f"{BASE_URL}/uploads/{session_id}/parts?part_number={idx}",
                headers={**AUTH, "X-Chunk-Checksum": checksum},
                data=chunk)
            # Stop on the first failed chunk instead of silently continuing
            resp.raise_for_status()
            idx += 1
    # Complete: triggers server-side extraction and model creation
    resp = requests.post(f"{BASE_URL}/uploads/{session_id}/complete", headers=AUTH)
    resp.raise_for_status()
    return resp.json()
upload_archive("./qwen3-0.6b.tar.gz", "Qwen3-0.6B")
Node.js Archive Upload Example
import { readFile, stat } from "node:fs/promises";
import { createHash } from "node:crypto";
const API_KEY = "xero_myproject_your_api_key";
const BASE_URL = "https://api.xerotier.ai/proj_ABC123/v1";
async function uploadArchive(archivePath, modelName) {
const fileInfo = await stat(archivePath);
const archiveSize = fileInfo.size;
// Initialize
const initResponse = await fetch(`${BASE_URL}/uploads/archive`, {
method: "POST",
headers: {
"Authorization": `Bearer ${API_KEY}`,
"Content-Type": "application/json"
},
body: JSON.stringify({
model_name: modelName,
archive_size: archiveSize,
archive_format: "tar.gz"
})
});
const init = await initResponse.json();
const sessionId = init.id;
const chunkSize = init.chunk_size;
// Read and upload chunks
const fileData = await readFile(archivePath);
for (let idx = 0; idx * chunkSize < fileData.length; idx++) {
const chunk = fileData.subarray(idx * chunkSize, (idx + 1) * chunkSize);
const checksum = createHash("sha256").update(chunk).digest("hex");
await fetch(
`${BASE_URL}/uploads/${sessionId}/parts?part_number=${idx}`,
{
method: "POST",
headers: {
"Authorization": `Bearer ${API_KEY}`,
"X-Chunk-Checksum": checksum
},
body: chunk
}
);
}
// Complete
const completeResponse = await fetch(
`${BASE_URL}/uploads/${sessionId}/complete`,
{
method: "POST",
headers: { "Authorization": `Bearer ${API_KEY}` }
}
);
return completeResponse.json();
}
await uploadArchive("./qwen3-0.6b.tar.gz", "Qwen3-0.6B");
Directory Upload
- init: POST /v1/uploads/directory
- files: POST /uploads/{id}/files/{path}
- complete: POST /uploads/{id}/complete
Upload individual files that mirror a local model directory structure. Useful when creating an archive is impractical or for incremental uploads.
Best for: Large models where archive creation is slow, or when you want fine-grained control over individual file uploads.
Initialize Directory Upload
POST /proj_ABC123/v1/uploads/directory
Provide a manifest of all files to upload:
Request Body
| Parameter | Type | Description |
|---|---|---|
| model_name (required) | string | Display name for the model |
| files (required) | array | List of files to upload. Each entry must include relative_path (string) and size (integer bytes). |
| description (optional) | string | Description of the model |
| workload_type (optional) | string | Workload hint. Recommended values: chat, code, reasoning, embedding, multilingual. Free-form; defaults to chat. Unrecognized values are accepted but may not influence routing. |
| quantization (optional) | string | Runtime quantization method override. Accepted values include native, fp8, awq, gptq, int4, int8, bitsandbytes, bitblas, modelopt, torchao, compressed-tensors, quark, and ipex. Defaults to native. |
Returns 201 Created on success.
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/uploads/directory \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"model_name": "Qwen3-0.6B",
"files": [
{"relative_path": "config.json", "size": 1234},
{"relative_path": "model.safetensors", "size": 1234000000},
{"relative_path": "tokenizer.json", "size": 5678}
]
}'
Response
{
"id": "00000000-1111-0000-1111-000000000000",
"object": "upload",
"bytes": 1234001234,
"created_at": 1706123456,
"filename": "Qwen3-0.6B",
"purpose": "model",
"status": "pending",
"expires_at": 1706209856,
"upload_type": "directory",
"chunk_size": 104857600,
"uploaded_chunks": 0,
"progress": 0,
"chunk_upload_url": "v1/uploads/00000000-1111-0000-1111-000000000000/file-chunks",
"files": [
{
"relative_path": "config.json",
"upload_path": "v1/uploads/00000000.../files/config.json",
"size": 1234,
"requires_chunking": false,
"status": "pending"
},
{
"relative_path": "model.safetensors",
"upload_path": "v1/uploads/00000000.../files/model.safetensors",
"size": 1234000000,
"requires_chunking": true,
"total_chunks": 12,
"chunk_url": "v1/uploads/00000000.../file-chunks",
"status": "pending"
}
]
}
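Because each files[] entry in the init response says whether the file needs chunking, a client can sort the manifest into two work lists before sending anything. The sketch below assumes the response shape shown above; split_by_chunking is a hypothetical helper.

```python
def split_by_chunking(init_response: dict):
    """Split manifest paths into (direct, chunked) based on requires_chunking."""
    direct, chunked = [], []
    for entry in init_response.get("files", []):
        # Entries without the flag are treated as single-request uploads.
        target = chunked if entry.get("requires_chunking") else direct
        target.append(entry["relative_path"])
    return direct, chunked
```

Paths in the first list go to the /files/{relativePath} endpoint; paths in the second go through the file-chunks flow.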
Upload Individual File
POST /proj_ABC123/v1/uploads/{sessionId}/files/{relativePath}
Headers
| Header | Description |
|---|---|
| X-File-Checksum | SHA256 checksum of the file content (hex-encoded). See known issue below. |
Known issue: header-name split-brain.
The official CLI (xeroctl) sends
X-File-Checksum for this endpoint, and that
is the header name documented above. The server-side
upload handler currently reads
X-Chunk-Checksum on the same route. Until
the two are reconciled, integrators following this
documentation verbatim may see the server treat the
checksum as empty (and, depending on validation mode,
either pass data unchecked or surface a checksum-mismatch
error). To be forward-compatible, clients SHOULD send
both headers with the same value:
CHECKSUM=$(sha256sum config.json | cut -d' ' -f1)
curl -X POST "https://api.xerotier.ai/proj_ABC123/v1/uploads/$SESSION_ID/files/config.json" \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "X-File-Checksum: $CHECKSUM" \
-H "X-Chunk-Checksum: $CHECKSUM" \
--data-binary @config.json
Response
{
"relative_path": "config.json",
"size": 1234,
"checksum": "a1b2c3d4e5f6...",
"uploaded_file_count": 1,
"expected_file_count": 7,
"progress": 14.28
}
Directory Upload Example
import hashlib
import requests
from pathlib import Path
API_KEY = "xero_myproject_your_api_key"
BASE_URL = "https://api.xerotier.ai/proj_ABC123/v1"
AUTH = {"Authorization": f"Bearer {API_KEY}"}
def upload_directory(model_dir: str, model_name: str):
    model_path = Path(model_dir)
    # Build the manifest; as_posix() keeps forward slashes in relative paths on every OS
    files = [{"relative_path": f.relative_to(model_path).as_posix(), "size": f.stat().st_size}
             for f in model_path.rglob("*") if f.is_file()]
    # Initialize
    resp = requests.post(f"{BASE_URL}/uploads/directory",
        headers=AUTH,
        json={"model_name": model_name, "files": files})
    resp.raise_for_status()
    session_id = resp.json()["id"]
    # Upload each file in a single request. Files larger than chunk_size are
    # flagged requires_chunking in the init response and need the file-chunks
    # endpoints instead (see Chunked File Upload).
    for file_info in files:
        data = (model_path / file_info["relative_path"]).read_bytes()
        checksum = hashlib.sha256(data).hexdigest()
        # Send both header names until the known issue above is resolved
        resp = requests.post(f"{BASE_URL}/uploads/{session_id}/files/{file_info['relative_path']}",
            headers={**AUTH, "X-File-Checksum": checksum, "X-Chunk-Checksum": checksum},
            data=data)
        resp.raise_for_status()
    # Complete
    resp = requests.post(f"{BASE_URL}/uploads/{session_id}/complete", headers=AUTH)
    resp.raise_for_status()
    return resp.json()
upload_directory("./models--Qwen--Qwen3-0.6B", "Qwen3-0.6B")
Node.js Directory Upload Example
import { readFile, stat, readdir } from "node:fs/promises";
import { createHash } from "node:crypto";
import { join, relative } from "node:path";
const API_KEY = "xero_myproject_your_api_key";
const BASE_URL = "https://api.xerotier.ai/proj_ABC123/v1";
async function getFiles(dir, base) {
const entries = await readdir(dir, { withFileTypes: true });
const files = [];
for (const entry of entries) {
const fullPath = join(dir, entry.name);
if (entry.isFile()) {
const info = await stat(fullPath);
files.push({ relative_path: relative(base, fullPath), size: info.size });
} else if (entry.isDirectory()) {
files.push(...await getFiles(fullPath, base));
}
}
return files;
}
async function uploadDirectory(modelDir, modelName) {
const files = await getFiles(modelDir, modelDir);
// Initialize
const initResponse = await fetch(`${BASE_URL}/uploads/directory`, {
method: "POST",
headers: {
"Authorization": `Bearer ${API_KEY}`,
"Content-Type": "application/json"
},
body: JSON.stringify({ model_name: modelName, files })
});
const init = await initResponse.json();
const sessionId = init.id;
// Upload each file
for (const fileInfo of files) {
const data = await readFile(join(modelDir, fileInfo.relative_path));
const checksum = createHash("sha256").update(data).digest("hex");
await fetch(
`${BASE_URL}/uploads/${sessionId}/files/${fileInfo.relative_path}`,
{
method: "POST",
headers: {
"Authorization": `Bearer ${API_KEY}`,
"X-File-Checksum": checksum,
"X-Chunk-Checksum": checksum
},
body: data
}
);
}
// Complete
const completeResponse = await fetch(
`${BASE_URL}/uploads/${sessionId}/complete`,
{
method: "POST",
headers: { "Authorization": `Bearer ${API_KEY}` }
}
);
return completeResponse.json();
}
await uploadDirectory("./models--Qwen--Qwen3-0.6B", "Qwen3-0.6B");
Complete Directory Upload
After all files are uploaded, call the complete endpoint to finalize the model record and trigger metadata extraction:
POST /proj_ABC123/v1/uploads/{sessionId}/complete
Returns a RouterUploadCompleteResponse with the same shape as the archive complete response above.
Chunked File Upload (Directory Mode)
- init: POST /uploads/directory
- chunks: POST /uploads/{id}/file-chunks/{n}
- file: POST /uploads/{id}/file-complete
- session: POST /uploads/{id}/complete
Individual files larger than chunk_size (100 MiB by default) cannot be sent in a single POST /v1/uploads/{sessionId}/files/{relativePath} request, because the server enforces the per-request body cap. For these files, the directory init response marks the entry with requires_chunking: true, returns a total_chunks count, and advertises a chunk_url ending in /file-chunks. Use the chunk endpoint below for each chunk, then call file-complete to finalize the file before completing the session.
Upload a File Chunk
POST /proj_ABC123/v1/uploads/{sessionId}/file-chunks/{chunkIndex}?relative_path={relativePath}
Path Parameters
| Parameter | Description |
|---|---|
| sessionId | Upload session ID returned by directory init. |
| chunkIndex | Zero-based chunk index for the file. |
Query Parameters
| Parameter | Description |
|---|---|
| relative_path (required) | Relative path of the file being chunked, matching an entry in the directory manifest. |
Headers
| Header | Description |
|---|---|
| X-Chunk-Checksum (required) | SHA256 hex digest of this chunk's bytes. Empty values are rejected with 400. |
Body is the raw chunk bytes; the per-request cap matches the chunk_size returned by directory init.
CHECKSUM=$(sha256sum chunk-0.bin | cut -d' ' -f1)
curl -X POST "https://api.xerotier.ai/proj_ABC123/v1/uploads/$SESSION_ID/file-chunks/0?relative_path=model.safetensors" \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "X-Chunk-Checksum: $CHECKSUM" \
--data-binary @chunk-0.bin
Complete a Chunked File
After all chunks for a single file have been uploaded, call
file-complete to assemble and validate that file. This is
the per-file finalization hook; the session-level
/complete endpoint is still required afterwards
to finalize the model record.
POST /proj_ABC123/v1/uploads/{sessionId}/file-complete?relative_path={relativePath}
curl -X POST "https://api.xerotier.ai/proj_ABC123/v1/uploads/$SESSION_ID/file-complete?relative_path=model.safetensors" \
-H "Authorization: Bearer xero_myproject_your_api_key"
Chunked File Upload Flow
- Initialize a directory upload; inspect each files[] entry for requires_chunking.
- For non-chunked files, POST to /files/{relativePath} as shown in Directory Upload.
- For chunked files, slice the file into chunk_size-byte parts and POST each one to /file-chunks/{chunkIndex}?relative_path=... with X-Chunk-Checksum.
- Call /file-complete?relative_path=... once all chunks for that file are uploaded.
- Repeat per file, then call the session-level /complete endpoint.
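The per-file part of this flow could be sketched as follows. To keep the example free of network calls, the HTTP layer is passed in as a post callable; that callable, its signature, and the function name upload_chunked_file are assumptions made for illustration, while the endpoint paths, the relative_path query parameter, and the X-Chunk-Checksum header come from this page.

```python
import hashlib
from typing import Callable, Dict, Optional

# post(path, params, headers, body) is expected to perform one authenticated
# POST against the project's base URL. This signature is hypothetical.
PostFn = Callable[[str, Dict[str, str], Dict[str, str], Optional[bytes]], None]


def upload_chunked_file(post: PostFn, session_id: str, relative_path: str,
                        local_path: str, chunk_size: int) -> int:
    """Send one large file chunk by chunk, then finalize it.

    Returns the number of chunks sent. chunk_size should come from the
    directory init response.
    """
    params = {"relative_path": relative_path}
    index = 0
    with open(local_path, "rb") as f:
        # Read at most chunk_size bytes at a time so memory use stays bounded.
        while chunk := f.read(chunk_size):
            headers = {"X-Chunk-Checksum": hashlib.sha256(chunk).hexdigest()}
            post(f"/uploads/{session_id}/file-chunks/{index}", params, headers, chunk)
            index += 1
    # Per-file finalization; the session-level /complete call is still needed later.
    post(f"/uploads/{session_id}/file-complete", params, {}, None)
    return index
```

In practice, post might wrap requests.post with the base URL and Authorization header added, raising on any non-success status so a failed chunk stops the loop.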
Resuming Interrupted Uploads
If an upload is interrupted, use the resume endpoint to find out which chunks or files still need to be uploaded:
POST /proj_ABC123/v1/uploads/{sessionId}/resume
curl -X POST https://api.xerotier.ai/proj_ABC123/v1/uploads/00000000-1111-0000-1111-000000000000/resume \
-H "Authorization: Bearer xero_myproject_your_api_key"
Response
{
"id": "00000000-1111-0000-1111-000000000000",
"next_chunk_index": 5,
"uploaded_chunks": 5,
"missing_chunks": []
}
Resume by uploading starting from next_chunk_index, or re-uploading any missing_chunks. Sessions expire 24 hours after creation (wall-clock from created_at), regardless of in-flight activity. See Status Lifecycle.
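One way to turn a resume response into a work list is sketched below. It assumes that missing_chunks lists gaps below next_chunk_index and that total_chunks is taken from the original init response; neither assumption is confirmed by this page, and chunks_to_send is a hypothetical helper.

```python
def chunks_to_send(resume: dict, total_chunks: int) -> list:
    """Return the sorted zero-based chunk indices that still need uploading."""
    # Gaps reported by the server (assumed to be chunks that never arrived).
    pending = set(resume.get("missing_chunks", []))
    # Everything from next_chunk_index to the end has not been sent yet.
    pending.update(range(resume["next_chunk_index"], total_chunks))
    return sorted(pending)
```

With next_chunk_index 5, no missing chunks, and 12 total chunks, the helper returns indices 5 through 11.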
Status Lifecycle
Two independent status fields appear in upload
responses: the upload session status (top-level
status) and the model record status
(model.status in the complete response). They
have different value sets and lifecycles.
Upload Session Status
| Value | Meaning |
|---|---|
| pending | Session created; awaiting parts or files. |
| completed | All parts/files received and the session-level complete endpoint succeeded. |
| cancelled | Session was explicitly cancelled (via the cancel endpoint) or expired. |
Sessions expire 24 hours after creation. Plan resumes
inside that window; after expiry the session moves to
cancelled and chunks must be re-uploaded under
a fresh session. See
Uploads API status
lifecycle for the canonical reference.
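Before attempting a resume, a client can check locally whether the session is still inside its window. The sketch below assumes that expires_at is a Unix timestamp in seconds, as the example responses suggest; the helper names are hypothetical.

```python
import time
from typing import Optional


def seconds_until_expiry(upload: dict, now: Optional[float] = None) -> float:
    """Seconds left before the session expires (negative once expired)."""
    now = time.time() if now is None else now
    return upload["expires_at"] - now


def can_resume(upload: dict, now: Optional[float] = None) -> bool:
    """True while the session is still pending and has not yet expired."""
    return upload["status"] == "pending" and seconds_until_expiry(upload, now) > 0
```

This is only a local pre-check based on the client's clock; the server remains the authority on whether a session is still valid.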
Model Record Status
After a successful session complete, the response contains
a model object whose status
reflects post-upload validation (for example
validating, transitioning to a terminal state
once metadata extraction and weight verification finish).
This is a separate lifecycle from the upload session and
continues running after the upload session is
completed.
Required Model Files
For inference engine compatibility, models must include:
| File | Required | Description |
|---|---|---|
| config.json | Yes | Model configuration |
| *.safetensors or *.bin | Yes | Model weights (at least one). .safetensors is preferred; .bin (PyTorch pickle) is accepted for compatibility. Other engine-specific formats such as .exl2 (ExLlamaV2) are not supported by the inference engines provisioned in this project. |
| tokenizer.json | No | Tokenizer configuration |
| tokenizer_config.json | No | Tokenizer settings |
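A local pre-flight check against these requirements might look like the sketch below. It assumes the weights sit at the top level of the model directory, and missing_required_files is a hypothetical helper; server-side validation may apply further checks.

```python
from pathlib import Path


def missing_required_files(model_dir: str) -> list:
    """Return a list of required items that are absent from model_dir."""
    root = Path(model_dir)
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json")
    # At least one weight file in either accepted format is required.
    has_weights = any(root.glob("*.safetensors")) or any(root.glob("*.bin"))
    if not has_weights:
        problems.append("*.safetensors or *.bin")
    return problems
```

An empty list means the directory has the required files; tokenizer files are optional and are not checked.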