// API Reference

Server-Side Tools

Tools the router runs itself, with no round-trip to your code. Research, artifacts, project intelligence, memory, and governed execution, all callable mid-completion. Each entry below carries its schema, latency profile, and billing footprint.

The tools fall into six categories:

  • Research: tools for gathering information from external sources.
  • Content: tools for creating and updating persistent artifacts.
  • Interaction: tools for requesting input from the user mid-stream.
  • Knowledge: tools for saving, recalling, and searching stored data.
  • Project Intelligence: tools for tracking decisions, milestones, relationships, and generating project briefings.
  • Execution: tools for invoking, polling, and canceling governed XEM tool runs.

Tool Summary

Tool Category Description
x_web_search Research Built-in web search
x_fetch_url Research Fetch and extract text from web pages and PDFs
x_code_search Research GitHub code search (snippet results)
x_repo_overview Research Full GitHub repository overview (metadata, languages, README, tree, recent commits)
x_repo_read Research Read specific files or directory listings from a GitHub repository
x_inspect_site Research Single-URL HTML and stylesheet inspection for theme, palette, and typography extraction
x_calculator Research Evaluate mathematical expressions safely
x_create_artifact Content Create persistent artifacts (code, docs, HTML, SVG, diagrams)
x_update_artifact Content Update a previously created artifact with new content
x_create_mockup Content Create a multi-file mockup bundle for visual preview in a sandboxed iframe (direct API only; not advertised to agentic chats)
x_add_mockup_file Content Incrementally write one file at a time into a mockup bundle keyed by identifier (agentic-mode preferred path)
x_update_mockup Content Update files in a previously created mockup bundle
x_read_mockup Content Read mockup bundle contents (manifest, single file, or full inline). 256 KB cap on returned content.
x_read_artifact Content Read a single artifact's metadata and inline content (1 MiB cap; metadata-only above that)
x_list_artifacts Content List artifacts visible to the calling workspace (optionally filtered by chat and type)
x_ask_user Interaction Ask the user a clarifying question with optional structured input
x_save_memory Knowledge Save facts, preferences, or instructions to per-workspace memory
x_recall_memory Knowledge Search saved memories using semantic similarity
x_forget Knowledge Retract a memory, decision, milestone, relation, or artifact created earlier in the session
x_workspace_search Knowledge Unified semantic search across workspace memories, documents, and artifacts
x_track_decision Project Intelligence Record structured architectural or design decisions
x_query_decisions Project Intelligence Search and filter recorded decisions
x_track_milestone Project Intelligence Record project milestones and goals
x_query_timeline Project Intelligence Query the project timeline of milestones
x_project_brief Project Intelligence Generate a project state briefing
x_relate Project Intelligence Create relationships between workspace entities
x_query_graph Project Intelligence Return the workspace knowledge graph (decisions + milestones + relations) as one composite payload
x_analyze_workspace Project Intelligence Synthesize a structured brief of the workspace context (documents, memories, artifacts, chat history)
x_exec_invoke Execution Invoke a governed XEM tool via the MCP exec adapter; returns the invocation id for polling
x_exec_poll Execution Poll the lifecycle status of an exec invocation
x_exec_cancel Execution Cancel an in-flight exec invocation
x_exec_invoke_blocking Execution Long-lived blocking exec dispatch with streaming progress (MCP-only; 900s timeout)

Enabling Server-Side Tools

Toggleable research tools (x_web_search, x_code_search, x_repo_overview, x_repo_read, x_inspect_site) must be explicitly opted into via the UI or x_tools. The remaining tools are injected automatically unless noted otherwise:

  • x_create_artifact and x_update_artifact, always injected in chat context. x_read_artifact and x_list_artifacts are also chat-visible.
  • x_add_mockup_file and x_update_mockup, injected only in agentic mode (when the workspace's agenticByDefault is enabled or the request explicitly opts into agentic mode). x_create_mockup stays registered for direct POST /v1/responses callers that ship their own tools array, but is not advertised to agentic chats. x_read_mockup is in the agentic default tool set.
  • x_ask_user, always injected in chat context (dispatched via the chat-completions SSE hook; no server-side tool implementation file).
  • x_save_memory, x_recall_memory, and x_forget, always injected in chat context.
  • x_workspace_search, always injected in chat context for unified search across workspace content.
  • x_calculator and x_fetch_url, always injected (always-on tier).
  • x_track_decision, x_query_decisions, x_track_milestone, x_query_timeline, x_project_brief, x_relate, x_query_graph, x_analyze_workspace, always injected (project intelligence tools).
  • x_exec_invoke, x_exec_poll, x_exec_cancel, chat-toggleable governed-XEM execution adapter. x_exec_invoke_blocking is MCP-only.

Chat Completions API

Add web_search_options to a standard /v1/chat/completions request. Use the x_tools array to select which research tools to enable:

JSON
{ "model": "my-model", "messages": [ {"role": "user", "content": "Search for the latest Rust async patterns"} ], "stream": true, "web_search_options": { "search_context_size": "medium", "max_iterations": 5, "x_tools": ["x_web_search", "x_code_search", "x_calculator"] } }

Responses API

Include a web_search_preview tool in the tools array:

JSON
{ "model": "my-model", "input": "Find the latest research on transformer architectures", "stream": true, "tools": [ { "type": "web_search_preview", "search_context_size": "medium", "x_tools": ["x_web_search", "x_fetch_url", "x_code_search"] } ] }

x_tools Selection

Behavior Details
Omitted or empty Defaults to ["x_web_search", "x_fetch_url"].
x_web_search included x_fetch_url is auto-included for URL follow-up.
Invalid names Silently ignored. Falls back to defaults if all are invalid.
x_code_search, x_repo_overview, x_repo_read Available when GitHub code search is enabled for your endpoint.
x_web_search Available when web search is enabled for your endpoint. Where no search backend is configured, requests for x_web_search are silently dropped.
x_calculator Always available. Evaluates mathematical expressions server-side.
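
The selection rules in the table above can be sketched as a small resolver. This is an illustration of the documented behavior, not the router's code: the function name resolve_x_tools and the KNOWN_TOOLS allow-list are assumptions, and per-endpoint availability (web search or GitHub search being disabled) is not modeled.

```python
# Sketch of the documented x_tools selection rules; not the router's actual code.
DEFAULT_TOOLS = ["x_web_search", "x_fetch_url"]

# Assumption: the names x_tools accepts, taken from the tool list in this
# reference. The real allow-list may differ per endpoint.
KNOWN_TOOLS = {
    "x_web_search", "x_fetch_url", "x_code_search", "x_repo_overview",
    "x_repo_read", "x_inspect_site", "x_calculator",
}


def resolve_x_tools(requested):
    """Return the effective tool list for a requested x_tools value."""
    if not requested:  # omitted (None) or empty list
        return list(DEFAULT_TOOLS)
    selected = []
    for name in requested:
        # Invalid names are ignored; duplicates are collapsed.
        if name in KNOWN_TOOLS and name not in selected:
            selected.append(name)
    if not selected:  # every requested name was invalid
        return list(DEFAULT_TOOLS)
    if "x_web_search" in selected and "x_fetch_url" not in selected:
        selected.append("x_fetch_url")  # auto-included for URL follow-up
    return selected
```

With these assumptions, resolve_x_tools(["x_web_search", "x_code_search", "x_calculator"]) also returns x_fetch_url, which matches the auto-include row in the table.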

Humanize Output

Humanize Output is a workspace policy, not an LLM-callable tool. Every assistant turn produced for a chat in that workspace runs through a deterministic post-synthesis filter before the content is flushed to the chat bubble and persisted. The filter strips AI tells (sycophantic openers, marketing adjectives, em-dash spam, redundant scaffolding, hedge filler) while preserving code, file paths, URLs, numbers, math, and citation markers byte-identical.

The filter is server-side and chat-page-scoped. External SDKs calling /v1/chat/completions directly do not pass through it; only chats rendered in the Xerotier chat-page UI invoke it.

How it works

The filter runs six pipeline stages. Each stage either passes its input through or aborts safely: if any invariant trips, the original assistant draft is returned byte-identical and the abort is logged at .notice with the precise reason.

  1. Protect-zone tokenizer. Masks code fences, inline code, URLs, file paths, numeric literals, math expressions, and citation tokens behind opaque sentinels so later stages cannot rewrite them.
  2. Deterministic scrub. Site-ordered rule engine applies ~32 named rules covering openers, wrap-ups, banned phrases, marketing adjectives, transitional cliches, em-dash discipline, header demotion, and tricolon flattening.
  3. Optional LLM polish. When the workspace's humanize_polish_enabled is on and the scrubbed draft is over 120 characters, sends the text to the workspace's configured polish endpoint (defaults to the workspace small endpoint) with a strict rewrite prompt. Failure (timeout, non-200, parse error) falls through to the next stage with the scrubbed draft; polish never blocks an assistant turn.
  4. Post-polish re-scrub. Runs the same rule engine again. Polish models occasionally reintroduce tells; the scrubber is the floor.
  5. Safety net. Aborts and returns the original draft if any sentinel went missing or got duplicated, any banned floor phrase survived, or the masked-text length changed by more than 40% (with a 32-character minimum-length floor).
  6. Unmask. Replaces every sentinel with its original protected span.
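
The mask, scrub, safety-net, and unmask stages above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the platform's filter: the sentinel format, the single scrub rule, the reduced protect pattern (fenced code, inline code, and http(s) URLs only), and the reading of the 32-character floor are all hypothetical. The polish stage is left out.

```python
import re

FENCE = "`" * 3  # built at runtime so the literal fence stays out of this example
# Assumption: a reduced protect pattern. The documented tokenizer also masks
# file paths, numeric literals, math expressions, and citation tokens.
PROTECT = re.compile(FENCE + r".*?" + FENCE + r"|`[^`\n]+`|https?://\S+", re.DOTALL)


def mask(text):
    """Replace protected spans with sentinels; return (masked_text, spans)."""
    spans = []

    def stash(match):
        spans.append(match.group(0))
        return f"\x00{len(spans) - 1}\x00"  # hypothetical sentinel format

    return PROTECT.sub(stash, text), spans


def scrub(masked):
    """Stand-in for the rule engine: removes one sycophantic opener."""
    return re.sub(r"^Great question!\s*", "", masked)


def is_safe(masked, rewritten, span_count):
    """Safety net: each sentinel appears exactly once, length moved by <= 40%."""
    for index in range(span_count):
        if rewritten.count(f"\x00{index}\x00") != 1:
            return False
    if len(masked) >= 32:  # assumed interpretation of the minimum-length floor
        if abs(len(rewritten) - len(masked)) / len(masked) > 0.40:
            return False
    return True


def unmask(masked, spans):
    """Restore every protected span from its sentinel."""
    return re.sub(r"\x00(\d+)\x00", lambda m: spans[int(m.group(1))], masked)


def humanize(draft):
    masked, spans = mask(draft)
    rewritten = scrub(masked)
    if not is_safe(masked, rewritten, len(spans)):
        return draft  # abort: the original draft is returned unchanged
    return unmask(rewritten, spans)
```

Because the scrub only ever sees sentinels, it cannot rewrite the protected spans; the safety net then decides whether the rewrite is kept or the original draft is returned.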

Enabling the policy

Open the chat workspace settings tab and find the Humanize output card under Workspace behaviour. The main switch turns the deterministic scrub on or off for every chat in the workspace. A sub-card exposes:

  • LLM polish: when on, the polish pass runs after the deterministic scrub. Higher quality, adds ~1-3 s latency per turn.
  • Polish endpoint: the workspace endpoint the polish call targets. Defaults to workspace-small (auto), which selects the same small/cheap endpoint chat-name generation uses.
  • Voice anchor: a per-workspace textarea that conditions the polish system prompt on a custom voice register (see below).

Per-chat override

Each chat carries a tri-state override chip in the model-options panel: Inherit | On | Off. Inherit defers to the workspace toggle; On and Off force the filter for that single chat regardless of the workspace setting. Use Off when humanization is interfering with a precise technical answer.

Voice anchoring

The platform ships a built-in default voice block compiled into the frontend service. When the polish sub-toggle is on, the polish system prompt is conditioned on that voice register as "Match this voice: ...". This is what turns "less AI-sounding" into "sounds like this project."

The chat-settings Voice anchor textarea pre-populates from the built-in default when the workspace has not stored an override. Three states are honored:

  • Override set (non-empty): the text is sent verbatim as the voice line. Use this to give one workspace a distinct register without affecting others.
  • Override set (empty): explicit opt-out. The polish prompt omits the voice line entirely for this workspace.
  • No override (default): the built-in default voice block is used. Clear the textarea and save to enter the opt-out state; re-saving the seeded default text installs it as an explicit override.
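
The three states can be expressed as a small function. This is a sketch only: voice_line and BUILTIN_VOICE are hypothetical names, the built-in voice text is a placeholder, and whether a whitespace-only override counts as empty is not specified here.

```python
from typing import Optional

# Placeholder: the real built-in voice block is compiled into the frontend
# service and is not reproduced in this reference.
BUILTIN_VOICE = "<built-in default voice block>"


def voice_line(stored_override: Optional[str]) -> Optional[str]:
    """Return the voice line for the polish prompt, or None when opted out."""
    if stored_override is None:
        # No override stored: fall back to the built-in default.
        return f"Match this voice: {BUILTIN_VOICE}"
    if stored_override == "":
        # Override stored but empty: explicit opt-out, no voice line at all.
        return None
    # Non-empty override: sent verbatim.
    return f"Match this voice: {stored_override}"
```

The distinction that matters is between "nothing stored" (None) and "empty string stored", which is why the sketch does not use a plain truthiness check.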

Notes and limitations:

  • The built-in default lives in compiled code, not on disk. Operators who need to change the platform-wide default ship a new frontend release; operators who need a per-workspace variation use the textarea.
  • The voice block is injected only when the LLM polish sub-toggle is on. The deterministic scrub runs regardless and does not read the voice block.
  • The polish prompt budgets roughly 800 characters for the voice line; longer overrides are passed through and may cost extra prompt tokens.

What is preserved verbatim

The protect-zone tokenizer ensures the following spans pass through every stage byte-identical. If any sentinel is missing or duplicated after polish, the safety net aborts and returns the original draft.

  • Fenced code blocks (triple backtick)
  • Inline code (single backtick)
  • Markdown links and bare URLs (http, https, ws, wss, file)
  • Absolute and relative file paths
  • Numeric literals (integers, floats, hex, scientific notation, percentages)
  • Math expressions ($...$ and $$...$$)
  • Citation tokens like [^1]

Operator visibility

A workspace-level Show humanized chip toggle exposes a small humanized chip on assistant bubbles whose content went through the filter. Hidden by default; flip it on when you want visual confirmation during QA. Hovering the chip shows whether the message was "scrub only" or "scrub + polish".

Admin observability

GET /admin/humanize/stats returns an admin-gated JSON snapshot covering:

  • Per-stage invocation counts (scrub, polish, aborted, skipped).
  • Per-reason abort counts (missing_sentinel, duplicated_sentinel, banned_phrase_floor, length_delta, polish_timeout, polish_error).
  • Top rules by hit count.
  • Polish-stage latency P50 and P95.

Prometheus metrics emitted under the xerotier_humanize_* namespace (_invocations_total, _aborts_total, _polish_tokens_total, _rule_hits_total, _latency_seconds) live on FrontendMetrics.shared.

Audit trail

Every assistant message produced by a humanized turn stores the raw pre-humanize draft alongside the humanized content. The audit column is chat_messages.pre_humanize_content and is nullable; it is only populated when the filter actually ran for that row.

x_fetch_url

Fetches and extracts readable text content from web pages and PDF documents. Supports single-URL fetch, parallel multi-URL fetch, link discovery for site exploration, and automatic same-domain link following.

Parameters

Parameter Type Description
url (required) string A single URL to fetch content from. Must be http or https.
urls (optional) string[] Multiple URLs to fetch in parallel (max 5). Intended as an alternative to url; if both are provided, the two are merged and deduplicated.
discover_links (optional) boolean Extract links from fetched pages for site exploration. Default: false.

Modes

  • Single URL, provide only url without discovery flags. Returns extracted plain text directly.
  • Multi-URL, provide urls array for parallel fetching. Returns a JSON object with a pages array, each containing url, content, and error fields.
  • Discovery, set discover_links: true to extract links from fetched pages.

SSRF Protection

All URL fetches are protected against Server-Side Request Forgery (SSRF). Requests to private and internal IP ranges are blocked: 10.x, 172.16-31.x, 192.168.x, 127.x, ::1, fe80::/10, 169.254.x.x.
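
A minimal sketch of the range check, using Python's standard ipaddress module and the ranges listed above. The function name is_blocked is hypothetical, and this reference does not describe how the router resolves hostnames or handles redirects, so the sketch covers only the address test itself.

```python
import ipaddress

# The blocked ranges listed above, written as CIDR networks.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",       # 10.x
        "172.16.0.0/12",    # 172.16-31.x
        "192.168.0.0/16",   # 192.168.x
        "127.0.0.0/8",      # 127.x
        "169.254.0.0/16",   # 169.254.x.x
        "::1/128",          # IPv6 loopback
        "fe80::/10",        # IPv6 link-local
    )
]


def is_blocked(ip: str) -> bool:
    """Return True when the address falls inside a blocked range."""
    address = ipaddress.ip_address(ip)
    return any(
        address.version == network.version and address in network
        for network in BLOCKED_NETWORKS
    )
```

A check like this is only meaningful when applied to the address a hostname actually resolves to, not to the hostname string.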

Character Budget

Multi-URL fetches share a total budget of 24,000 characters, distributed equally across all successfully fetched pages. Pages exceeding their share are truncated.
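
The equal split can be illustrated as follows. This is a sketch under assumptions: apply_budget is a hypothetical name, integer division is used for the share, and budget left unused by short pages is not redistributed, which this reference does not specify.

```python
TOTAL_BUDGET = 24_000  # characters shared by all successfully fetched pages


def apply_budget(pages):
    """Truncate each page to an equal share of the total budget.

    pages maps URL -> extracted text for pages that fetched successfully.
    """
    if not pages:
        return {}
    share = TOTAL_BUDGET // len(pages)
    return {url: text[:share] for url, text in pages.items()}
```

With four successful pages each page is capped at 6,000 characters; with three, at 8,000.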

Use Cases

  • Reading documentation pages in full
  • Extracting article content for summarization
  • Crawling related pages on the same domain
  • Fetching PDF text content

Example: Single Fetch

JSON
{ "name": "x_fetch_url", "arguments": { "url": "https://docs.example.com/api/authentication" } }

Example: Multi-Fetch with Discovery

JSON
{ "name": "x_fetch_url", "arguments": { "urls": [ "https://docs.example.com/guide/getting-started", "https://docs.example.com/guide/configuration" ], "discover_links": true } }

Example Multi-Fetch Response

JSON
{ "discover_links_enabled": true, "total_pages": 4, "pages": [ { "url": "https://docs.example.com/guide/getting-started", "content": "Getting Started. Install the SDK with npm install...", "error": false, "discovered_links": [ {"url": "https://docs.example.com/guide/authentication", "text": "Authentication"}, {"url": "https://docs.example.com/guide/advanced", "text": "Advanced Usage"} ] }, { "url": "https://docs.example.com/guide/configuration", "content": "Configuration. Set environment variables to customize...", "error": false, "discovered_links": [] }, { "url": "https://docs.example.com/guide/authentication", "content": "Authentication. Use API keys or OAuth2 tokens...", "error": false, "followed_from_discovery": true }, { "url": "https://docs.example.com/guide/advanced", "content": "Advanced Usage. Configure retry policies and timeouts...", "error": false, "followed_from_discovery": true } ] }

SSE Event

SSE
data: {"type":"x_research.reading","name":"x_fetch_url","arguments":"{\"url\":\"https://docs.example.com/api/authentication\"}"}
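
In the event above, arguments is a JSON document encoded as a string, so a client has to decode it twice. A minimal sketch of that handling follows; parse_sse_data is a hypothetical helper, and it covers only single-line data: payloads like the one shown.

```python
import json


def parse_sse_data(line: str):
    """Parse one 'data: {...}' line into a dict; return None for other lines."""
    if not line.startswith("data:"):
        return None
    event = json.loads(line[len("data:"):].strip())
    # The arguments field is itself JSON text, so decode it a second time.
    if isinstance(event.get("arguments"), str):
        event["arguments"] = json.loads(event["arguments"])
    return event


line = (
    r'data: {"type":"x_research.reading","name":"x_fetch_url",'
    r'"arguments":"{\"url\":\"https://docs.example.com/api/authentication\"}"}'
)
event = parse_sse_data(line)
print(event["type"], event["arguments"]["url"])
```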

x_repo_overview

Returns a full GitHub repository overview: metadata, language breakdown, README content, top-level directory tree, and recent commits. Use this first to discover repository structure before reading individual files with x_repo_read.

Parameters

Parameter Type Description
repository (required) string GitHub repository. Accepts owner/repo or a full URL (e.g. https://github.com/owner/repo).

Example

JSON
{ "name": "x_repo_overview", "arguments": { "repository": "vapor/vapor" } }

x_repo_read

Reads specific files or directory listings from a GitHub repository. Use x_repo_overview first to discover the structure, then x_repo_read to fetch the files you need.

Parameters

Parameter Type Description
repository (required) string GitHub repository. Accepts owner/repo or a full URL.
path (optional) string Single file or directory path to fetch.
paths (optional) string[] Multiple file or directory paths to fetch in one call.

Example

JSON
{ "name": "x_repo_read", "arguments": { "repository": "apple/swift-nio", "paths": ["Sources/NIOCore/Channel.swift", "Sources/NIOCore/EventLoop.swift"] } }

x_inspect_site

Single-URL HTML and stylesheet inspection for extracting site theme metadata: palette, typography, structure, and key element classes. Use when the model needs to mirror a reference design rather than crawl multiple pages.

Parameters

Parameter Type Description
url (required) string The page URL to inspect.

x_calculator

Evaluates mathematical expressions server-side using a safe recursive-descent parser. No code is executed; only arithmetic, functions, and constants are supported.

Parameters

Parameter Type Description
expression (required) string Mathematical expression to evaluate. Maximum 1,000 characters.

Supported Operations

Category Supported
Operators + - * / ^
Functions sqrt, log, ln, sin, cos, tan, abs, floor, ceil, round, min, max
Constants pi, e

Use Cases

  • Unit conversions and dimensional analysis
  • Financial calculations (compound interest, amortization)
  • Scientific computation (trigonometry, logarithms)
  • Quick arithmetic the model should not hallucinate

Example Tool Call

JSON
{ "name": "x_calculator", "arguments": { "expression": "10000 * (1 + 0.05)^3" } }

Example Response

JSON
{ "expression": "10000 * (1 + 0.05)^3", "result": 11576.25 }

More Examples

Expression                 Result
sqrt(144) + 2^3            20.0
sin(pi/2)                  1.0
log(1000)                  3.0
max(42, 17) * min(3, 5)    126.0
abs(-273.15) + ceil(2.1)   276.15
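
For readers unfamiliar with the technique, here is a small recursive-descent evaluator covering the documented operators, functions, and constants. It is an independent sketch, not the router's parser. Two choices are assumptions: log is treated as base 10 (inferred from log(1000) = 3.0 above), and ^ is right-associative and binds tighter than unary minus.

```python
import math
import re

FUNCTIONS = {
    "sqrt": math.sqrt, "log": math.log10, "ln": math.log,  # log base 10: assumption
    "sin": math.sin, "cos": math.cos, "tan": math.tan,
    "abs": abs, "floor": math.floor, "ceil": math.ceil,
    "round": round, "min": min, "max": max,
}
CONSTANTS = {"pi": math.pi, "e": math.e}
TOKEN = re.compile(r"\s*(\d+\.?\d*|\.\d+|[A-Za-z_]+|[-+*/^(),])")


def tokenize(expression):
    if len(expression) > 1000:
        raise ValueError("expression exceeds 1,000 characters")
    text, pos, tokens = expression.rstrip(), 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.index = tokens, 0

    def peek(self):
        return self.tokens[self.index] if self.index < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise ValueError(f"expected {expected!r}, got {token!r}")
        self.index += 1
        return token

    def expr(self):  # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            value = value + self.term() if self.take() == "+" else value - self.term()
        return value

    def term(self):  # term := unary (('*' | '/') unary)*
        value = self.unary()
        while self.peek() in ("*", "/"):
            value = value * self.unary() if self.take() == "*" else value / self.unary()
        return value

    def unary(self):  # unary := '-' unary | power
        if self.peek() == "-":
            self.take()
            return -self.unary()
        return self.power()

    def power(self):  # power := atom ('^' unary)?   (right-associative)
        base = self.atom()
        if self.peek() == "^":
            self.take()
            return base ** self.unary()
        return base

    def atom(self):  # atom := number | constant | function '(' args ')' | '(' expr ')'
        token = self.take()
        if token == "(":
            value = self.expr()
            self.take(")")
            return value
        if token in CONSTANTS:
            return CONSTANTS[token]
        if token in FUNCTIONS:
            self.take("(")
            args = [self.expr()]
            while self.peek() == ",":
                self.take()
                args.append(self.expr())
            self.take(")")
            return float(FUNCTIONS[token](*args))
        if token[0].isdigit() or token[0] == ".":
            return float(token)
        raise ValueError(f"unexpected token {token!r}")


def evaluate(expression):
    parser = Parser(tokenize(expression))
    value = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unexpected trailing token {parser.peek()!r}")
    return value
```

Only names in the two lookup tables are ever called, so an expression cannot reach arbitrary code; unknown identifiers raise ValueError.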

SSE Event

SSE
data: {"type":"x_research.calculating","name":"x_calculator","arguments":"{\"expression\":\"10000 * (1 + 0.05)^3\"}"}

x_create_artifact

Creates a persistent artifact such as a code file, document, HTML page, SVG image, or Mermaid diagram. Artifacts are stored in the chat and can be viewed, downloaded, and updated. This tool is always injected in chat context; no opt-in is required.

Parameters

Parameter Type Description
identifier (required) string Stable slug for this artifact. 1-128 characters, ASCII alphanumeric, hyphens, underscores, and dots only. Used to reference the artifact for later updates.
title (required) string Human-readable display title for the artifact.
type (required) string MIME content type for the artifact content.
language (optional) string Programming language for syntax highlighting (e.g. python, swift, javascript). Omit for non-code artifacts.
content (required) string The full content of the artifact. Maximum 1 MB.

Supported MIME Types

MIME Type Use For
text/markdown Markdown documents, reports, notes
text/html HTML pages, interactive previews
text/plain Plain text files, configuration
image/svg+xml SVG vector graphics
text/x-mermaid Mermaid diagram definitions
application/json JSON data files
text/x-{language} Code files (e.g. text/x-python, text/x-swift)

Identifier Rules

  • 1-128 characters long
  • ASCII alphanumeric characters, hyphens, underscores, and dots only
  • Slug format, lowercase recommended (e.g. my-component.tsx, data-pipeline)
  • Must be unique within the chat for creation; reuse the same identifier with x_update_artifact
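
The character and length rules translate directly into a regular expression. A sketch of a local pre-check follows; is_valid_identifier is a hypothetical helper, and per-chat uniqueness can only be checked by the server.

```python
import re

# 1-128 characters: ASCII letters, digits, hyphens, underscores, and dots.
IDENTIFIER = re.compile(r"[A-Za-z0-9._-]{1,128}")


def is_valid_identifier(identifier: str) -> bool:
    """Check the documented character set and length; uniqueness is not checked."""
    return IDENTIFIER.fullmatch(identifier) is not None
```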

Versioning

Creating an artifact sets it to version 1. Subsequent updates via x_update_artifact increment the version number. All versions are retained for history.
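
The versioning rule can be modeled with a list per identifier. This in-memory sketch is only an illustration of the described behavior; ArtifactHistory and its methods are hypothetical names, not part of the API.

```python
class ArtifactHistory:
    """Toy model: create starts at version 1, each update adds one version."""

    def __init__(self):
        self._versions = {}  # identifier -> list of contents, oldest first

    def create(self, identifier, content):
        if identifier in self._versions:
            raise ValueError(f"{identifier!r} already exists; use update instead")
        self._versions[identifier] = [content]
        return 1

    def update(self, identifier, content):
        if identifier not in self._versions:
            raise KeyError(f"{identifier!r} has not been created")
        self._versions[identifier].append(content)  # full replacement, new version
        return len(self._versions[identifier])

    def get(self, identifier, version=None):
        history = self._versions[identifier]
        return history[-1] if version is None else history[version - 1]
```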

Use Cases

  • Generating code files with syntax highlighting
  • Creating documentation and technical reports
  • Building interactive HTML previews
  • Rendering architecture diagrams with Mermaid

Example: Code File

JSON
{ "name": "x_create_artifact", "arguments": { "identifier": "fibonacci.py", "title": "Fibonacci Generator", "type": "text/x-python", "language": "python", "content": "def fibonacci(n: int) -> list[int]:\n \"\"\"Generate the first n Fibonacci numbers.\"\"\"\n if n <= 0:\n return []\n if n == 1:\n return [0]\n seq = [0, 1]\n for _ in range(2, n):\n seq.append(seq[-1] + seq[-2])\n return seq\n\nif __name__ == \"__main__\":\n print(fibonacci(10))\n" } }

Example: Mermaid Diagram

JSON
{ "name": "x_create_artifact", "arguments": { "identifier": "auth-flow", "title": "Authentication Flow", "type": "text/x-mermaid", "content": "sequenceDiagram\n participant U as User\n participant A as API Gateway\n participant S as Auth Service\n participant D as Database\n U->>A: POST /login\n A->>S: Validate credentials\n S->>D: Query user\n D-->>S: User record\n S-->>A: JWT token\n A-->>U: 200 OK + token\n" } }

SSE Event

SSE
data: {"type":"x_artifact.created","identifier":"fibonacci.py","title":"Fibonacci Generator","version":1}

x_update_artifact

Updates a previously created artifact with new content, creating a new version. The update is a full replacement: the entire content is replaced, with no diff or patch semantics. This tool is always injected in chat context.

Parameters

Parameter Type Description
identifier (required) string The identifier of the artifact to update. Must match a previously created artifact in this chat.
content (required) string The complete updated content. Replaces the previous version entirely.

Use Cases

  • Iterating on code based on user feedback
  • Refining documents and reports
  • Updating diagrams with new components
  • Fixing bugs in previously generated code

Example Tool Call

JSON
{ "name": "x_update_artifact", "arguments": { "identifier": "fibonacci.py", "content": "from functools import lru_cache\n\n@lru_cache(maxsize=None)\ndef fibonacci(n: int) -> int:\n \"\"\"Return the nth Fibonacci number (memoized).\"\"\"\n if n < 2:\n return n\n return fibonacci(n - 1) + fibonacci(n - 2)\n\ndef fibonacci_sequence(count: int) -> list[int]:\n \"\"\"Generate the first count Fibonacci numbers.\"\"\"\n return [fibonacci(i) for i in range(count)]\n\nif __name__ == \"__main__\":\n print(fibonacci_sequence(10))\n" } }

SSE Event

SSE
data: {"type":"x_artifact.updated","identifier":"fibonacci.py","version":2}

x_create_mockup

Creates a multi-file mockup bundle (HTML, CSS, JavaScript, and other static assets) intended for visual preview. Unlike x_create_artifact, which stores a single file, a mockup bundle groups several related files that reference each other through relative paths. The bundle is rendered inside a sandboxed iframe so the model can demonstrate visual designs and interactive prototypes without exposing the host page to the bundle's scripts.

Availability: x_create_mockup stays registered on the router but is only surfaced to direct POST /v1/responses callers that ship their own tools array. The agentic chat surface advertises x_add_mockup_file (incremental, one file per call) and x_update_mockup (delete + bulk-replace against an existing bundle) instead, and only when the workspace has agenticByDefault enabled or the request explicitly opts into agentic mode.

Parameters

Parameter Type Description
identifier (required) string Stable slug for this bundle. 1-128 characters, ASCII alphanumeric, hyphens, underscores, and dots only. Used to reference the bundle for later updates.
title (required) string Human-readable display title for the mockup.
entry (required) string Relative path within the bundle of the entry document loaded by the preview iframe (e.g. index.html). Must match one of the supplied file paths.
files (required) array Array of file objects that make up the bundle. See file object schema below. At least one file is required and one of them must match entry.

File Object Schema

Field Type Description
path (required) string Relative path within the bundle (e.g. index.html, css/main.css, js/app.js). Must not start with / or contain .. segments.
content (required) string The file's content. Plain text for text files; base64 for binary assets when encoding is base64.
encoding (optional) string Either utf8 (default) or base64. Use base64 for binary assets such as images.
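
The path and entry constraints above can be checked before a bundle is sent. The following is a sketch of such a pre-check, not the server's validator; validate_bundle and its error strings are hypothetical.

```python
def validate_bundle(entry, files):
    """Return a list of problems found in a mockup bundle (empty when valid)."""
    problems = []
    if not files:
        problems.append("at least one file is required")
    paths = []
    for file in files:
        path = file.get("path", "")
        # Paths are relative: no leading slash and no '..' segments.
        if not path or path.startswith("/") or ".." in path.split("/"):
            problems.append(f"invalid path: {path!r}")
        if file.get("encoding", "utf8") not in ("utf8", "base64"):
            problems.append(f"invalid encoding for {path!r}")
        paths.append(path)
    if entry not in paths:
        problems.append("entry must match one of the supplied file paths")
    return problems
```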

Bundle Preview URL

Each persisted file in a bundle is reachable via:

HTTP
GET /v1/mockups/{bundleId}/{path}

The bundleId is returned in the x_mockup.created event (see the streaming docs). The route is intended for sandboxed-iframe rendering only; responses include a strict Content-Security-Policy and are not meant to be embedded in unsandboxed contexts.

Use Cases

  • Demonstrating a multi-file UI design (HTML + CSS + JS) inside the chat
  • Building interactive prototypes with multiple linked pages
  • Sharing a small static site preview without leaving the chat

Example Tool Call

JSON
{ "name": "x_create_mockup", "arguments": { "identifier": "landing-page", "title": "Landing Page Mockup", "entry": "index.html", "files": [ { "path": "index.html", "content": "<!doctype html>\n<html>\n<head>\n <link rel=\"stylesheet\" href=\"css/main.css\">\n</head>\n<body>\n <h1>Hello</h1>\n <script src=\"js/app.js\"></script>\n</body>\n</html>\n" }, { "path": "css/main.css", "content": "body { font-family: system-ui; margin: 2rem; }\nh1 { color: #2b6cb0; }\n" }, { "path": "js/app.js", "content": "console.log('mockup loaded');\n" } ] } }

SSE Event

SSE
data: {"type":"x_mockup.created","bundleId":"mkp_8f3k2m","identifier":"landing-page","title":"Landing Page Mockup","entry":"index.html","files":[{"path":"index.html","contentType":"text/html","size":182},{"path":"css/main.css","contentType":"text/css","size":74},{"path":"js/app.js","contentType":"application/javascript","size":28}]}

x_add_mockup_file

Writes a single file into a mockup bundle keyed by identifier. The first call for a new identifier creates the bundle (title is required on that call); subsequent calls with the same identifier add or replace files in that bundle. Each call carries one file's worth of content, so the tool-call arguments JSON stays small and survives output-token cuts that routinely truncate x_create_mockup's atomic "every file in one call" payload.

Availability: x_add_mockup_file is the preferred mockup-authoring path in agentic mode. It is advertised only when the workspace has agenticByDefault enabled or the request explicitly opts into agentic mode, and only when the caller leaves the corresponding chat-settings toggle on.

Parameters

Parameter Type Description
identifier (required) string Stable slug for the bundle. Reuse the same value across calls so files accumulate into one bundle.
title (optional) string Human-readable display title. Required on the first call for a given identifier; ignored on subsequent calls.
path (required) string Relative path within the bundle (e.g. index.html, css/main.css, js/app.js). Must not start with / or contain .. segments.
content (required) string The file's content. Plain text for text files; base64 for binary assets when encoding is base64.
encoding (optional) string Either utf8 (default) or base64. Use base64 for binary assets such as images.
entry (optional) string Entry document loaded by the preview iframe. Defaults to index.html on the first call. Honored on the first call only.

Use Cases

  • Any mockup whose files contain HTML, CSS, JavaScript, or other quote-laden content (avoids parser truncation that breaks x_create_mockup).
  • Multi-file bundles built up over several turns of a conversation.
  • Adding a file to an existing bundle without rewriting the whole bundle.

Example Tool Calls

Two calls sharing the same identifier assemble one bundle. The first call creates the bundle and supplies title; the second call adds another file.

JSON (call 1)
{ "name": "x_add_mockup_file", "arguments": { "identifier": "landing-page", "title": "Landing Page Mockup", "path": "index.html", "content": "<!doctype html>\n<html>\n<head>\n <link rel=\"stylesheet\" href=\"css/main.css\">\n</head>\n<body>\n <h1>Hello</h1>\n</body>\n</html>\n" } }
JSON (call 2)
{ "name": "x_add_mockup_file", "arguments": { "identifier": "landing-page", "path": "css/main.css", "content": "body { font-family: system-ui; margin: 2rem; }\nh1 { color: #2b6cb0; }\n" } }

SSE Events

The first call for a new identifier emits x_mockup.created; subsequent calls against the same identifier emit x_mockup.updated with the changed paths. Failures emit x_mockup.error. Payload schemas are documented in the streaming reference.

x_update_mockup

Updates an existing mockup bundle. Files listed in files are added or replaced; paths listed in delete are removed. Files that are not mentioned remain unchanged. Each updated file gets its version incremented. x_update_mockup is advertised only in agentic mode, alongside x_add_mockup_file.

Parameters

Parameter Type Description
identifier (optional) string The identifier of a previously created bundle in this chat. One of identifier or bundle_id is required.
bundle_id (optional) string The opaque bundle id returned in x_mockup.created. One of identifier or bundle_id is required.
title (optional) string New display title. Omit to leave unchanged.
entry (optional) string New entry path. Must match an existing or newly added file. Omit to leave unchanged.
files (optional) array Files to add or replace. Same schema as x_create_mockup.
delete (optional) array Array of relative paths to remove from the bundle. The current entry path cannot be deleted unless entry is also reassigned.
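
The add, replace, and delete semantics can be sketched against a plain dictionary. This is an illustration only: apply_update and the bundle shape are hypothetical, per-file version counters are not modeled, and applying files before delete is an assumption, since the ordering is not specified here.

```python
def apply_update(bundle, files=None, delete=None, entry=None):
    """Return a new bundle with the update applied.

    bundle has the shape {"entry": str, "files": {path: content}}.
    """
    new_files = dict(bundle["files"])  # unmentioned files stay unchanged
    for file in files or []:
        new_files[file["path"]] = file["content"]  # add or replace
    new_entry = entry if entry is not None else bundle["entry"]
    for path in delete or []:
        # The current entry may only be deleted when entry is reassigned.
        if path == bundle["entry"] and new_entry == bundle["entry"]:
            raise ValueError("cannot delete the entry path without reassigning entry")
        new_files.pop(path, None)
    if new_entry not in new_files:
        raise ValueError("entry must match an existing or newly added file")
    return {"entry": new_entry, "files": new_files}
```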

Use Cases

  • Iterating on a mockup based on user feedback
  • Adding a new page or asset to an existing bundle
  • Removing files that are no longer part of the design

Example Tool Call

JSON
{ "name": "x_update_mockup", "arguments": { "identifier": "landing-page", "files": [ { "path": "css/main.css", "content": "body { font-family: system-ui; margin: 2rem; background: #0b1220; color: #f0f4ff; }\nh1 { color: #7aa9ff; }\n" } ], "delete": ["js/app.js"] } }

SSE Event

SSE
data: {"type":"x_mockup.updated","bundleId":"mkp_8f3k2m","identifier":"landing-page","title":"Landing Page Mockup","entry":"index.html","changed":["css/main.css"],"deleted":["js/app.js"]}

x_ask_user

Pauses the response and asks the user a clarifying question. The model uses this when ambiguity prevents a useful response. Supports multiple interaction styles: free-text questions, multiple-choice options, confirmation prompts, rating scales, and structured form fields. This tool is always injected in chat context.

Parameters

Parameter Type Description
question (required) string The clarifying question to ask the user.
options (optional) string[] List of suggested answers for the user to pick from.
allow_free_text (optional) boolean Whether the user can type a free-text answer in addition to picking an option. Default: true.
multi_select (optional) boolean Whether the user can select more than one option. Default: false.
style (optional) string Card style. One of: default, confirm, rating, form. Default: default.
fields (optional) object[] Form fields for form style cards. Each field has label, key, and optional placeholder and required.

Fields Sub-Parameters (for form style)

Field Type Description
label (required) string Display label for the input field.
key (required) string Machine-readable key for the field value.
placeholder (optional) string Placeholder text shown in the input.
required (optional) boolean Whether the field must be filled. Default: false.

Card Styles

Default Style

A question with optional multiple-choice options and free-text input.

JSON
{ "name": "x_ask_user", "arguments": { "question": "Which programming language would you like the example in?", "options": ["Python", "JavaScript", "Go", "Rust"], "allow_free_text": true } }

Confirm Style

A Yes/No confirmation prompt.

JSON
{ "name": "x_ask_user", "arguments": { "question": "This will overwrite the existing configuration. Proceed?", "style": "confirm" } }

Rating Style

A 1-5 rating scale.

JSON
{ "name": "x_ask_user", "arguments": { "question": "How satisfied are you with this solution?", "style": "rating" } }

Form Style

Labeled input fields for collecting structured data.

JSON
{
  "name": "x_ask_user",
  "arguments": {
    "question": "Please provide the database connection details:",
    "style": "form",
    "fields": [
      {"label": "Host", "key": "host", "placeholder": "localhost", "required": true},
      {"label": "Port", "key": "port", "placeholder": "5432", "required": true},
      {"label": "Database Name", "key": "db_name", "placeholder": "myapp_production", "required": true},
      {"label": "Username", "key": "username", "placeholder": "admin"},
      {"label": "Password", "key": "password", "placeholder": "********"}
    ]
  }
}

Use Cases

  • Disambiguating vague requirements before generating code
  • Collecting structured configuration input
  • Getting user confirmation before destructive operations
  • Offering choices when multiple valid approaches exist

SSE Events

The chat-completions SSE hook converts an x_ask_user tool call into the events below. All field names are camelCase, matching the emitter in ChatCompletionsArtifactPostSynthesisHook:

SSE
data: {"type":"x_ask_user.question","askUserId":"ask_abc123","toolCallId":"call_42","question":"Which programming language?","options":["Python","JavaScript","Go","Rust"],"allowFreeText":true,"multiSelect":false,"style":"default","fields":[]}

data: {"type":"x_ask_user.pending_state","askUserId":"ask_abc123","toolCallId":"call_42","assistantContent":"...partial assistant text...","toolCalls":[]}

Notes: x_ask_user has no server-tool implementation file; dispatch is SSE-only via the chat-completions hook. Argument defaults derive from optionalBool/optionalString reads, so allow_free_text defaults to false when the model does not set it, even though the Default Style example above sets it to true.
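A client that renders the question card might decode the x_ask_user.question event as in the sketch below. The camelCase field names come from the event shown above; the AskUserQuestion class, the parse_ask_user_event helper, and the fallback values used when a field is absent are hypothetical and are not part of the Xerotier API.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AskUserQuestion:
    """Hypothetical client-side view of an x_ask_user.question event."""
    ask_user_id: str
    tool_call_id: str
    question: str
    options: list = field(default_factory=list)
    allow_free_text: bool = False  # assumed fallback when the field is absent
    multi_select: bool = False
    style: str = "default"
    fields: list = field(default_factory=list)


def parse_ask_user_event(line: str) -> Optional[AskUserQuestion]:
    """Return a question object for an x_ask_user.question SSE line, else None."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(event, dict) or event.get("type") != "x_ask_user.question":
        return None
    # Map the camelCase wire fields onto snake_case attributes.
    return AskUserQuestion(
        ask_user_id=event["askUserId"],
        tool_call_id=event["toolCallId"],
        question=event["question"],
        options=event.get("options") or [],
        allow_free_text=bool(event.get("allowFreeText", False)),
        multi_select=bool(event.get("multiSelect", False)),
        style=event.get("style") or "default",
        fields=event.get("fields") or [],
    )
```

Returning None for every other line lets the same stream reader pass unrecognized lines on to its normal content-chunk handling.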

x_save_memory

Saves a fact, preference, or instruction to the per-workspace memory store. The content is embedded as a vector and persisted to the database for future semantic recall. Near-duplicate content is automatically deduplicated via embedding similarity comparison. This tool is always injected in chat context and requires a chat ID.

Parameters

ParameterTypeDescription
contentrequired string The fact, preference, or instruction to save to memory.
categoryoptional string Category of the memory entry. One of: preference, fact, instruction. When omitted, the system infers the category from the content.

Deduplication

When saving, the system checks for existing memories with very high semantic similarity. If a near-duplicate is found, the save is skipped and the existing memory is returned instead. This prevents the memory store from accumulating redundant entries.
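The deduplication check can be pictured with the following sketch. It assumes cosine similarity over embedding vectors and a threshold of 0.95; neither the metric nor the threshold is documented here, and the save_memory helper, the in-memory list, and the "duplicate" status string are illustrative stand-ins rather than the router's implementation.

```python
import math

SIMILARITY_THRESHOLD = 0.95  # assumed value; the real threshold is not documented


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def save_memory(store, content, embedding):
    """Append to store unless a near-duplicate exists; return what was kept."""
    for existing in store:
        if cosine_similarity(existing["embedding"], embedding) >= SIMILARITY_THRESHOLD:
            # Near-duplicate found: skip the save and hand back the existing entry.
            return {"status": "duplicate", "memory": existing}
    entry = {"id": f"mem_{len(store) + 1}", "content": content, "embedding": embedding}
    store.append(entry)
    return {"status": "saved", "memory": entry}
```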

Use Cases

  • Storing user formatting preferences
  • Remembering project details and deadlines
  • Saving coding style instructions
  • Persisting domain-specific knowledge for the conversation

Example Tool Call

JSON
{ "name": "x_save_memory", "arguments": { "content": "User prefers TypeScript with strict mode enabled for all code examples", "category": "preference" } }

Example Response

JSON
{ "status": "saved", "memory_id": "mem_x7k9p2", "content": "User prefers TypeScript with strict mode enabled for all code examples", "category": "preference" }

For full details on memory architecture, passive injection, and the management API, see Chat Memory.

x_recall_memory

Searches the per-workspace memory store using semantic similarity. Returns the top matching memories ranked by relevance score. This tool is always injected in chat context and requires a chat ID.

Parameters

ParameterTypeDescription
queryrequired string Natural language query to search memories against.
limitoptional integer Max memories to return.
offsetoptional integer Zero-based row offset for pagination.

Return Value

Returns a ranked markdown list of matching memories.
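As a rough illustration of how limit and offset interact with relevance ranking, the sketch below scores stored memories against a query embedding, sorts them, and renders one page as a markdown list. The cosine scoring, the default limit of 5, and the exact list format are assumptions made for the example; recall_markdown is a hypothetical name.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recall_markdown(memories, query_embedding, limit=5, offset=0):
    """Rank memories by similarity and render rows offset..offset+limit as markdown."""
    scored = [(cosine_similarity(m["embedding"], query_embedding), m) for m in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    page = scored[offset:offset + limit]
    lines = [
        f"{offset + i + 1}. {memory['content']} (relevance {score:.2f})"
        for i, (score, memory) in enumerate(page)
    ]
    return "\n".join(lines)
```

Because offset is a zero-based row offset, a second page of size 5 would be requested with limit=5 and offset=5.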

Use Cases

  • Retrieving user preferences before generating output
  • Looking up previously mentioned facts and context
  • Checking for saved instructions before starting a task

Example Tool Call

JSON
{ "name": "x_recall_memory", "arguments": { "query": "What programming language does the user prefer?" } }

Example Response

JSON
{
  "memories": [
    {
      "id": "mem_x7k9p2",
      "content": "User prefers TypeScript with strict mode enabled for all code examples",
      "category": "preference",
      "relevance": 0.94,
      "created_at": "2026-03-09T14:22:00Z"
    },
    {
      "id": "mem_r3m8q1",
      "content": "Always use ESLint with the recommended ruleset",
      "category": "instruction",
      "relevance": 0.78,
      "created_at": "2026-03-09T14:18:00Z"
    }
  ]
}

For full details on memory architecture, passive injection, and the management API, see Chat Memory.

x_forget

Retracts a memory, decision, milestone, or artifact the model created earlier in the session. The delete is soft: the entity remains visible in the app and the CLI and can be restored there. Use this for model self-correction only. User-requested deletion is done in the app or CLI, not via this tool.

Parameters

ParameterTypeDescription
typerequired string Entity type to retract. One of: memory, decision, milestone, artifact.
idrequired string The handle the original tool returned: the memory UUID for memory, the dec_ id for decision, the mst_ id for milestone, or the identifier slug for artifact.

Example Tool Call

JSON
{ "name": "x_forget", "arguments": { "type": "decision", "id": "dec_a1b2c3d4e5f6g7h8" } }

Agentic Loop

When server-side tools are enabled, Xerotier uses an agentic loop to iteratively call tools and feed results back to the model. This loop is the foundational mechanism that powers all server-side tool execution, including research workflows and Deep Think.

How the Loop Works

  1. Model response: The model generates a response that may include one or more tool calls.
  2. Tool execution: Xerotier executes the requested tools server-side (e.g., x_web_search, x_fetch_url, x_code_search).
  3. Result injection: Tool results are appended to the conversation as tool-role messages.
  4. Follow-up: The model generates another response based on the tool results. This response may include additional tool calls.
  5. Termination: Steps 2-4 repeat until the model produces a final text response (no tool calls) or the iteration limit is reached.
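The steps above can be sketched as a small driver function. This is an illustration only: the model callable, the tools mapping, the message dictionaries, and the wording of the final prompt are all assumptions, and the real loop runs inside the router rather than in client code.

```python
MAX_ITERATIONS = 5  # tool-call rounds allowed before a final answer is forced


def run_agentic_loop(model, tools, messages):
    """Drive the loop until the model answers without tool calls.

    model(messages) is assumed to return
    {"content": str, "tool_calls": [{"id", "name", "arguments"}]}.
    """
    for _ in range(MAX_ITERATIONS):
        reply = model(messages)                       # step 1 / step 4
        calls = reply.get("tool_calls") or []
        if not calls:
            return reply["content"]                   # step 5: final text response
        messages.append({
            "role": "assistant",
            "content": reply.get("content", ""),
            "tool_calls": calls,
        })
        for call in calls:                            # step 2: run each tool
            result = tools[call["name"]](**call["arguments"])
            messages.append({                         # step 3: inject the result
                "role": "tool",
                "tool_call_id": call["id"],
                "content": str(result),
            })
    # Iteration limit reached: ask for an answer from what has been gathered.
    messages.append({
        "role": "user",
        "content": "Answer using the information gathered so far.",
    })
    return model(messages)["content"]
```

A model that never requests a tool returns on the first pass, which matches the single-iteration case.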

Limits

LimitValueDescription
Maximum iterations per request 5 The loop runs at most 5 tool-call rounds before forcing a final response.
Per-tool execution timeout 15 seconds Each individual tool call must complete within 15 seconds or it is cancelled.
Tool call rate limit 45 / minute Maximum of 45 tool calls per minute per project to prevent abuse.

Automatic Behavior

The agentic loop runs automatically whenever tools such as x_web_search, x_fetch_url, x_code_search, or x_calculator are enabled via web_search_options or the web_search_preview tool type. No additional client-side configuration is required: the router handles all tool execution, result marshalling, and follow-up model calls transparently.

If the model does not invoke any tools, the loop completes in a single iteration and the response is returned directly. When the maximum iteration limit is reached, the model is prompted to produce a final answer using the information gathered so far.

Note: The Deep Think feature extends this agentic loop with a multi-phase planning and synthesis layer on top. Each Deep Think sub-task runs its own agentic loop independently.

Deep Think

Deep Think performs extended, multi-phase autonomous research on a query. The system decomposes the question into sub-tasks, executes each independently through the agentic loop, and synthesizes all findings into a comprehensive report. During execution the SSE stream emits progress events so clients can display real-time status for each sub-task.

How It Works

  1. Planning, The model decomposes the user query into focused sub-tasks (up to 10 by default), each with a specific search question and tool set.
  2. Execution, Each sub-task runs through the agentic loop sequentially. All research tools (x_web_search, x_fetch_url, x_code_search, x_repo_overview, x_repo_read, x_calculator) are available per sub-task.
  3. Synthesis, All sub-task results are combined and the model produces a single comprehensive report streamed as normal SSE content chunks.
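The three phases can be outlined in a few lines. In this sketch the plan, run_subtask, synthesize, and emit callables are placeholders for the model and router internals, and using the query as the plan title is a simplification; only the event type names and their fields are taken from this reference.

```python
MAX_SUBTASKS = 10  # planner cap per request


def deep_think(query, plan, run_subtask, synthesize, emit):
    """Plan sub-tasks, run them one after another, then synthesize a report."""
    subtasks = plan(query)[:MAX_SUBTASKS]
    total = len(subtasks)
    emit({"type": "x_deep_think.plan_created", "title": query, "total_subtasks": total})

    findings = []
    for index, subtask_query in enumerate(subtasks):
        subtask_id = str(index + 1)
        emit({
            "type": "x_deep_think.subtask_started",
            "subtask_id": subtask_id,
            "subtask_query": subtask_query,
            "subtask_index": index,
            "total_subtasks": total,
        })
        # Each sub-task would run its own agentic loop here.
        findings.append(run_subtask(subtask_query))
        emit({
            "type": "x_deep_think.subtask_completed",
            "subtask_id": subtask_id,
            "subtask_index": index,
        })

    emit({"type": "x_deep_think.synthesizing", "message": "Synthesizing final report..."})
    return synthesize(query, findings)
```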

Enabling Deep Think

Add x_deep_think: true to the web_search_options object. All research tools are automatically enabled for deep think requests.

JSON
{
  "model": "my-model",
  "messages": [
    {"role": "user", "content": "Comprehensive analysis of quantum computing advances in 2026"}
  ],
  "stream": true,
  "web_search_options": {
    "search_context_size": "medium",
    "x_deep_think": true,
    "x_tools": ["x_web_search", "x_fetch_url", "x_code_search", "x_calculator"]
  }
}

web_search_options Fields

FieldTypeDefaultDescription
x_deep_thinkoptional boolean false When true, activates the multi-phase deep think pipeline instead of the single agentic loop.

Deep Think Lifecycle

The SSE stream for a deep think request follows this sequence. Caveat: the x_deep_think.completed, x_deep_think.error, and x_deep_think.artifact_created event types are declared in DeepThinkModels but are either dashboard-synthesised or not currently emitted on the public /v1/chat/completions stream. Treat them as a forward-looking contract.

Sequence
1. x_deep_think.plan_created - plan ready, includes title + sub-task count
2. x_deep_think.subtask_started - sub-task N begins (repeated per sub-task)
   x_research.searching / x_research.reading - tool-level events within sub-task
3. x_deep_think.subtask_completed - sub-task N finished
   ... (repeat 2-3 for each sub-task) ...
4. x_deep_think.synthesizing - synthesis phase begins
   data: {"choices":[...]} - normal SSE content chunks (final report)
5. x_deep_think.completed - pipeline finished, includes artifact info
6. data: [DONE] - stream terminates

Limits

LimitValueDescription
Max sub-tasks 10 Maximum sub-tasks the planner can create per request.
Sub-task iterations 5 Maximum agentic loop iterations per sub-task.

Deep Think Events (SSE)

Deep think progress events are emitted as inline SSE data lines with a type field prefixed by x_deep_think.. Clients should check for this prefix and render progress UI accordingly.

Event TypeFieldsDescription
x_deep_think.plan_created title, total_subtasks Emitted after the planning phase. Contains the research title and number of sub-tasks.
x_deep_think.subtask_started subtask_id, subtask_query, subtask_index, total_subtasks Emitted when a sub-task begins execution.
x_deep_think.subtask_completed subtask_id, subtask_index Emitted when a sub-task finishes.
x_deep_think.synthesizing message Emitted when synthesis begins. Content chunks follow as normal SSE.
x_deep_think.completed title, artifact_name Emitted when the pipeline finishes. The report is auto-saved as an artifact.
x_deep_think.error message Emitted if a fatal error occurs during deep think.
x_deep_think.discovery_started mode, message Emitted when the discovery phase begins (target-focused mode only).
x_deep_think.discovery_completed pages_fetched, site_map, message Emitted when the discovery phase completes. Includes page count and site structure summary.
x_deep_think.artifact_created artifact_type, artifact_title, message Emitted when a structured artifact (table, matrix, findings list) is created from sub-task results.

Example Event Stream

SSE
data: {"type":"x_deep_think.plan_created","title":"Quantum Computing Advances","total_subtasks":4}
data: {"type":"x_deep_think.subtask_started","subtask_id":"1","subtask_query":"Latest quantum error correction breakthroughs","subtask_index":0,"total_subtasks":4}
data: {"type":"x_research.searching","name":"x_web_search","arguments":"{\"query\":\"quantum error correction 2026\"}"}
data: {"type":"x_deep_think.subtask_completed","subtask_id":"1","subtask_index":0}
data: {"type":"x_deep_think.subtask_started","subtask_id":"2","subtask_query":"Major quantum hardware milestones","subtask_index":1,"total_subtasks":4}
...
data: {"type":"x_deep_think.synthesizing","message":"Synthesizing final report..."}
data: {"choices":[{"index":0,"delta":{"content":"# Quantum Computing Advances\n\n"},"finish_reason":null}]}
...
data: {"type":"x_deep_think.completed","title":"Quantum Computing Advances","artifact_name":"deep-think-20260227-143012.md"}
data: [DONE]
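A client reading the raw stream has to separate progress events from content chunks. The sketch below shows one way to do that by checking the x_deep_think. prefix on the type field; classify_sse_line and the kind labels it returns are hypothetical names chosen for the example.

```python
import json


def classify_sse_line(line):
    """Return (kind, payload); kind is 'deep_think', 'content', 'done', or 'other'."""
    if not line.startswith("data: "):
        return ("other", None)
    body = line[len("data: "):].strip()
    if body == "[DONE]":
        return ("done", None)
    try:
        event = json.loads(body)
    except json.JSONDecodeError:
        return ("other", None)
    if not isinstance(event, dict):
        return ("other", None)

    event_type = event.get("type")
    if isinstance(event_type, str) and event_type.startswith("x_deep_think."):
        return ("deep_think", event)  # progress event: update the status UI

    choices = event.get("choices") or []
    if choices:
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            return ("content", text)  # normal content chunk: append to the report
    return ("other", event)
```

Events with other prefixes, such as x_research.searching inside a sub-task, fall through as "other" here and can be routed to a separate handler.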

SSE Events Reference

When streaming, server-side tool execution emits inline SSE events so clients can display progress indicators. All vendor-specific events use the x_ prefix for OpenAI spec compliance.

Research Events

Event TypeToolDescription
x_research.searching x_web_search Emitted when a web search begins execution.
x_research.reading x_fetch_url Emitted when a URL fetch begins execution.
x_research.code_searching x_code_search Emitted when a code search begins execution.
x_research.calculating x_calculator Emitted when a calculator evaluation begins.
x_research.tool_call Any Emitted for tool invocations that do not match a specific research event type. Includes tool_name, message.
x_research.result All research tools Emitted when a tool call completes with its result.
x_research.complete -- Declared in the event vocabulary and consumed by the chat dashboard, but not currently emitted by any router-side code path on the public /v1/chat/completions stream. Documented for future parity; do not depend on it from public-API clients yet.

Artifact Events

Event TypeDescription
x_artifact.created Emitted when a new artifact is created. Includes identifier, title, version.
x_artifact.updated Emitted when an existing artifact is updated. Includes identifier, version.

Ask User Events

Event TypeDescription
x_ask_user.question Emitted when the model asks a clarifying question. CamelCase fields: askUserId, toolCallId, question, options, allowFreeText, multiSelect, style, fields.
x_ask_user.pending_state Emitted with the assistant's partial content and tool state while awaiting user response. Fields: askUserId, toolCallId, assistantContent, toolCalls.

Deep Think Events

Event TypeDescription
x_deep_think.plan_created Emitted after the planning phase. Includes title, total_subtasks.
x_deep_think.subtask_started Emitted when a sub-task begins. Includes subtask_id, subtask_query, subtask_index, total_subtasks.
x_deep_think.subtask_completed Emitted when a sub-task finishes. Includes subtask_id, subtask_index.
x_deep_think.synthesizing Emitted when synthesis begins. Includes message.
x_deep_think.completed Emitted when the pipeline finishes. Includes title, artifact_name.
x_deep_think.error Emitted if a fatal error occurs. Includes message.
x_deep_think.discovery_started Emitted when discovery phase begins (target-focused mode). Includes mode, message.
x_deep_think.discovery_completed Emitted when discovery phase completes. Includes pages_fetched, site_map, message.
x_deep_think.artifact_created Emitted when a structured artifact is created. Includes artifact_type, artifact_title, message.

File Search Events (Responses API)

Note: the OpenAI-compatible response.file_search_call.* event types below are declared in the router's event vocabulary but are not currently emitted by any router-side code path on the public /v1/chat/completions stream. They are reserved for future parity with the Responses API and should not be relied upon by client integrations yet.

Event TypeDescription
response.file_search_call.in_progress Emitted when a server-side file search invocation begins. Includes item_id, output_index.
response.file_search_call.searching Emitted while the file search is actively executing.
response.file_search_call.completed Emitted when the file search finishes. Includes the completed output item with results.

Chat Metadata Events

Event TypeDescription
x_chat.metadata Emitted at the end of a response stream with aggregated usage metadata including research token counts. Includes usage (object with research token breakdown).

Example SSE Stream

SSE
data: {"type":"x_research.searching","name":"x_web_search","arguments":"{\"query\":\"rust async patterns 2026\"}"}
data: {"type":"x_research.result","name":"x_web_search","tool_call_id":"call_1"}
data: {"type":"x_research.reading","name":"x_fetch_url","arguments":"{\"url\":\"https://blog.rust-lang.org/...\"}"}
data: {"type":"x_research.result","name":"x_fetch_url","tool_call_id":"call_2"}
data: {"type":"x_research.calculating","name":"x_calculator","arguments":"{\"expression\":\"2^32\"}"}
data: {"type":"x_research.result","name":"x_calculator","tool_call_id":"call_3"}
data: {"type":"x_research.complete","elapsed_ms":4200,"input_tokens":12500,"output_tokens":850,"iterations":3,"sources":5}
data: {"choices":[{"index":0,"delta":{"content":"Based on my research..."},"finish_reason":null}]}
...
data: [DONE]

Rate Limiting & Caching

Per-Project Rate Limiting

Tool calls are rate limited per project (default 45 calls per minute). When the limit is exceeded, the tool returns an error response instead of executing.

Rate Limit Error Response

JSON
{ "error": "Research tool rate limit exceeded. Try again in 12 seconds." }
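One way to picture the per-project limit is a sliding one-minute window, sketched below. The windowing strategy is an assumption (the router's actual algorithm is not documented here), and ToolRateLimiter is a hypothetical class; only the 45-per-minute default and the error message format come from this section.

```python
import math
from collections import deque


class ToolRateLimiter:
    """Sliding-window limiter for one project (assumed strategy)."""

    def __init__(self, limit=45, window_seconds=60.0):
        self.limit = limit
        self.window = window_seconds
        self._calls = deque()  # timestamps of accepted calls, oldest first

    def check(self, now):
        """Return None if the call may run, else an error payload."""
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) < self.limit:
            self._calls.append(now)
            return None
        # The oldest call leaving the window frees the next slot.
        retry_in = math.ceil(self.window - (now - self._calls[0]))
        return {
            "error": f"Research tool rate limit exceeded. Try again in {retry_in} seconds."
        }
```

Passing the clock in as an argument keeps the limiter deterministic in tests; production code would typically supply time.monotonic().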

Result Caching

Identical tool calls (same function name and arguments) within a 5-minute window return cached results without re-executing. This prevents redundant network calls when the model re-invokes the same search or fetch. Auto-fetched URLs from x_web_search are also cached, so subsequent explicit x_fetch_url calls for the same URL are free.
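The caching rule can be sketched as a small time-to-live cache keyed by function name and canonicalized arguments. ToolResultCache and its methods are hypothetical, and sorting the JSON keys to build the cache key is an assumption about how "identical" calls are compared.

```python
import json

CACHE_TTL_SECONDS = 300  # 5-minute window


class ToolResultCache:
    """In-memory TTL cache for tool results (illustrative only)."""

    def __init__(self, ttl=CACHE_TTL_SECONDS):
        self.ttl = ttl
        self._entries = {}  # key -> (stored_at, result)

    @staticmethod
    def key(name, arguments):
        # Sort keys so argument order does not produce distinct cache entries.
        return name + ":" + json.dumps(arguments, sort_keys=True, separators=(",", ":"))

    def get_or_run(self, name, arguments, run, now):
        """Return a cached result if still fresh, otherwise execute and store."""
        cache_key = self.key(name, arguments)
        hit = self._entries.get(cache_key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        result = run(**arguments)
        self._entries[cache_key] = (now, result)
        return result
```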

Tool Limits

LimitValueDescription
Rate limit 45/min Maximum tool calls per minute per project.
Cache TTL 5 minutes How long identical tool results are cached.
Auto-fetch count 2 Number of top URLs auto-fetched from x_web_search results.
Max URLs per fetch 5 Maximum URLs per x_fetch_url call.
Max follow links 3 Maximum same-domain links to auto-follow in discovery mode.
Character budget 24,000 Character budget for multi-URL fetch results, distributed across pages.
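How the 24,000-character budget is divided across pages is not specified here. The sketch below shows one plausible policy, stated purely as an assumption: each page gets an equal share, and whatever a short page leaves unused is passed on to the longer ones. distribute_budget is a hypothetical helper.

```python
def distribute_budget(page_lengths, budget=24_000):
    """Return per-page character allowances that sum to at most budget."""
    allowances = [0] * len(page_lengths)
    remaining = budget
    # Visit pages from shortest to longest so unused share flows to longer pages.
    pending = sorted(range(len(page_lengths)), key=lambda i: page_lengths[i])
    while pending:
        share = remaining // len(pending)
        i = pending.pop(0)
        allowances[i] = min(page_lengths[i], share)
        remaining -= allowances[i]
    return allowances
```

Under this policy a 1,000-character page fetched alongside two 50,000-character pages keeps all of its text, and the two long pages split the remaining 23,000 characters evenly.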

Full API Examples

Chat Completions with Research Tools

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {"role": "user", "content": "What are the key differences between Rust and Go for building web servers?"}
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "medium",
      "x_tools": ["x_web_search", "x_fetch_url", "x_code_search"]
    }
  }'

Chat Completions with All Tools Enabled

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {"role": "user", "content": "Compare the async runtimes in Rust and explain the tradeoffs"}
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "high",
      "max_iterations": 8,
      "x_tools": [
        "x_web_search", "x_fetch_url", "x_code_search",
        "x_repo_overview", "x_calculator"
      ]
    }
  }'
Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {"role": "user", "content": "Compare the async runtimes in Rust and explain the tradeoffs"}
    ],
    stream=True,
    # Vendor-specific options travel in extra_body so the SDK passes them through.
    extra_body={
        "web_search_options": {
            "search_context_size": "high",
            "max_iterations": 8,
            "x_tools": [
                "x_web_search", "x_fetch_url", "x_code_search",
                "x_repo_overview", "x_calculator"
            ]
        }
    },
)

for chunk in stream:
    # Progress events (x_research.*) may arrive without choices,
    # so read the field defensively.
    choices = getattr(chunk, "choices", None)
    if choices and choices[0].delta.content:
        print(choices[0].delta.content, end="")
print()
Node.js (OpenAI SDK)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    { role: "user", content: "Compare the async runtimes in Rust and explain the tradeoffs" }
  ],
  stream: true,
  web_search_options: {
    search_context_size: "high",
    max_iterations: 8,
    x_tools: [
      "x_web_search", "x_fetch_url", "x_code_search",
      "x_repo_overview", "x_calculator"
    ],
  },
});

for await (const chunk of stream) {
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();

Responses API with Web Search and File Search

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/responses \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "input": "Compare the uploaded design spec with current best practices for REST API design",
    "stream": true,
    "tools": [
      {
        "type": "web_search_preview",
        "search_context_size": "medium",
        "x_tools": ["x_web_search", "x_fetch_url"]
      },
      {
        "type": "file_search",
        "vector_store_ids": ["vs_abc123"]
      }
    ]
  }'
Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

response = client.responses.create(
    model="my-model",
    input="Compare the uploaded design spec with current best practices for REST API design",
    stream=True,
    tools=[
        {
            "type": "web_search_preview",
            "search_context_size": "medium",
            "x_tools": ["x_web_search", "x_fetch_url"],
        },
        {
            "type": "file_search",
            "vector_store_ids": ["vs_abc123"],
        },
    ],
)

for event in response:
    if hasattr(event, "type"):
        print(event)
Node.js (OpenAI SDK)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

const stream = await client.responses.create({
  model: "my-model",
  input: "Compare the uploaded design spec with current best practices for REST API design",
  stream: true,
  tools: [
    {
      type: "web_search_preview",
      search_context_size: "medium",
      x_tools: ["x_web_search", "x_fetch_url"],
    },
    {
      type: "file_search",
      vector_store_ids: ["vs_abc123"],
    },
  ],
});

for await (const event of stream) {
  console.log(event);
}

Multi-Tool Conversation Flow

This example shows a typical multi-tool chain where the model searches for information, fetches a page for detail, and uses the calculator to verify a number, all within a single agentic loop.

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {
        "role": "user",
        "content": "What is the current price of gold per ounce, and how much would 3.5 troy ounces cost in EUR at today'\''s exchange rate?"
      }
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "medium",
      "max_iterations": 6,
      "x_tools": ["x_web_search", "x_fetch_url", "x_calculator"]
    }
  }'
Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

# The model may chain: x_web_search -> x_fetch_url -> x_calculator,
# all server-side in a single request.
stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {
            "role": "user",
            "content": "What is the current price of gold per ounce, "
                       "and how much would 3.5 troy ounces cost in EUR "
                       "at today's exchange rate?"
        }
    ],
    stream=True,
    extra_body={
        "web_search_options": {
            "search_context_size": "medium",
            "max_iterations": 6,
            "x_tools": ["x_web_search", "x_fetch_url", "x_calculator"]
        }
    },
)

for chunk in stream:
    raw = chunk.model_dump()

    # Check for research progress events
    if "type" in raw and raw["type"].startswith("x_research."):
        event_type = raw["type"]
        if event_type == "x_research.searching":
            print(f"[Searching] {raw.get('arguments', '')}")
        elif event_type == "x_research.reading":
            print(f"[Reading] {raw.get('arguments', '')}")
        elif event_type == "x_research.calculating":
            print(f"[Calculating] {raw.get('arguments', '')}")
        elif event_type == "x_research.complete":
            print(f"[Research complete] {raw.get('iterations', 0)} iterations, "
                  f"{raw.get('elapsed_ms', 0)}ms")
        continue

    # Normal content chunks
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()
Node.js (OpenAI SDK)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

// The model may chain: web_search -> fetch_url -> calculator
// all server-side in a single request
const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    {
      role: "user",
      content: "What is the current price of gold per ounce, " +
        "and how much would 3.5 troy ounces cost in EUR " +
        "at today's exchange rate?",
    }
  ],
  stream: true,
  web_search_options: {
    search_context_size: "medium",
    max_iterations: 6,
    x_tools: ["x_web_search", "x_fetch_url", "x_calculator"],
  },
});

for await (const chunk of stream) {
  const raw = chunk;

  // Check for research progress events
  if (raw.type && raw.type.startsWith("x_research.")) {
    if (raw.type === "x_research.searching") {
      console.log(`[Searching] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.reading") {
      console.log(`[Reading] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.calculating") {
      console.log(`[Calculating] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.complete") {
      console.log(`[Research complete] ${raw.iterations || 0} iterations, ${raw.elapsed_ms || 0}ms`);
    }
    continue;
  }

  // Normal content chunks
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();

Deep Think (curl)

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {"role": "user", "content": "Comprehensive analysis of quantum computing advances in 2026"}
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "medium",
      "x_deep_think": true,
      "x_tools": ["x_web_search", "x_fetch_url", "x_code_search", "x_calculator"]
    }
  }'

Deep Think (Python and Node.js)

Python (OpenAI SDK)
import json

from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {"role": "user", "content": "Comprehensive analysis of quantum computing advances in 2026"}
    ],
    stream=True,
    extra_body={
        "web_search_options": {
            "search_context_size": "medium",
            "x_deep_think": True,
            "x_tools": ["x_web_search", "x_fetch_url", "x_code_search", "x_calculator"]
        }
    },
)

for chunk in stream:
    raw = chunk.model_dump()

    # Check for deep think progress events
    if "type" in raw and raw["type"].startswith("x_deep_think."):
        print(f"[{raw['type']}]", json.dumps(raw, indent=2))
        continue

    # Normal content chunks
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()
Node.js (OpenAI SDK)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    { role: "user", content: "Comprehensive analysis of quantum computing advances in 2026" }
  ],
  stream: true,
  web_search_options: {
    search_context_size: "medium",
    x_deep_think: true,
    x_tools: ["x_web_search", "x_fetch_url", "x_code_search", "x_calculator"],
  },
});

for await (const chunk of stream) {
  const raw = chunk;

  // Check for deep think progress events
  if (raw.type && raw.type.startsWith("x_deep_think.")) {
    console.log(`[${raw.type}]`, JSON.stringify(raw, null, 2));
    continue;
  }

  // Normal content chunks
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();

Deep think requests use streaming and emit progress events inline. The final report is streamed as normal content chunks after all sub-tasks complete. The report is also auto-saved as an artifact in the chat interface.

Artifact Creation Flow

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {
        "role": "user",
        "content": "Create a Python script that implements a binary search tree with insert, search, and in-order traversal methods."
      }
    ],
    "stream": true
  }'
Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

# In the chat context, x_create_artifact and x_update_artifact
# are always available. The model will use them when appropriate.
stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {
            "role": "user",
            "content": "Create a Python script that implements a "
                       "binary search tree with insert, search, and "
                       "in-order traversal methods."
        }
    ],
    stream=True,
)

for chunk in stream:
    raw = chunk.model_dump()

    # Check for artifact events
    if "type" in raw:
        if raw["type"] == "x_artifact.created":
            print(f"\n[Artifact created] {raw.get('identifier', '')} "
                  f"- {raw.get('title', '')} (v{raw.get('version', 1)})")
        elif raw["type"] == "x_artifact.updated":
            print(f"\n[Artifact updated] {raw.get('identifier', '')} "
                  f"(v{raw.get('version', '')})")
        continue

    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()
Node.js (OpenAI SDK)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

// In the chat context, x_create_artifact and x_update_artifact
// are always available. The model will use them when appropriate.
const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    {
      role: "user",
      content: "Create a Python script that implements a " +
        "binary search tree with insert, search, and " +
        "in-order traversal methods.",
    }
  ],
  stream: true,
});

for await (const chunk of stream) {
  const raw = chunk;

  // Check for artifact events
  if (raw.type) {
    if (raw.type === "x_artifact.created") {
      console.log(`\n[Artifact created] ${raw.identifier || ""} ` +
        `- ${raw.title || ""} (v${raw.version || 1})`);
    } else if (raw.type === "x_artifact.updated") {
      console.log(`\n[Artifact updated] ${raw.identifier || ""} ` +
        `(v${raw.version || ""})`);
    }
    continue;
  }

  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();