Server-Side Tools Reference

Complete reference for all built-in tools executed server-side during chat completions and responses.

Overview

The Xerotier router includes 11 built-in tools that execute server-side during chat completions and responses. When a request opts in to server-side tooling, the router injects tool definitions into the model request and intercepts any tool calls the model makes. The router executes each tool, feeds the results back to the model, and repeats this loop until the model produces a final content response that is streamed to the client.

The tools fall into four categories:

  • Research -- tools for gathering information from external sources.
  • Content -- tools for creating and updating persistent artifacts.
  • Interaction -- tools for requesting input from the user mid-stream.
  • Knowledge -- tools for saving, recalling, and searching stored data.

Tool Summary

Tool | Category | Description
web_search | Research | Built-in web search
fetch_url | Research | Fetch and extract text from web pages and PDFs
code_search | Research | Search and browse code on GitHub
gitlab_code_search | Research | Search and browse code on GitLab
calculator | Research | Evaluate mathematical expressions safely
create_artifact | Content | Create persistent artifacts (code, docs, HTML, SVG, diagrams)
update_artifact | Content | Update a previously created artifact with new content
ask_user | Interaction | Ask the user a clarifying question with optional structured input
save_memory | Knowledge | Save facts, preferences, or instructions to per-chat memory
recall_memory | Knowledge | Search saved memories using semantic similarity
file_search | Knowledge | Search content of uploaded documents

Enabling Server-Side Tools

Research tools (web_search, fetch_url, code_search, gitlab_code_search, calculator) must be explicitly opted into. Other tools are injected automatically based on context:

  • create_artifact and update_artifact -- always injected in chat context.
  • ask_user -- always injected in chat context.
  • save_memory and recall_memory -- always injected in chat context.
  • file_search -- injected when the chat has uploaded documents.

Chat Completions API

Add web_search_options to a standard /v1/chat/completions request. Use the x_tools array to select which research tools to enable:

JSON
{ "model": "my-model", "messages": [ {"role": "user", "content": "Search for the latest Rust async patterns"} ], "stream": true, "web_search_options": { "search_context_size": "medium", "max_iterations": 5, "x_tools": ["web_search", "fetch_url", "code_search", "calculator"] } }

Responses API

Include a web_search_preview tool in the tools array:

JSON
{ "model": "my-model", "input": "Find the latest research on transformer architectures", "stream": true, "tools": [ { "type": "web_search_preview", "search_context_size": "medium", "x_tools": ["web_search", "fetch_url", "code_search"] } ] }

x_tools Selection

Behavior | Details
Omitted or empty | Defaults to ["web_search", "fetch_url"].
web_search included | fetch_url is auto-included for URL follow-up.
Invalid names | Silently ignored. Falls back to defaults if all are invalid.
code_search | Available when GitHub code search is enabled for your endpoint.
gitlab_code_search | Available when GitLab code search is enabled for your endpoint.
file_search | Available when file search is enabled. Searches uploaded documents in the chat context.
calculator | Always available. Evaluates mathematical expressions server-side.
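The selection rules above can be sketched in Python. This is an illustrative sketch only; the constant and function names are not part of the router API.

```python
# Hypothetical sketch of the x_tools selection rules; names are illustrative.
DEFAULT_TOOLS = ["web_search", "fetch_url"]
KNOWN_TOOLS = {"web_search", "fetch_url", "code_search",
               "gitlab_code_search", "calculator", "file_search"}

def resolve_x_tools(requested):
    # Omitted or empty -> defaults
    if not requested:
        return list(DEFAULT_TOOLS)
    # Invalid names are silently ignored
    valid = [t for t in requested if t in KNOWN_TOOLS]
    # All names invalid -> fall back to defaults
    if not valid:
        return list(DEFAULT_TOOLS)
    # web_search auto-includes fetch_url for URL follow-up
    if "web_search" in valid and "fetch_url" not in valid:
        valid.append("fetch_url")
    return valid
```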

fetch_url

Fetches and extracts readable text content from web pages and PDF documents. Supports single-URL fetch, parallel multi-URL fetch, link discovery for site exploration, and automatic same-domain link following.

Parameters

Parameter | Type | Description
url | string (required) | A single URL to fetch content from. Must be http or https.
urls | string[] (optional) | Multiple URLs to fetch in parallel (max 5). Prefer this over url; if both are provided, they are merged and deduplicated.
discover_links | boolean (optional) | Extract links from fetched pages for site exploration. Default: false.
max_follow | integer (optional) | Max discovered same-domain links to auto-follow (0-3, default 0). Only used with discover_links.

Modes

  • Single URL -- provide only url without discovery flags. Returns extracted plain text directly.
  • Multi-URL -- provide urls array for parallel fetching. Returns a JSON object with a pages array, each containing url, content, and error fields.
  • Discovery -- set discover_links: true to extract links from fetched pages. Combine with max_follow to automatically fetch discovered same-domain links.

SSRF Protection

All URL fetches are protected against Server-Side Request Forgery (SSRF). Requests to private and internal IP ranges are blocked: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 127.0.0.0/8, 169.254.0.0/16, ::1, and fe80::/10.
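A minimal sketch of this kind of guard, using Python's standard ipaddress module. The function name is hypothetical; the router's actual implementation is not shown here.

```python
# Illustrative SSRF guard: reject addresses in private, loopback, or
# link-local ranges before fetching. The name is_blocked_ip is hypothetical.
import ipaddress

def is_blocked_ip(ip_str: str) -> bool:
    ip = ipaddress.ip_address(ip_str)
    # is_private covers 10/8, 172.16/12, 192.168/16, 127/8, 169.254/16,
    # and ::1; is_link_local covers fe80::/10
    return ip.is_private or ip.is_loopback or ip.is_link_local
```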

Character Budget

Multi-URL fetches share a total budget of 24,000 characters, distributed equally across all successfully fetched pages. Pages exceeding their share are truncated.
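The equal-share split can be sketched as follows. This assumes a simple integer division of the budget; the router's exact rounding behavior is not documented.

```python
# Sketch of the equal-share truncation: the 24,000-character budget is
# divided evenly across successfully fetched pages, and any page longer
# than its share is truncated. Assumed behavior, not the actual code.
TOTAL_BUDGET = 24_000

def apportion(pages: list[str]) -> list[str]:
    if not pages:
        return []
    share = TOTAL_BUDGET // len(pages)
    return [p[:share] for p in pages]
```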

Use Cases

  • Reading documentation pages in full
  • Extracting article content for summarization
  • Crawling related pages on the same domain
  • Fetching PDF text content

Example: Single Fetch

JSON
{ "name": "fetch_url", "arguments": { "url": "https://docs.example.com/api/authentication" } }

Example: Multi-Fetch with Discovery

JSON
{ "name": "fetch_url", "arguments": { "urls": [ "https://docs.example.com/guide/getting-started", "https://docs.example.com/guide/configuration" ], "discover_links": true, "max_follow": 2 } }

Example Multi-Fetch Response

JSON
{ "discover_links_enabled": true, "total_pages": 4, "pages": [ { "url": "https://docs.example.com/guide/getting-started", "content": "Getting Started. Install the SDK with npm install...", "error": false, "discovered_links": [ {"url": "https://docs.example.com/guide/authentication", "text": "Authentication"}, {"url": "https://docs.example.com/guide/advanced", "text": "Advanced Usage"} ] }, { "url": "https://docs.example.com/guide/configuration", "content": "Configuration. Set environment variables to customize...", "error": false, "discovered_links": [] }, { "url": "https://docs.example.com/guide/authentication", "content": "Authentication. Use API keys or OAuth2 tokens...", "error": false, "followed_from_discovery": true }, { "url": "https://docs.example.com/guide/advanced", "content": "Advanced Usage. Configure retry policies and timeouts...", "error": false, "followed_from_discovery": true } ] }

SSE Event

SSE
data: {"type":"x_research.reading","name":"fetch_url","arguments":"{\"url\":\"https://docs.example.com/api/authentication\"}"}

calculator

Evaluates mathematical expressions server-side using a safe recursive-descent parser. No code execution -- only arithmetic, functions, and constants are supported.

Parameters

Parameter | Type | Description
expression | string (required) | Mathematical expression to evaluate. Maximum 1,000 characters.

Supported Operations

Category | Supported
Operators | + - * / ^
Functions | sqrt, log, ln, sin, cos, tan, abs, floor, ceil, round, min, max
Constants | pi, e
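For illustration, a whitelist-based safe evaluator covering these operations can be sketched with Python's ast module. Note the real tool uses its own recursive-descent parser, so this is an analogy under assumed semantics (log as base-10, ^ as exponentiation), not the actual implementation.

```python
# Illustrative whitelist evaluator: only the listed operators, functions,
# and constants are reachable; anything else raises ValueError.
import ast
import math
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.USub: operator.neg}
_FUNCS = {"sqrt": math.sqrt, "log": math.log10, "ln": math.log,
          "sin": math.sin, "cos": math.cos, "tan": math.tan,
          "abs": abs, "floor": math.floor, "ceil": math.ceil,
          "round": round, "min": min, "max": max}
_CONSTS = {"pi": math.pi, "e": math.e}

def safe_eval(expr: str) -> float:
    # In calculator syntax '^' means exponentiation
    tree = ast.parse(expr.replace("^", "**"), mode="eval")
    return _eval(tree.body)

def _eval(node):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name) and node.id in _CONSTS:
        return _CONSTS[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.operand))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCS and not node.keywords):
        return _FUNCS[node.func.id](*[_eval(a) for a in node.args])
    raise ValueError("unsupported expression element")
```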

Use Cases

  • Unit conversions and dimensional analysis
  • Financial calculations (compound interest, amortization)
  • Scientific computation (trigonometry, logarithms)
  • Quick arithmetic the model should not hallucinate

Example Tool Call

JSON
{ "name": "calculator", "arguments": { "expression": "10000 * (1 + 0.05)^3" } }

Example Response

JSON
{ "expression": "10000 * (1 + 0.05)^3", "result": 11576.25 }

More Examples

Expression | Result
sqrt(144) + 2^3 | 20.0
sin(pi/2) | 1.0
log(1000) | 3.0
max(42, 17) * min(3, 5) | 126.0
abs(-273.15) + ceil(2.1) | 276.15

SSE Event

SSE
data: {"type":"x_research.calculating","name":"calculator","arguments":"{\"expression\":\"10000 * (1 + 0.05)^3\"}"}

create_artifact

Creates a persistent artifact such as a code file, document, HTML page, SVG image, or Mermaid diagram. Artifacts are stored in the chat and can be viewed, downloaded, and updated. This tool is always injected in chat context -- no opt-in required.

Parameters

Parameter | Type | Description
identifier | string (required) | Stable slug for this artifact. 1-128 characters; ASCII alphanumeric, hyphens, underscores, and dots only. Used to reference the artifact for later updates.
title | string (required) | Human-readable display title for the artifact.
type | string (required) | MIME content type for the artifact content.
language | string (optional) | Programming language for syntax highlighting (e.g. python, swift, javascript). Omit for non-code artifacts.
content | string (required) | The full content of the artifact. Maximum 1 MB.

Supported MIME Types

MIME Type | Use For
text/markdown | Markdown documents, reports, notes
text/html | HTML pages, interactive previews
text/plain | Plain text files, configuration
image/svg+xml | SVG vector graphics
text/x-mermaid | Mermaid diagram definitions
application/json | JSON data files
text/x-{language} | Code files (e.g. text/x-python, text/x-swift)

Identifier Rules

  • 1-128 characters long
  • ASCII alphanumeric characters, hyphens, underscores, and dots only
  • Slug format -- lowercase recommended (e.g. my-component.tsx, data-pipeline)
  • Must be unique within the chat for creation; reuse the same identifier with update_artifact
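The rules above translate to a simple validation check. This is a sketch; the regex and function name are illustrative, not the router's actual code.

```python
# Illustrative identifier validation: 1-128 chars, ASCII alphanumerics
# plus hyphen, underscore, and dot.
import re

_IDENT_RE = re.compile(r"^[A-Za-z0-9._-]{1,128}$")

def is_valid_identifier(ident: str) -> bool:
    return bool(_IDENT_RE.fullmatch(ident))
```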

Versioning

Creating an artifact sets it to version 1. Subsequent updates via update_artifact increment the version number. All versions are retained for history.

Use Cases

  • Generating code files with syntax highlighting
  • Creating documentation and technical reports
  • Building interactive HTML previews
  • Rendering architecture diagrams with Mermaid

Example: Code File

JSON
{ "name": "create_artifact", "arguments": { "identifier": "fibonacci.py", "title": "Fibonacci Generator", "type": "text/x-python", "language": "python", "content": "def fibonacci(n: int) -> list[int]:\n \"\"\"Generate the first n Fibonacci numbers.\"\"\"\n if n <= 0:\n return []\n if n == 1:\n return [0]\n seq = [0, 1]\n for _ in range(2, n):\n seq.append(seq[-1] + seq[-2])\n return seq\n\nif __name__ == \"__main__\":\n print(fibonacci(10))\n" } }

Example: Mermaid Diagram

JSON
{ "name": "create_artifact", "arguments": { "identifier": "auth-flow", "title": "Authentication Flow", "type": "text/x-mermaid", "content": "sequenceDiagram\n participant U as User\n participant A as API Gateway\n participant S as Auth Service\n participant D as Database\n U->>A: POST /login\n A->>S: Validate credentials\n S->>D: Query user\n D-->>S: User record\n S-->>A: JWT token\n A-->>U: 200 OK + token\n" } }

SSE Event

SSE
data: {"type":"x_artifact.created","identifier":"fibonacci.py","title":"Fibonacci Generator","version":1}

update_artifact

Updates a previously created artifact with new content, creating a new version. The update is a full replacement -- the entire content is replaced, not a diff or patch. This tool is always injected in chat context.

Parameters

Parameter | Type | Description
identifier | string (required) | The identifier of the artifact to update. Must match a previously created artifact in this chat.
content | string (required) | The complete updated content. Replaces the previous version entirely.

Use Cases

  • Iterating on code based on user feedback
  • Refining documents and reports
  • Updating diagrams with new components
  • Fixing bugs in previously generated code

Example Tool Call

JSON
{ "name": "update_artifact", "arguments": { "identifier": "fibonacci.py", "content": "from functools import lru_cache\n\n@lru_cache(maxsize=None)\ndef fibonacci(n: int) -> int:\n \"\"\"Return the nth Fibonacci number (memoized).\"\"\"\n if n < 2:\n return n\n return fibonacci(n - 1) + fibonacci(n - 2)\n\ndef fibonacci_sequence(count: int) -> list[int]:\n \"\"\"Generate the first count Fibonacci numbers.\"\"\"\n return [fibonacci(i) for i in range(count)]\n\nif __name__ == \"__main__\":\n print(fibonacci_sequence(10))\n" } }

SSE Event

SSE
data: {"type":"x_artifact.updated","identifier":"fibonacci.py","version":2}

ask_user

Pauses the response and asks the user a clarifying question. The model uses this when ambiguity prevents a useful response. Supports multiple interaction styles: free-text questions, multiple-choice options, confirmation prompts, rating scales, and structured form fields. This tool is always injected in chat context.

Parameters

Parameter | Type | Description
question | string (required) | The clarifying question to ask the user.
options | string[] (optional) | List of suggested answers for the user to pick from.
allow_free_text | boolean (optional) | Whether the user can type a free-text answer in addition to picking an option. Default: true.
multi_select | boolean (optional) | Whether the user can select more than one option. Default: false.
style | string (optional) | Card style. One of: default, confirm, rating, form. Default: default.
fields | object[] (optional) | Form fields for form style cards. Each field has label, key, and optional placeholder and required.

Fields Sub-Parameters (for form style)

Field | Type | Description
label | string (required) | Display label for the input field.
key | string (required) | Machine-readable key for the field value.
placeholder | string (optional) | Placeholder text shown in the input.
required | boolean (optional) | Whether the field must be filled. Default: false.

Card Styles

Default Style

A question with optional multiple-choice options and free-text input.

JSON
{ "name": "ask_user", "arguments": { "question": "Which programming language would you like the example in?", "options": ["Python", "JavaScript", "Go", "Rust"], "allow_free_text": true } }

Confirm Style

A Yes/No confirmation prompt.

JSON
{ "name": "ask_user", "arguments": { "question": "This will overwrite the existing configuration. Proceed?", "style": "confirm" } }

Rating Style

A 1-5 rating scale.

JSON
{ "name": "ask_user", "arguments": { "question": "How satisfied are you with this solution?", "style": "rating" } }

Form Style

Labeled input fields for collecting structured data.

JSON
{ "name": "ask_user", "arguments": { "question": "Please provide the database connection details:", "style": "form", "fields": [ {"label": "Host", "key": "host", "placeholder": "localhost", "required": true}, {"label": "Port", "key": "port", "placeholder": "5432", "required": true}, {"label": "Database Name", "key": "db_name", "placeholder": "myapp_production", "required": true}, {"label": "Username", "key": "username", "placeholder": "admin"}, {"label": "Password", "key": "password", "placeholder": "********"} ] } }

Use Cases

  • Disambiguating vague requirements before generating code
  • Collecting structured configuration input
  • Getting user confirmation before destructive operations
  • Offering choices when multiple valid approaches exist

SSE Events

SSE
data: {"type":"x_ask_user.question","question":"Which programming language?","options":["Python","JavaScript","Go","Rust"],"correlation_id":"ask_abc123"} data: {"type":"x_ask_user.pending_state","correlation_id":"ask_abc123","content":"..."}

save_memory

Saves a fact, preference, or instruction to the per-chat memory store. The content is embedded as a vector and persisted to the database for future semantic recall. Near-duplicate content is automatically deduplicated via embedding similarity comparison. This tool is always injected in chat context and requires a chat ID.

Parameters

Parameter | Type | Description
content | string (required) | The fact, preference, or instruction to save to memory.
category | string (optional) | Category of the memory entry. One of: preference, fact, instruction. When omitted, the system infers the category from the content.

Deduplication

When saving, the system checks for existing memories with very high semantic similarity. If a near-duplicate is found, the save is skipped and the existing memory is returned instead. This prevents the memory store from accumulating redundant entries.
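This dedup check can be sketched with cosine similarity over embeddings. The 0.95 threshold and all names here are assumptions for illustration, not documented values.

```python
# Illustrative near-duplicate detection: if a stored memory's embedding is
# close enough to the new one, skip the save and return the existing id.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def find_near_duplicate(new_vec, existing, threshold=0.95):
    # existing maps memory_id -> embedding vector
    for mem_id, vec in existing.items():
        if cosine(new_vec, vec) >= threshold:
            return mem_id
    return None
```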

Use Cases

  • Storing user formatting preferences
  • Remembering project details and deadlines
  • Saving coding style instructions
  • Persisting domain-specific knowledge for the conversation

Example Tool Call

JSON
{ "name": "save_memory", "arguments": { "content": "User prefers TypeScript with strict mode enabled for all code examples", "category": "preference" } }

Example Response

JSON
{ "status": "saved", "memory_id": "mem_x7k9p2", "content": "User prefers TypeScript with strict mode enabled for all code examples", "category": "preference" }

For full details on memory architecture, passive injection, and the management API, see Chat Memory.

recall_memory

Searches the per-chat memory store using semantic similarity. Returns the top matching memories ranked by relevance score. This tool is always injected in chat context and requires a chat ID.

Parameters

Parameter | Type | Description
query | string (required) | Natural language query to search memories against.

Return Value

Returns a JSON object with a memories array. Each memory includes id, content, category, relevance (0-1 similarity score), and created_at.

Use Cases

  • Retrieving user preferences before generating output
  • Looking up previously mentioned facts and context
  • Checking for saved instructions before starting a task

Example Tool Call

JSON
{ "name": "recall_memory", "arguments": { "query": "What programming language does the user prefer?" } }

Example Response

JSON
{ "memories": [ { "id": "mem_x7k9p2", "content": "User prefers TypeScript with strict mode enabled for all code examples", "category": "preference", "relevance": 0.94, "created_at": "2026-03-09T14:22:00Z" }, { "id": "mem_r3m8q1", "content": "Always use ESLint with the recommended ruleset", "category": "instruction", "relevance": 0.78, "created_at": "2026-03-09T14:18:00Z" } ] }

For full details on memory architecture, passive injection, and the management API, see Chat Memory.

Agentic Loop

When server-side tools are enabled, Xerotier uses an agentic loop to iteratively call tools and feed results back to the model. This loop is the foundational mechanism that powers all server-side tool execution, including research workflows and Deep Think.

How the Loop Works

  1. Model response -- The model generates a response that may include one or more tool calls.
  2. Tool execution -- Xerotier executes the requested tools server-side (e.g., web_search, fetch_url, code_search).
  3. Result injection -- Tool results are appended to the conversation as tool-role messages.
  4. Follow-up -- The model generates another response based on the tool results. This response may include additional tool calls.
  5. Termination -- Steps 2-4 repeat until the model produces a final text response (no tool calls) or the iteration limit is reached.
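The steps above can be sketched as a loop. The helpers call_model and execute_tool are hypothetical stand-ins for the router's internals, not a real API.

```python
# Schematic agentic loop: call the model, execute any tool calls it makes,
# append results as tool-role messages, and repeat until a final answer
# or the iteration limit.
MAX_ITERATIONS = 5

def agentic_loop(messages, call_model, execute_tool):
    for _ in range(MAX_ITERATIONS):
        response = call_model(messages)                  # step 1
        tool_calls = response.get("tool_calls") or []
        if not tool_calls:                               # step 5: final answer
            return response["content"]
        for call in tool_calls:                          # steps 2-3
            result = execute_tool(call["name"], call["arguments"])
            messages.append({"role": "tool",
                             "tool_call_id": call["id"],
                             "content": result})
    # Iteration limit reached: prompt for a final answer with what we have
    messages.append({"role": "user",
                     "content": "Answer now using the information gathered."})
    return call_model(messages)["content"]
```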

Limits

Limit | Value | Description
Maximum iterations per request | 5 | The loop runs at most 5 tool-call rounds before forcing a final response.
Per-tool execution timeout | 15 seconds | Each individual tool call must complete within 15 seconds or it is cancelled.
Tool call rate limit | 45 / minute | Maximum of 45 tool calls per minute per endpoint to prevent abuse.

Automatic Behavior

The agentic loop runs automatically whenever tools such as web_search, fetch_url, code_search, or calculator are enabled via web_search_options or the web_search_preview tool type. No additional client-side configuration is required -- the router handles all tool execution, result marshalling, and follow-up model calls transparently.

If the model does not invoke any tools, the loop completes in a single iteration and the response is returned directly. When the maximum iteration limit is reached, the model is prompted to produce a final answer using the information gathered so far.

Note: The Deep Think feature extends this agentic loop with a multi-phase planning and synthesis layer on top. Each Deep Think sub-task runs its own agentic loop independently.

Deep Think

Deep Think performs extended, multi-phase autonomous research on a query. The system decomposes the question into sub-tasks, executes each independently through the agentic loop, and synthesizes all findings into a comprehensive report. During execution the SSE stream emits progress events so clients can display real-time status for each sub-task.

How It Works

  1. Planning -- The model decomposes the user query into 3-7 focused sub-tasks, each with a specific search question and tool set.
  2. Execution -- Each sub-task runs through the agentic loop sequentially. All research tools (web_search, fetch_url, code_search, gitlab_code_search, calculator) are available per sub-task.
  3. Synthesis -- All sub-task results are combined and the model produces a single comprehensive report streamed as normal SSE content chunks.

Enabling Deep Think

Add x_deep_think: true to the web_search_options object. All research tools are automatically enabled for deep think requests.

JSON
{ "model": "my-model", "messages": [ {"role": "user", "content": "Comprehensive analysis of quantum computing advances in 2026"} ], "stream": true, "web_search_options": { "search_context_size": "medium", "x_deep_think": true, "x_tools": ["web_search", "fetch_url", "code_search", "calculator"] } }

web_search_options Fields

Field | Type | Default | Description
x_deep_think | boolean (optional) | false | When true, activates the multi-phase deep think pipeline instead of the single agentic loop.

Deep Think Lifecycle

The SSE stream for a deep think request follows this sequence:

Sequence
1. x_deep_think.plan_created -- plan ready, includes title + sub-task count
2. x_deep_think.subtask_started -- sub-task N begins (repeated per sub-task)
   x_research.searching / x_research.reading -- tool-level events within sub-task
3. x_deep_think.subtask_completed -- sub-task N finished
   ... (repeat 2-3 for each sub-task) ...
4. x_deep_think.synthesizing -- synthesis phase begins
   data: {"choices":[...]} -- normal SSE content chunks (final report)
5. x_deep_think.completed -- pipeline finished, includes artifact info
6. data: [DONE] -- stream terminates

Limits

Limit | Value | Description
Max sub-tasks | 7 | Maximum sub-tasks the planner can create per request.
Sub-task iterations | 5 | Maximum agentic loop iterations per sub-task.
Sub-task token budget | 16,000 | Token budget per sub-task (input + output).

Deep Think Events (SSE)

Deep think progress events are emitted as inline SSE data lines with a type field prefixed by x_deep_think.. Clients should check for this prefix and render progress UI accordingly.

Event Type | Fields | Description
x_deep_think.plan_created | title, total_subtasks | Emitted after the planning phase. Contains the research title and number of sub-tasks.
x_deep_think.subtask_started | subtask_id, subtask_query, subtask_index, total_subtasks | Emitted when a sub-task begins execution.
x_deep_think.subtask_completed | subtask_id, subtask_index | Emitted when a sub-task finishes.
x_deep_think.synthesizing | message | Emitted when synthesis begins. Content chunks follow as normal SSE.
x_deep_think.completed | title, artifact_name | Emitted when the pipeline finishes. The report is auto-saved as an artifact.
x_deep_think.error | message | Emitted if a fatal error occurs during deep think.
x_deep_think.discovery_started | mode, message | Emitted when the discovery phase begins (target-focused mode only).
x_deep_think.discovery_completed | pages_fetched, site_map, message | Emitted when the discovery phase completes. Includes page count and site structure summary.
x_deep_think.artifact_created | artifact_type, artifact_title, message | Emitted when a structured artifact (table, matrix, findings list) is created from sub-task results.

Example Event Stream

SSE
data: {"type":"x_deep_think.plan_created","title":"Quantum Computing Advances","total_subtasks":4} data: {"type":"x_deep_think.subtask_started","subtask_id":"1","subtask_query":"Latest quantum error correction breakthroughs","subtask_index":0,"total_subtasks":4} data: {"type":"x_research.searching","name":"web_search","arguments":"{\"query\":\"quantum error correction 2026\"}"} data: {"type":"x_deep_think.subtask_completed","subtask_id":"1","subtask_index":0} data: {"type":"x_deep_think.subtask_started","subtask_id":"2","subtask_query":"Major quantum hardware milestones","subtask_index":1,"total_subtasks":4} ... data: {"type":"x_deep_think.synthesizing","message":"Synthesizing final report..."} data: {"choices":[{"index":0,"delta":{"content":"# Quantum Computing Advances\n\n"},"finish_reason":null}]} ... data: {"type":"x_deep_think.completed","title":"Quantum Computing Advances","artifact_name":"deep-think-20260227-143012.md"} data: [DONE]

SSE Events Reference

When streaming, server-side tool execution emits inline SSE events so clients can display progress indicators. All vendor-specific events use the x_ prefix for OpenAI spec compliance.
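A client can route these events with a small dispatcher. This sketch assumes each SSE data line carries either [DONE] or a JSON object with an optional type field, as in the examples in this document; the function name is illustrative.

```python
# Illustrative SSE line classifier: distinguishes x_-prefixed vendor events
# from normal content chunks and the [DONE] terminator.
import json

def classify_sse_line(line: str) -> str:
    # Returns the vendor event type, "content", "done", or "ignore".
    if not line.startswith("data: "):
        return "ignore"
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return "done"
    obj = json.loads(payload)
    event_type = obj.get("type", "")
    if event_type.startswith("x_"):
        return event_type
    return "content"
```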

Research Events

Event Type | Tool | Description
x_research.searching | web_search | Emitted when a web search begins execution.
x_research.reading | fetch_url | Emitted when a URL fetch begins execution.
x_research.code_searching | code_search, gitlab_code_search | Emitted when a code search begins execution.
x_research.calculating | calculator | Emitted when a calculator evaluation begins.
x_research.tool_call | Any | Emitted for tool invocations that do not match a specific research event type. Includes tool_name, message.
x_research.result | All research tools | Emitted when a tool call completes with its result.
x_research.complete | -- | Emitted when the entire research phase finishes. Includes elapsed_ms, input_tokens, output_tokens, iterations, sources.

Artifact Events

Event Type | Description
x_artifact.created | Emitted when a new artifact is created. Includes identifier, title, version.
x_artifact.updated | Emitted when an existing artifact is updated. Includes identifier, version.

Ask User Events

Event Type | Description
x_ask_user.question | Emitted when the model asks a clarifying question. Includes question, options, correlation_id.
x_ask_user.pending_state | Emitted with the assistant's partial content and state while awaiting user response.

Deep Think Events

Event Type | Description
x_deep_think.plan_created | Emitted after the planning phase. Includes title, total_subtasks.
x_deep_think.subtask_started | Emitted when a sub-task begins. Includes subtask_id, subtask_query, subtask_index, total_subtasks.
x_deep_think.subtask_completed | Emitted when a sub-task finishes. Includes subtask_id, subtask_index.
x_deep_think.synthesizing | Emitted when synthesis begins. Includes message.
x_deep_think.completed | Emitted when the pipeline finishes. Includes title, artifact_name.
x_deep_think.error | Emitted if a fatal error occurs. Includes message.
x_deep_think.discovery_started | Emitted when discovery phase begins (target-focused mode). Includes mode, message.
x_deep_think.discovery_completed | Emitted when discovery phase completes. Includes pages_fetched, site_map, message.
x_deep_think.artifact_created | Emitted when a structured artifact is created. Includes artifact_type, artifact_title, message.

File Search Events (Responses API)

Event Type | Description
response.file_search_call.in_progress | Emitted when a server-side file search invocation begins. Includes item_id, output_index.
response.file_search_call.searching | Emitted while the file search is actively executing.
response.file_search_call.completed | Emitted when the file search finishes. Includes the completed output item with results.

Chat Metadata Events

Event Type | Description
x_chat.metadata | Emitted at the end of a response stream with aggregated usage metadata, including research token counts. Includes usage (object with research token breakdown).

Example SSE Stream

SSE
data: {"type":"x_research.searching","name":"web_search","arguments":"{\"query\":\"rust async patterns 2026\"}"} data: {"type":"x_research.result","name":"web_search","tool_call_id":"call_1"} data: {"type":"x_research.reading","name":"fetch_url","arguments":"{\"url\":\"https://blog.rust-lang.org/...\"}"} data: {"type":"x_research.result","name":"fetch_url","tool_call_id":"call_2"} data: {"type":"x_research.calculating","name":"calculator","arguments":"{\"expression\":\"2^32\"}"} data: {"type":"x_research.result","name":"calculator","tool_call_id":"call_3"} data: {"type":"x_research.complete","elapsed_ms":4200,"input_tokens":12500,"output_tokens":850,"iterations":3,"sources":5} data: {"choices":[{"index":0,"delta":{"content":"Based on my research..."},"finish_reason":null}]} ... data: [DONE]

Rate Limiting & Caching

Per-Project Rate Limiting

Tool calls are rate limited to 45 calls per minute per project. When the limit is exceeded, the tool returns an error response instead of executing.

Rate Limit Error Response

JSON
{ "error": "Research tool rate limit exceeded. Try again in 12 seconds." }

Result Caching

Identical tool calls (same function name and arguments) within a 5-minute window return cached results without re-executing. This prevents redundant network calls when the model re-invokes the same search or fetch. Auto-fetched URLs from web_search are also cached, so subsequent explicit fetch_url calls for the same URL are free.
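A cache key built from the tool name plus canonicalized arguments makes logically identical calls hit the same entry. This sketch assumes JSON-serializable arguments; the helper names and cache layout are illustrative.

```python
# Illustrative result cache: key on tool name + sorted-key JSON of the
# arguments, and honor a 5-minute TTL on lookups.
import hashlib
import json

CACHE_TTL = 300.0  # seconds (5 minutes)

def cache_key(name: str, arguments: dict) -> str:
    # Sort keys so logically identical calls hash identically.
    canonical = json.dumps(arguments, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{name}:{canonical}".encode()).hexdigest()

def lookup(cache: dict, key: str, now: float):
    entry = cache.get(key)
    if entry and now - entry["at"] < CACHE_TTL:
        return entry["result"]
    return None
```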

Tool Limits

Limit | Value | Description
Rate limit | 45/min | Maximum tool calls per minute per project.
Cache TTL | 5 minutes | How long identical tool results are cached.
Auto-fetch count | 2 | Number of top URLs auto-fetched from web_search results.
Max URLs per fetch | 5 | Maximum URLs per fetch_url call.
Max follow links | 3 | Maximum same-domain links to auto-follow in discovery mode.
Max total pages | 8 | Hard ceiling on total pages fetched per fetch_url call.
Character budget | 24,000 | Character budget for multi-URL fetch results, distributed across pages.

Full API Examples

Chat Completions with Research Tools

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {"role": "user", "content": "What are the key differences between Rust and Go for building web servers?"}
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "medium",
      "x_tools": ["web_search", "fetch_url", "code_search"]
    }
  }'

Chat Completions with All Tools Enabled

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {"role": "user", "content": "Compare the async runtimes in Rust and explain the tradeoffs"}
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "high",
      "max_iterations": 8,
      "x_tools": [
        "web_search",
        "fetch_url",
        "code_search",
        "gitlab_code_search",
        "calculator"
      ]
    }
  }'
Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {"role": "user", "content": "Compare the async runtimes in Rust and explain the tradeoffs"}
    ],
    stream=True,
    extra_body={
        "web_search_options": {
            "search_context_size": "high",
            "max_iterations": 8,
            "x_tools": [
                "web_search",
                "fetch_url",
                "code_search",
                "gitlab_code_search",
                "calculator"
            ]
        }
    },
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()
Node.js (OpenAI SDK)
import OpenAI from "openai"; const client = new OpenAI({ baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1", apiKey: "xero_myproject_your_api_key", }); const stream = await client.chat.completions.create({ model: "my-model", messages: [ { role: "user", content: "Compare the async runtimes in Rust and explain the tradeoffs" } ], stream: true, web_search_options: { search_context_size: "high", max_iterations: 8, x_tools: [ "web_search", "fetch_url", "code_search", "gitlab_code_search", "calculator" ], }, }); for await (const chunk of stream) { const content = chunk.choices?.[0]?.delta?.content; if (content) process.stdout.write(content); } console.log();

Responses API with Web Search and File Search

curl
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/responses \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "input": "Compare the uploaded design spec with current best practices for REST API design",
    "stream": true,
    "tools": [
      {
        "type": "web_search_preview",
        "search_context_size": "medium",
        "x_tools": ["web_search", "fetch_url"]
      },
      {
        "type": "file_search",
        "vector_store_ids": ["vs_abc123"]
      }
    ]
  }'
Python (OpenAI SDK)
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

response = client.responses.create(
    model="my-model",
    input="Compare the uploaded design spec with current best practices for REST API design",
    stream=True,
    tools=[
        {
            "type": "web_search_preview",
            "search_context_size": "medium",
            "x_tools": ["web_search", "fetch_url"],
        },
        {
            "type": "file_search",
            "vector_store_ids": ["vs_abc123"],
        },
    ],
)

for event in response:
    if hasattr(event, "type"):
        print(event)
```
Node.js (OpenAI SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

const stream = await client.responses.create({
  model: "my-model",
  input: "Compare the uploaded design spec with current best practices for REST API design",
  stream: true,
  tools: [
    {
      type: "web_search_preview",
      search_context_size: "medium",
      x_tools: ["web_search", "fetch_url"],
    },
    {
      type: "file_search",
      vector_store_ids: ["vs_abc123"],
    },
  ],
});

for await (const event of stream) {
  console.log(event);
}
```

Multi-Tool Conversation Flow

This example shows a typical multi-tool chain where the model searches for information, fetches a page for detail, and uses the calculator to verify a number -- all within a single agentic loop.

curl
```bash
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {
        "role": "user",
        "content": "What is the current price of gold per ounce, and how much would 3.5 troy ounces cost in EUR at today'\''s exchange rate?"
      }
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "medium",
      "max_iterations": 6,
      "x_tools": ["web_search", "fetch_url", "calculator"]
    }
  }'
```
Python (OpenAI SDK)
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

# The model may chain: web_search -> fetch_url -> calculator,
# all server-side in a single request
stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {
            "role": "user",
            "content": "What is the current price of gold per ounce, "
                       "and how much would 3.5 troy ounces cost in EUR "
                       "at today's exchange rate?"
        }
    ],
    stream=True,
    extra_body={
        "web_search_options": {
            "search_context_size": "medium",
            "max_iterations": 6,
            "x_tools": ["web_search", "fetch_url", "calculator"]
        }
    },
)

for chunk in stream:
    raw = chunk.model_dump()

    # Check for research progress events
    if "type" in raw and raw["type"].startswith("x_research."):
        event_type = raw["type"]
        if event_type == "x_research.searching":
            print(f"[Searching] {raw.get('arguments', '')}")
        elif event_type == "x_research.reading":
            print(f"[Reading] {raw.get('arguments', '')}")
        elif event_type == "x_research.calculating":
            print(f"[Calculating] {raw.get('arguments', '')}")
        elif event_type == "x_research.complete":
            print(f"[Research complete] {raw.get('iterations', 0)} iterations, "
                  f"{raw.get('elapsed_ms', 0)}ms")
        continue

    # Normal content chunks
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()
```
Node.js (OpenAI SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

// The model may chain: web_search -> fetch_url -> calculator,
// all server-side in a single request
const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    {
      role: "user",
      content:
        "What is the current price of gold per ounce, " +
        "and how much would 3.5 troy ounces cost in EUR " +
        "at today's exchange rate?",
    }
  ],
  stream: true,
  web_search_options: {
    search_context_size: "medium",
    max_iterations: 6,
    x_tools: ["web_search", "fetch_url", "calculator"],
  },
});

for await (const chunk of stream) {
  const raw = chunk;

  // Check for research progress events
  if (raw.type && raw.type.startsWith("x_research.")) {
    if (raw.type === "x_research.searching") {
      console.log(`[Searching] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.reading") {
      console.log(`[Reading] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.calculating") {
      console.log(`[Calculating] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.complete") {
      console.log(`[Research complete] ${raw.iterations || 0} iterations, ${raw.elapsed_ms || 0}ms`);
    }
    continue;
  }

  // Normal content chunks
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();
```

Deep Think

curl
```bash
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {"role": "user", "content": "Comprehensive analysis of quantum computing advances in 2026"}
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "medium",
      "x_deep_think": true,
      "x_tools": ["web_search", "fetch_url", "code_search", "calculator"]
    }
  }'
```

Python (OpenAI SDK)
```python
import json

from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {"role": "user", "content": "Comprehensive analysis of quantum computing advances in 2026"}
    ],
    stream=True,
    extra_body={
        "web_search_options": {
            "search_context_size": "medium",
            "x_deep_think": True,
            "x_tools": ["web_search", "fetch_url", "code_search", "calculator"]
        }
    },
)

for chunk in stream:
    raw = chunk.model_dump()

    # Check for deep think progress events
    if "type" in raw and raw["type"].startswith("x_deep_think."):
        print(f"[{raw['type']}]", json.dumps(raw, indent=2))
        continue

    # Normal content chunks
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()
```
Node.js (OpenAI SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    { role: "user", content: "Comprehensive analysis of quantum computing advances in 2026" }
  ],
  stream: true,
  web_search_options: {
    search_context_size: "medium",
    x_deep_think: true,
    x_tools: ["web_search", "fetch_url", "code_search", "calculator"],
  },
});

for await (const chunk of stream) {
  const raw = chunk;

  // Check for deep think progress events
  if (raw.type && raw.type.startsWith("x_deep_think.")) {
    console.log(`[${raw.type}]`, JSON.stringify(raw, null, 2));
    continue;
  }

  // Normal content chunks
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();
```

Deep think requests use streaming and emit progress events inline. The final report is streamed as normal content chunks after all sub-tasks complete. The report is also auto-saved as an artifact in the chat interface.
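The examples above repeat the same dispatch logic: route any `x_`-prefixed extension event (`x_research.*`, `x_deep_think.*`, `x_artifact.*`) to a handler and treat everything else as normal content. A minimal sketch of that pattern factored into a reusable generator; `iter_content` and `on_event` are illustrative names, not part of the OpenAI SDK:

```python
def iter_content(stream, on_event=print):
    """Yield normal content text from a chat completion stream,
    passing Xerotier extension events (type starts with "x_") to on_event."""
    for chunk in stream:
        raw = chunk.model_dump()
        event_type = raw.get("type") or ""
        if event_type.startswith("x_"):
            # Progress/artifact event injected by the router, not model content
            on_event(raw)
            continue
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
```

With this helper the consuming loop reduces to `for text in iter_content(stream): print(text, end="")`, and the same callback works for research, deep think, and artifact events alike.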

Artifact Creation Flow

curl
```bash
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {
        "role": "user",
        "content": "Create a Python script that implements a binary search tree with insert, search, and in-order traversal methods."
      }
    ],
    "stream": true
  }'
```
Python (OpenAI SDK)
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

# In the chat context, create_artifact and update_artifact
# are always available. The model will use them when appropriate.
stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {
            "role": "user",
            "content": "Create a Python script that implements a "
                       "binary search tree with insert, search, and "
                       "in-order traversal methods."
        }
    ],
    stream=True,
)

for chunk in stream:
    raw = chunk.model_dump()

    # Check for artifact events
    if "type" in raw and raw["type"].startswith("x_artifact."):
        if raw["type"] == "x_artifact.created":
            print(f"\n[Artifact created] {raw.get('identifier', '')} "
                  f"- {raw.get('title', '')} (v{raw.get('version', 1)})")
        elif raw["type"] == "x_artifact.updated":
            print(f"\n[Artifact updated] {raw.get('identifier', '')} "
                  f"(v{raw.get('version', '')})")
        continue

    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()
```
Node.js (OpenAI SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

// In the chat context, create_artifact and update_artifact
// are always available. The model will use them when appropriate.
const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    {
      role: "user",
      content:
        "Create a Python script that implements a " +
        "binary search tree with insert, search, and " +
        "in-order traversal methods.",
    }
  ],
  stream: true,
});

for await (const chunk of stream) {
  const raw = chunk;

  // Check for artifact events
  if (raw.type && raw.type.startsWith("x_artifact.")) {
    if (raw.type === "x_artifact.created") {
      console.log(`\n[Artifact created] ${raw.identifier || ""} ` +
        `- ${raw.title || ""} (v${raw.version || 1})`);
    } else if (raw.type === "x_artifact.updated") {
      console.log(`\n[Artifact updated] ${raw.identifier || ""} ` +
        `(v${raw.version || ""})`);
    }
    continue;
  }

  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();
```