Server-Side Tools
Tools the router runs itself, with no round-trip to your code. Research, artifacts, project intelligence, memory, and governed execution, all callable mid-completion. Each entry below carries its schema, latency profile, and billing footprint.
The tools fall into six categories:
- Research: tools for gathering information from external sources.
- Content: tools for creating and updating persistent artifacts.
- Interaction: tools for requesting input from the user mid-stream.
- Knowledge: tools for saving, recalling, and searching stored data.
- Project Intelligence: tools for tracking decisions, milestones, and relationships, and for generating project briefings.
- Execution: tools for invoking, polling, and cancelling governed XEM tool invocations.
Tool Summary
| Tool | Category | Description |
|---|---|---|
| x_web_search | Research | Built-in web search |
| x_fetch_url | Research | Fetch and extract text from web pages and PDFs |
| x_code_search | Research | GitHub code search (snippet results) |
| x_repo_overview | Research | Full GitHub repository overview (metadata, languages, README, tree, recent commits) |
| x_repo_read | Research | Read specific files or directory listings from a GitHub repository |
| x_inspect_site | Research | Single-URL HTML and stylesheet inspection for theme, palette, and typography extraction |
| x_calculator | Research | Evaluate mathematical expressions safely |
| x_create_artifact | Content | Create persistent artifacts (code, docs, HTML, SVG, diagrams) |
| x_update_artifact | Content | Update a previously created artifact with new content |
| x_create_mockup | Content | Create a multi-file mockup bundle for visual preview in a sandboxed iframe (direct API only; not advertised to agentic chats) |
| x_add_mockup_file | Content | Incrementally write one file at a time into a mockup bundle keyed by identifier (agentic-mode preferred path) |
| x_update_mockup | Content | Update files in a previously created mockup bundle |
| x_read_mockup | Content | Read mockup bundle contents (manifest, single file, or full inline). 256 KB cap on returned content. |
| x_read_artifact | Content | Read a single artifact's metadata and inline content (1 MiB cap; metadata-only above that) |
| x_list_artifacts | Content | List artifacts visible to the calling workspace (optionally filtered by chat and type) |
| x_ask_user | Interaction | Ask the user a clarifying question with optional structured input |
| x_save_memory | Knowledge | Save facts, preferences, or instructions to per-workspace memory |
| x_recall_memory | Knowledge | Search saved memories using semantic similarity |
| x_forget | Knowledge | Retract a memory, decision, milestone, relation, or artifact created earlier in the session |
| x_workspace_search | Knowledge | Unified semantic search across workspace memories, documents, and artifacts |
| x_track_decision | Project Intelligence | Record structured architectural or design decisions |
| x_query_decisions | Project Intelligence | Search and filter recorded decisions |
| x_track_milestone | Project Intelligence | Record project milestones and goals |
| x_query_timeline | Project Intelligence | Query the project timeline of milestones |
| x_project_brief | Project Intelligence | Generate a project state briefing |
| x_relate | Project Intelligence | Create relationships between workspace entities |
| x_query_graph | Project Intelligence | Return the workspace knowledge graph (decisions + milestones + relations) as one composite payload |
| x_analyze_workspace | Project Intelligence | Synthesize a structured brief of the workspace context (documents, memories, artifacts, chat history) |
| x_exec_invoke | Execution | Invoke a governed XEM tool via the MCP exec adapter; returns the invocation id for polling |
| x_exec_poll | Execution | Poll the lifecycle status of an exec invocation |
| x_exec_cancel | Execution | Cancel an in-flight exec invocation |
| x_exec_invoke_blocking | Execution | Long-lived blocking exec dispatch with streaming progress (MCP-only; 900s timeout) |
Enabling Server-Side Tools
Toggleable research tools (x_web_search,
x_code_search, x_repo_overview,
x_repo_read, x_inspect_site)
must be explicitly opted into via the UI or x_tools.
All other tools are injected automatically:
- x_create_artifact and x_update_artifact: always injected in chat context. x_read_artifact and x_list_artifacts are also chat-visible.
- x_add_mockup_file and x_update_mockup: injected only in agentic mode (when the workspace's agenticByDefault is enabled or the request explicitly opts into agentic mode). x_create_mockup stays registered for direct POST /v1/responses callers that ship their own tools array, but is not advertised to agentic chats. x_read_mockup is in the agentic default tool set.
- x_ask_user: always injected in chat context (dispatched via the chat-completions SSE hook; no server-side tool implementation file).
- x_save_memory, x_recall_memory, and x_forget: always injected in chat context.
- x_workspace_search: always injected in chat context for unified search across workspace content.
- x_calculator and x_fetch_url: always injected (always-on tier).
- x_track_decision, x_query_decisions, x_track_milestone, x_query_timeline, x_project_brief, x_relate, x_query_graph, x_analyze_workspace: always injected (project intelligence tools).
- x_exec_invoke, x_exec_poll, x_exec_cancel: chat-toggleable governed-XEM execution adapter. x_exec_invoke_blocking is MCP-only.
Chat Completions API
Add web_search_options to a standard
/v1/chat/completions request. Use the x_tools
array to select which research tools to enable:
{
"model": "my-model",
"messages": [
{"role": "user", "content": "Search for the latest Rust async patterns"}
],
"stream": true,
"web_search_options": {
"search_context_size": "medium",
"max_iterations": 5,
"x_tools": ["x_web_search", "x_code_search", "x_calculator"]
}
}
Responses API
Include a web_search_preview tool in the
tools array:
{
"model": "my-model",
"input": "Find the latest research on transformer architectures",
"stream": true,
"tools": [
{
"type": "web_search_preview",
"search_context_size": "medium",
"x_tools": ["x_web_search", "x_fetch_url", "x_code_search"]
}
]
}
x_tools Selection
| Behavior | Details |
|---|---|
| Omitted or empty | Defaults to ["x_web_search", "x_fetch_url"]. |
| x_web_search included | x_fetch_url is auto-included for URL follow-up. |
| Invalid names | Silently ignored. Falls back to defaults if all are invalid. |
| x_code_search, x_repo_overview, x_repo_read | Available when GitHub code search is enabled for your endpoint. |
| x_web_search | Available when web search is enabled for your endpoint. Where no search backend is configured, requests for x_web_search are silently dropped. |
| x_calculator | Always available. Evaluates mathematical expressions server-side. |
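The selection rules in the table can be sketched as a small resolver. This is an illustration only: the function name resolve_x_tools and the KNOWN_TOOLS set are assumptions made for this example, and endpoint-level availability (web search or GitHub search being disabled) is not modelled.

```python
DEFAULT_TOOLS = ["x_web_search", "x_fetch_url"]

# Hypothetical allow-list built from the tool summary; the router's real list may differ.
KNOWN_TOOLS = {
    "x_web_search", "x_fetch_url", "x_code_search", "x_repo_overview",
    "x_repo_read", "x_inspect_site", "x_calculator",
}


def resolve_x_tools(requested):
    """Apply the documented x_tools rules to a requested list of tool names."""
    # Invalid names are silently ignored; duplicates are dropped, order is kept.
    valid = list(dict.fromkeys(n for n in (requested or []) if n in KNOWN_TOOLS))
    # Omitted, empty, or all-invalid selections fall back to the defaults.
    if not valid:
        return list(DEFAULT_TOOLS)
    # Selecting web search auto-includes URL fetch for follow-up reads.
    if "x_web_search" in valid and "x_fetch_url" not in valid:
        valid.append("x_fetch_url")
    return valid
```

For instance, a request for x_code_search plus a misspelled name would resolve to x_code_search alone under these rules, while a request containing only misspelled names would resolve to the two defaults.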
Humanize Output
Humanize Output is a workspace policy, not an LLM-callable tool. Every assistant turn produced for a chat in that workspace runs through a deterministic post-synthesis filter before the content is flushed to the chat bubble and persisted. The filter strips AI tells (sycophantic openers, marketing adjectives, em-dash spam, redundant scaffolding, hedge filler) while preserving code, file paths, URLs, numbers, math, and citation markers byte-identical.
The filter is server-side and chat-page-scoped. External
SDKs calling /v1/chat/completions directly
do not pass through it; only chats rendered in the
Xerotier chat-page UI invoke it.
How it works
The filter runs six pipeline stages. Each stage is either pass-through or
safe-abort: if any invariant trips, the original assistant
draft is returned byte-identical and the abort is logged
at .notice with the precise reason.
- Protect-zone tokenizer. Masks code fences, inline code, URLs, file paths, numeric literals, math expressions, and citation tokens behind opaque sentinels so later stages cannot rewrite them.
- Deterministic scrub. Site-ordered rule engine applies ~32 named rules covering openers, wrap-ups, banned phrases, marketing adjectives, transitional cliches, em-dash discipline, header demotion, and tricolon flattening.
- Optional LLM polish. When the workspace's humanize_polish_enabled is on and the scrubbed draft is over 120 characters, sends the text to the workspace's configured polish endpoint (defaults to the workspace small endpoint) with a strict rewrite prompt. Failure (timeout, non-200, parse error) falls through to the next stage with the scrubbed draft; polish never blocks an assistant turn.
- Post-polish re-scrub. Runs the same rule engine again. Polish models occasionally reintroduce tells; the scrubber is the floor.
- Safety net. Aborts and returns the original draft if any sentinel went missing or got duplicated, any banned floor phrase survived, or the masked-text length changed by more than 40% (with a 32-character minimum-length floor).
- Unmask. Replaces every sentinel with its original protected span.
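The mask, scrub, safety-net, and unmask stages can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the platform's implementation: the protect patterns cover only code spans and http(s) URLs, the two scrub rules are invented placeholders, the polish stage is omitted, and the 32-character floor is interpreted here as a minimum denominator for the length-delta check.

```python
import re

# Hypothetical protect-zone patterns: fenced code, inline code, and URLs only.
PROTECT = re.compile(r"```.*?```|`[^`\n]+`|https?://\S+", re.DOTALL)

# Hypothetical scrub rules; the real rule set is not shown in this document.
RULES = [
    (re.compile(r"^(Great question!|Certainly!)\s*", re.IGNORECASE), ""),
    (re.compile(r"\bvery unique\b"), "unique"),
]


def mask(text):
    """Swap protected spans for opaque sentinels; return masked text and spans."""
    spans = []

    def stash(match):
        spans.append(match.group(0))
        return f"\x00{len(spans) - 1}\x00"

    return PROTECT.sub(stash, text), spans


def scrub(masked):
    """Apply each rule in order to the masked text."""
    for pattern, replacement in RULES:
        masked = pattern.sub(replacement, masked)
    return masked


def sentinels_intact(masked, count):
    """Every sentinel must survive exactly once."""
    return all(masked.count(f"\x00{i}\x00") == 1 for i in range(count))


def humanize(draft):
    masked, spans = mask(draft)
    cleaned = scrub(masked)
    # Safety net: fall back to the original draft on sentinel damage
    # or when the masked-text length changed by more than 40%.
    delta = abs(len(cleaned) - len(masked)) / max(len(masked), 32)
    if not sentinels_intact(cleaned, len(spans)) or delta > 0.40:
        return draft
    # Unmask: restore every protected span byte-identical.
    return re.sub(r"\x00(\d+)\x00", lambda m: spans[int(m.group(1))], cleaned)
```

Because the rules only ever see sentinels, text inside a code span is untouched even when it matches a rule that fires elsewhere in the same draft.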
Enabling the policy
Open the chat workspace settings tab and find the Humanize output card under Workspace behaviour. The main switch turns the deterministic scrub on or off for every chat in the workspace. A sub-card exposes:
- LLM polish: when on, the polish pass runs after the deterministic scrub. Higher quality, adds ~1-3 s latency per turn.
- Polish endpoint: the workspace endpoint the polish call targets. Defaults to workspace-small (auto), which selects the same small/cheap endpoint chat-name generation uses.
- Voice anchor: a per-workspace textarea that conditions the polish system prompt on a custom voice register (see below).
Per-chat override
Each chat carries a tri-state override chip in the
model-options panel: Inherit | On | Off.
Inherit defers to the workspace toggle;
On and Off force the filter
for that single chat regardless of the workspace
setting. Use Off when humanization is
interfering with a precise technical answer.
Voice anchoring
The platform ships a built-in default voice block compiled into the frontend service. When the polish sub-toggle is on, the polish system prompt is conditioned on that voice register as "Match this voice: ...". This is what turns "less AI-sounding" into "sounds like this project."
The chat-settings Voice anchor textarea pre-populates from the built-in default when the workspace has not stored an override. Three states are honored:
- Override set (non-empty): the text is sent verbatim as the voice line. Use this to give one workspace a distinct register without affecting others.
- Override set (empty): explicit opt-out. The polish prompt omits the voice line entirely for this workspace.
- No override (default): the built-in default voice block is used. Clear the textarea and save to enter the opt-out state; re-saving the seeded default text installs it as an explicit override.
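The three states map naturally onto an optional string. The sketch below is illustrative: the function name, the placeholder default text, and the choice to treat only an exactly empty string as opt-out are assumptions, not documented behaviour.

```python
from typing import Optional

# Placeholder text; the real built-in default voice block is compiled into the service.
BUILT_IN_VOICE = "Plain, direct, technical."


def voice_line(override: Optional[str], default: str = BUILT_IN_VOICE) -> Optional[str]:
    """Return the voice line for the polish prompt, or None when it is omitted."""
    if override is None:
        # No override stored: use the built-in default voice block.
        return f"Match this voice: {default}"
    if override == "":
        # Override stored but empty: explicit opt-out, no voice line at all.
        return None
    # Non-empty override: sent verbatim.
    return f"Match this voice: {override}"
```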
Notes and limitations:
- The built-in default lives in compiled code, not on disk. Operators who need to change the platform-wide default ship a new frontend release; operators who need a per-workspace variation use the textarea.
- The voice block is injected only when the LLM polish sub-toggle is on. The deterministic scrub runs regardless and does not read the voice block.
- The polish prompt budgets roughly 800 characters for the voice line; longer overrides are passed through and may cost extra prompt tokens.
What is preserved verbatim
The protect-zone tokenizer ensures the following spans pass through every stage byte-identical. If any sentinel is missing or duplicated after polish, the safety net aborts and returns the original draft.
- Fenced code blocks (triple backtick)
- Inline code (single backtick)
- Markdown links and bare URLs (http, https, ws, wss, file)
- Absolute and relative file paths
- Numeric literals (integers, floats, hex, scientific notation, percentages)
- Math expressions ($...$ and $$...$$)
- Citation tokens like [^1]
Operator visibility
A workspace-level Show humanized chip
toggle exposes a small humanized chip
on assistant bubbles whose content went through the
filter. Hidden by default; flip it on when you want
visual confirmation during QA. Hovering the chip
shows whether the message was
"scrub only" or
"scrub + polish".
Admin observability
GET /admin/humanize/stats
returns an admin-gated JSON snapshot covering:
- Per-stage invocation counts (scrub, polish, aborted, skipped).
- Per-reason abort counts (missing_sentinel, duplicated_sentinel, banned_phrase_floor, length_delta, polish_timeout, polish_error).
- Top rules by hit count.
- Polish-stage latency P50 and P95.
Prometheus metrics emitted under the
xerotier_humanize_* namespace
(_invocations_total,
_aborts_total,
_polish_tokens_total,
_rule_hits_total,
_latency_seconds) live on
FrontendMetrics.shared.
Audit trail
Every assistant message produced by a humanized
turn stores the raw pre-humanize draft alongside
the humanized content. The audit column is
chat_messages.pre_humanize_content and
is nullable; it is only populated when the filter
actually ran for that row.
x_web_search
Searches the web using built-in web search. Returns structured
results with titles, URLs, and snippets. The top 2 result URLs
are automatically fetched in parallel and appended as
fetched_pages for immediate full-page context
enrichment in a single tool call.
Web search is a research tool and must be explicitly opted into
via web_search_options (Chat Completions) or a
web_search_preview tool entry (Responses API).
Including x_web_search in x_tools
automatically enables x_fetch_url as well. See
Enabling Server-Side Tools for request
setup, SSE Events Reference for
streaming progress events, and
Rate Limiting & Caching for
limits.
x_web_search Tool Parameters
When the model decides to invoke web search, it generates a tool call with the following argument:
| Parameter | Type | Description |
|---|---|---|
| query (required) | string | The search query to look up on the web. The model formulates this from the user message context. |
Example Tool Call (generated by model)
{
"name": "x_web_search",
"arguments": {
"query": "rust async trait stabilization 2026"
}
}
web_search_options Fields
The fields below are accepted on the web_search_options
object (Chat Completions) and on the
web_search_preview tool object (Responses API).
| Field | Type | Default | Description |
|---|---|---|---|
| search_context_size (optional) | string | "medium" | Controls how much search context is used. One of: low, medium, high. |
| max_iterations (optional) | integer | 5 | Maximum agentic loop iterations. Range 1-10. |
| x_tools (optional) | string[] | ["x_web_search", "x_fetch_url"] | Which research tools to enable. When omitted or empty, defaults to ["x_web_search", "x_fetch_url"]. Including x_web_search auto-includes x_fetch_url. |
| x_deep_think (optional) | boolean | false | Enables the multi-phase Deep Think pipeline. See the Deep Think section. |
| x_deep_think_max_branches (optional) | integer | - | Caps the number of parallel exploration branches when Deep Think is enabled. |
| x_deep_think_max_depth (optional) | integer | - | Caps recursion depth for Deep Think exploration. |
| x_analyst (optional) | boolean | false | Enables the Analyst sub-agent that post-processes research into a structured analysis. |
| x_exec (optional) | boolean | false | Permits the model to invoke server-side execution tools (sandboxed). Subject to project policy. |
| x_exec_approval_policy (optional) | string | "never" | Approval policy for exec tool calls. One of: never, on_request, always. |
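A client can check the documented ranges before sending a request. The helper below is a hypothetical client-side sketch; how the router itself treats out-of-range values (rejecting or clamping) is not stated here, so the exceptions reflect a choice made for this example.

```python
import json

CONTEXT_SIZES = ("low", "medium", "high")


def build_web_search_options(search_context_size="medium", max_iterations=5, x_tools=None):
    """Build a web_search_options object, enforcing the documented value ranges."""
    if search_context_size not in CONTEXT_SIZES:
        raise ValueError(f"search_context_size must be one of {CONTEXT_SIZES}")
    if not 1 <= max_iterations <= 10:
        raise ValueError("max_iterations must be in the range 1-10")
    options = {
        "search_context_size": search_context_size,
        "max_iterations": max_iterations,
    }
    # Leaving x_tools out lets the documented defaults apply.
    if x_tools:
        options["x_tools"] = list(x_tools)
    return options


# Example request body for /v1/chat/completions ("my-model" is a placeholder).
body = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Search for the latest Rust async patterns"}],
    "stream": True,
    "web_search_options": build_web_search_options(x_tools=["x_web_search", "x_code_search"]),
}
payload = json.dumps(body)
```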
Response Format
The tool returns a JSON object with the following fields. The router injects this as a tool-role message back to the model.
| Field | Type | Description |
|---|---|---|
| answer | string | Direct answer from the search engine, if available. May be empty. |
| abstract | string | Summary text from the search engine, if available. May be empty. |
| results | object[] | Array of search result objects, each with title, url, and snippet. |
| fetched_pages | object[] | Array of auto-fetched page content for the top 2 URLs. Each entry has url and content. |
Example Response
{
"answer": "",
"abstract": "",
"results": [
{
"title": "Async Trait Methods Stabilized in Rust 1.85",
"url": "https://blog.rust-lang.org/2026/02/20/async-traits.html",
"snippet": "Rust 1.85 stabilizes async fn in traits, enabling..."
},
{
"title": "Understanding async traits in Rust",
"url": "https://docs.rs/async-trait/latest/guide",
"snippet": "A comprehensive guide to using async trait methods..."
}
],
"fetched_pages": [
{
"url": "https://blog.rust-lang.org/2026/02/20/async-traits.html",
"content": "Announcing Rust 1.85. We are happy to announce that async fn in traits..."
},
{
"url": "https://docs.rs/async-trait/latest/guide",
"content": "Async Trait Guide. This guide covers the fundamentals of async trait..."
}
]
}
Auto-Fetch Enrichment
After returning search results, the router automatically fetches
the top 2 result URLs in parallel and appends the extracted text
as fetched_pages. This gives the model both the
search snippets and full page content in a single tool call,
reducing the number of loop iterations needed.
Each auto-fetched page shares an equal portion of a fixed character budget (currently 12,000 characters across all auto-fetched pages, subject to change). Pages exceeding their share are truncated.
Auto-fetched URLs are also cached. If the model subsequently calls
x_fetch_url for a URL that was already auto-fetched, the
cached result is returned instantly without a network request.
x_fetch_url
Fetches and extracts readable text content from web pages and PDF documents. Supports single-URL fetch, parallel multi-URL fetch, link discovery for site exploration, and automatic same-domain link following.
Parameters
| Parameter | Type | Description |
|---|---|---|
| url (required) | string | A single URL to fetch content from. Must be http or https. |
| urls (optional) | string[] | Multiple URLs to fetch in parallel (max 5). Prefer either this or url rather than both; if both are provided, they are merged and deduplicated. |
| discover_links (optional) | boolean | Extract links from fetched pages for site exploration. Default: false. |
Modes
- Single URL: provide only url without discovery flags. Returns extracted plain text directly.
- Multi-URL: provide a urls array for parallel fetching. Returns a JSON object with a pages array, each entry containing url, content, and error fields.
- Discovery: set discover_links: true to extract links from fetched pages.
SSRF Protection
All URL fetches are protected against Server-Side Request Forgery (SSRF).
Requests to private and internal IP ranges are blocked:
10.x, 172.16-31.x, 192.168.x,
127.x, ::1, fe80::/10,
169.254.x.x.
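The blocked ranges listed above translate directly into CIDR networks. The sketch below uses Python's standard ipaddress module to test an address against them; the function names are invented for this example, and a production check would also need to resolve hostnames and test every resolved address, which this sketch deliberately leaves out.

```python
import ipaddress
from urllib.parse import urlsplit

# CIDR form of the ranges listed in this section.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
        "127.0.0.0/8", "169.254.0.0/16", "::1/128", "fe80::/10",
    )
]


def is_blocked_address(address: str) -> bool:
    """Return True when the IP address falls inside a blocked range."""
    ip = ipaddress.ip_address(address)
    # Membership tests across IPv4/IPv6 simply evaluate to False.
    return any(ip in network for network in BLOCKED_NETWORKS)


def url_targets_blocked_literal(url: str) -> bool:
    """Check URLs whose host is an IP literal; hostnames are not resolved here."""
    host = urlsplit(url).hostname or ""
    try:
        return is_blocked_address(host)
    except ValueError:
        # Not an IP literal. A real check would resolve DNS before deciding.
        return False
```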
Character Budget
Multi-URL fetches share a total budget of 24,000 characters, distributed equally across all successfully fetched pages. Pages exceeding their share are truncated.
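The equal split can be illustrated as follows. This is a sketch under assumptions: the function name is hypothetical, a failed fetch is represented as None, and any unused portion of a short page's share is not redistributed to longer pages, which the text above does not specify either way.

```python
def split_budget(pages, total_budget=24_000):
    """Truncate each successfully fetched page to an equal share of the budget.

    pages maps URL -> extracted text, or None when the fetch failed.
    """
    fetched = {url: text for url, text in pages.items() if text is not None}
    if not fetched:
        return {}
    # Only successful pages count toward the divisor.
    share = total_budget // len(fetched)
    return {url: text[:share] for url, text in fetched.items()}
```

With two successful pages and one failure, each successful page gets 12,000 characters; a page shorter than its share is returned whole.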
Use Cases
- Reading documentation pages in full
- Extracting article content for summarization
- Crawling related pages on the same domain
- Fetching PDF text content
Example: Single Fetch
{
"name": "x_fetch_url",
"arguments": {
"url": "https://docs.example.com/api/authentication"
}
}
Example: Multi-Fetch with Discovery
{
"name": "x_fetch_url",
"arguments": {
"urls": [
"https://docs.example.com/guide/getting-started",
"https://docs.example.com/guide/configuration"
],
"discover_links": true
}
}
Example Multi-Fetch Response
{
"discover_links_enabled": true,
"total_pages": 4,
"pages": [
{
"url": "https://docs.example.com/guide/getting-started",
"content": "Getting Started. Install the SDK with npm install...",
"error": false,
"discovered_links": [
{"url": "https://docs.example.com/guide/authentication", "text": "Authentication"},
{"url": "https://docs.example.com/guide/advanced", "text": "Advanced Usage"}
]
},
{
"url": "https://docs.example.com/guide/configuration",
"content": "Configuration. Set environment variables to customize...",
"error": false,
"discovered_links": []
},
{
"url": "https://docs.example.com/guide/authentication",
"content": "Authentication. Use API keys or OAuth2 tokens...",
"error": false,
"followed_from_discovery": true
},
{
"url": "https://docs.example.com/guide/advanced",
"content": "Advanced Usage. Configure retry policies and timeouts...",
"error": false,
"followed_from_discovery": true
}
]
}
SSE Event
data: {"type":"x_research.reading","name":"x_fetch_url","arguments":"{\"url\":\"https://docs.example.com/api/authentication\"}"}
x_code_search
Searches public code on GitHub via the GitHub Code Search API. Returns
code snippets with repository, path, URL, and full file content for the
top results. To inspect an entire repository (metadata, README, tree)
or read specific files from a repo, use x_repo_overview or
x_repo_read instead.
Parameters
| Parameter | Type | Description |
|---|---|---|
| query (required) | string | GitHub code search query. Supports qualifiers: repo:owner/name, org:name, path:dir/, filename:name, extension:ext, language:lang. |
| limit (optional) | integer | Max results to return (defaults to 5, capped at 20). Full file content is fetched only for the top 3. |
| offset (optional) | integer | Zero-based row offset for pagination. Rounded down to the nearest limit boundary because GitHub paginates in fixed-size pages. |
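The rounding behaviour of offset can be worked through with a short helper. The mapping onto GitHub's per_page and 1-based page parameters is an assumption about how the router translates the call; the function name is hypothetical.

```python
def paginate(limit=5, offset=0):
    """Map limit/offset onto fixed-size pages, rounding offset down to a page boundary."""
    # limit defaults to 5 and is capped at 20.
    limit = max(1, min(limit, 20))
    # Round the offset down to the nearest multiple of limit.
    effective_offset = (max(offset, 0) // limit) * limit
    # GitHub's REST pagination counts pages from 1.
    page = effective_offset // limit + 1
    return {"per_page": limit, "page": page, "effective_offset": effective_offset}
```

For example, limit 5 with offset 12 lands on the third page, which starts at row 10, so rows 10 to 14 come back rather than rows 12 to 16.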
Example
{
"name": "x_code_search",
"arguments": {
"query": "async stream handler repo:apple/swift-nio language:swift",
"limit": 5
}
}
SSE Event
data: {"type":"x_research.code_searching","name":"x_code_search","arguments":"{\"query\":\"async stream handler language:swift\"}"}
x_repo_overview
Returns a full GitHub repository overview: metadata, language breakdown,
README content, top-level directory tree, and recent commits. Use this
first to discover repository structure before reading individual files
with x_repo_read.
Parameters
| Parameter | Type | Description |
|---|---|---|
| repository (required) | string | GitHub repository. Accepts owner/repo or a full URL (e.g. https://github.com/owner/repo). |
Example
{
"name": "x_repo_overview",
"arguments": {
"repository": "vapor/vapor"
}
}
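Callers that want to pass a consistent value can reduce both accepted forms to owner/repo. The helper below is a hypothetical sketch; stripping a trailing .git and ignoring deeper URL paths are choices made for this example and are not documented router behaviour.

```python
from urllib.parse import urlsplit


def normalize_repository(value: str) -> str:
    """Reduce owner/repo or a github.com URL to the owner/repo form."""
    value = value.strip()
    if "://" in value:
        parts = urlsplit(value)
        if parts.hostname not in ("github.com", "www.github.com"):
            raise ValueError("not a GitHub URL")
        segments = [s for s in parts.path.split("/") if s]
    else:
        segments = [s for s in value.split("/") if s]
    if len(segments) < 2:
        raise ValueError("expected owner/repo")
    owner, repo = segments[0], segments[1]
    # Assumption: tolerate clone-style URLs that end in .git.
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"{owner}/{repo}"
```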
x_repo_read
Reads specific files or directory listings from a GitHub repository.
Use x_repo_overview first to discover the structure, then
x_repo_read to fetch the files you need.
Parameters
| Parameter | Type | Description |
|---|---|---|
| repository (required) | string | GitHub repository. Accepts owner/repo or a full URL. |
| path (optional) | string | Single file or directory path to fetch. |
| paths (optional) | string[] | Multiple file or directory paths to fetch in one call. |
Example
{
"name": "x_repo_read",
"arguments": {
"repository": "apple/swift-nio",
"paths": ["Sources/NIOCore/Channel.swift", "Sources/NIOCore/EventLoop.swift"]
}
}
x_inspect_site
Single-URL HTML and stylesheet inspection for extracting site theme metadata: palette, typography, structure, and key element classes. Use when the model needs to mirror a reference design rather than crawl multiple pages.
Parameters
| Parameter | Type | Description |
|---|---|---|
| url (required) | string | The page URL to inspect. |
x_calculator
Evaluates mathematical expressions server-side using a safe recursive-descent parser. No code is executed; only arithmetic, functions, and constants are supported.
Parameters
| Parameter | Type | Description |
|---|---|---|
| expression (required) | string | Mathematical expression to evaluate. Maximum 1,000 characters. |
Supported Operations
| Category | Supported |
|---|---|
| Operators | + - * / ^ |
| Functions | sqrt, log, ln, sin, cos, tan, abs, floor, ceil, round, min, max |
| Constants | pi, e |
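To show what a recursive-descent evaluator for this grammar can look like, here is a compact sketch. It is not the server's parser: the precedence rules (right-associative ^ binding tighter than unary minus), log meaning base 10, and the error behaviour are assumptions chosen to agree with the examples in this section.

```python
import math
import re

TOKEN = re.compile(r"\s*(\d+\.?\d*|\.\d+|[A-Za-z_]+|[-+*/^(),])")
FUNCTIONS = {
    "sqrt": math.sqrt, "log": math.log10, "ln": math.log, "sin": math.sin,
    "cos": math.cos, "tan": math.tan, "abs": abs, "floor": math.floor,
    "ceil": math.ceil, "round": round, "min": min, "max": max,
}
CONSTANTS = {"pi": math.pi, "e": math.e}


def tokenize(expression):
    expression = expression.strip()
    if len(expression) > 1000:
        raise ValueError("expression too long")
    tokens, pos = [], 0
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {token}")
        self.pos += 1
        return token

    def expr(self):  # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            value = value + self.term() if self.take() == "+" else value - self.term()
        return value

    def term(self):  # term := unary (('*' | '/') unary)*
        value = self.unary()
        while self.peek() in ("*", "/"):
            value = value * self.unary() if self.take() == "*" else value / self.unary()
        return value

    def unary(self):  # unary := '-' unary | power
        if self.peek() == "-":
            self.take()
            return -self.unary()
        return self.power()

    def power(self):  # power := atom ('^' unary)?, right-associative
        base = self.atom()
        if self.peek() == "^":
            self.take()
            return base ** self.unary()
        return base

    def atom(self):  # atom := number | constant | func '(' args ')' | '(' expr ')'
        token = self.take()
        if token == "(":
            value = self.expr()
            self.take(")")
            return value
        if token in FUNCTIONS:
            self.take("(")
            args = [self.expr()]
            while self.peek() == ",":
                self.take()
                args.append(self.expr())
            self.take(")")
            return float(FUNCTIONS[token](*args))
        if token in CONSTANTS:
            return CONSTANTS[token]
        if token[0].isdigit() or token[0] == ".":
            return float(token)
        raise ValueError(f"unknown name or misplaced token: {token}")


def evaluate(expression):
    parser = Parser(tokenize(expression))
    result = parser.expr()
    if parser.peek() is not None:
        raise ValueError("unexpected trailing input")
    return result
```

Names are looked up only in the two fixed tables, so anything outside the supported functions and constants is rejected rather than executed.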
Use Cases
- Unit conversions and dimensional analysis
- Financial calculations (compound interest, amortization)
- Scientific computation (trigonometry, logarithms)
- Quick arithmetic the model should not hallucinate
Example Tool Call
{
"name": "x_calculator",
"arguments": {
"expression": "10000 * (1 + 0.05)^3"
}
}
Example Response
{
"expression": "10000 * (1 + 0.05)^3",
"result": 11576.25
}
More Examples
| Expression | Result |
|---|---|
| sqrt(144) + 2^3 | 20.0 |
| sin(pi/2) | 1.0 |
| log(1000) | 3.0 |
| max(42, 17) * min(3, 5) | 126.0 |
| abs(-273.15) + ceil(2.1) | 276.15 |
SSE Event
data: {"type":"x_research.calculating","name":"x_calculator","arguments":"{\"expression\":\"10000 * (1 + 0.05)^3\"}"}
x_create_artifact
Creates a persistent artifact such as a code file, document, HTML page, SVG image, or Mermaid diagram. Artifacts are stored in the chat and can be viewed, downloaded, and updated. This tool is always injected in chat context; no opt-in is required.
Parameters
| Parameter | Type | Description |
|---|---|---|
| identifier (required) | string | Stable slug for this artifact. 1-128 characters, ASCII alphanumeric, hyphens, underscores, and dots only. Used to reference the artifact for later updates. |
| title (required) | string | Human-readable display title for the artifact. |
| type (required) | string | MIME content type for the artifact content. |
| language (optional) | string | Programming language for syntax highlighting (e.g. python, swift, javascript). Omit for non-code artifacts. |
| content (required) | string | The full content of the artifact. Maximum 1 MB. |
Supported MIME Types
| MIME Type | Use For |
|---|---|
| text/markdown | Markdown documents, reports, notes |
| text/html | HTML pages, interactive previews |
| text/plain | Plain text files, configuration |
| image/svg+xml | SVG vector graphics |
| text/x-mermaid | Mermaid diagram definitions |
| application/json | JSON data files |
| text/x-{language} | Code files (e.g. text/x-python, text/x-swift) |
Identifier Rules
- 1-128 characters long
- ASCII alphanumeric characters, hyphens, underscores, and dots only
- Slug format, lowercase recommended (e.g. my-component.tsx, data-pipeline)
- Must be unique within the chat for creation; reuse the same identifier with x_update_artifact
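The character and length rules fit a single regular expression. The sketch below checks only those two rules; the per-chat uniqueness rule needs server state and is outside what a client-side check can cover. The function name is made up for this example.

```python
import re

# 1-128 characters: ASCII letters, digits, dots, underscores, and hyphens.
IDENTIFIER = re.compile(r"[A-Za-z0-9._-]{1,128}")


def is_valid_identifier(identifier: str) -> bool:
    """Return True when the identifier satisfies the length and character rules."""
    return IDENTIFIER.fullmatch(identifier) is not None
```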
Versioning
Creating an artifact sets it to version 1. Subsequent updates via
x_update_artifact increment the version number. All versions
are retained for history.
Use Cases
- Generating code files with syntax highlighting
- Creating documentation and technical reports
- Building interactive HTML previews
- Rendering architecture diagrams with Mermaid
Example: Code File
{
"name": "x_create_artifact",
"arguments": {
"identifier": "fibonacci.py",
"title": "Fibonacci Generator",
"type": "text/x-python",
"language": "python",
"content": "def fibonacci(n: int) -> list[int]:\n \"\"\"Generate the first n Fibonacci numbers.\"\"\"\n if n <= 0:\n return []\n if n == 1:\n return [0]\n seq = [0, 1]\n for _ in range(2, n):\n seq.append(seq[-1] + seq[-2])\n return seq\n\nif __name__ == \"__main__\":\n print(fibonacci(10))\n"
}
}
Example: Mermaid Diagram
{
"name": "x_create_artifact",
"arguments": {
"identifier": "auth-flow",
"title": "Authentication Flow",
"type": "text/x-mermaid",
"content": "sequenceDiagram\n participant U as User\n participant A as API Gateway\n participant S as Auth Service\n participant D as Database\n U->>A: POST /login\n A->>S: Validate credentials\n S->>D: Query user\n D-->>S: User record\n S-->>A: JWT token\n A-->>U: 200 OK + token\n"
}
}
SSE Event
data: {"type":"x_artifact.created","identifier":"fibonacci.py","title":"Fibonacci Generator","version":1}
x_update_artifact
Updates a previously created artifact with new content, creating a new version. The update is a full replacement: the entire content is replaced, not diffed or patched. This tool is always injected in chat context.
Parameters
| Parameter | Type | Description |
|---|---|---|
| identifier (required) | string | The identifier of the artifact to update. Must match a previously created artifact in this chat. |
| content (required) | string | The complete updated content. Replaces the previous version entirely. |
Use Cases
- Iterating on code based on user feedback
- Refining documents and reports
- Updating diagrams with new components
- Fixing bugs in previously generated code
Example Tool Call
{
"name": "x_update_artifact",
"arguments": {
"identifier": "fibonacci.py",
"content": "from functools import lru_cache\n\n@lru_cache(maxsize=None)\ndef fibonacci(n: int) -> int:\n \"\"\"Return the nth Fibonacci number (memoized).\"\"\"\n if n < 2:\n return n\n return fibonacci(n - 1) + fibonacci(n - 2)\n\ndef fibonacci_sequence(count: int) -> list[int]:\n \"\"\"Generate the first count Fibonacci numbers.\"\"\"\n return [fibonacci(i) for i in range(count)]\n\nif __name__ == \"__main__\":\n print(fibonacci_sequence(10))\n"
}
}
SSE Event
data: {"type":"x_artifact.updated","identifier":"fibonacci.py","version":2}
x_create_mockup
Creates a multi-file mockup bundle (HTML, CSS, JavaScript, and other
static assets) intended for visual preview. Unlike x_create_artifact,
which stores a single file, a mockup bundle groups several related files
that reference each other through relative paths. The bundle is rendered
inside a sandboxed iframe so the model can demonstrate visual designs and
interactive prototypes without exposing the host page to the bundle's
scripts.
Availability: x_create_mockup stays
registered on the router but is only surfaced to direct
POST /v1/responses callers that ship their own
tools array. The agentic chat surface advertises
x_add_mockup_file (incremental, one file per call) and
x_update_mockup (delete + bulk-replace against an existing
bundle) instead, and only when the workspace has
agenticByDefault enabled or the request explicitly opts
into agentic mode.
Parameters
| Parameter | Type | Description |
|---|---|---|
| identifier (required) | string | Stable slug for this bundle. 1-128 characters, ASCII alphanumeric, hyphens, underscores, and dots only. Used to reference the bundle for later updates. |
| title (required) | string | Human-readable display title for the mockup. |
| entry (required) | string | Relative path within the bundle of the entry document loaded by the preview iframe (e.g. index.html). Must match one of the supplied file paths. |
| files (required) | array | Array of file objects that make up the bundle. See file object schema below. At least one file is required and one of them must match entry. |
File Object Schema
| Field | Type | Description |
|---|---|---|
| path (required) | string | Relative path within the bundle (e.g. index.html, css/main.css, js/app.js). Must not start with / or contain .. segments. |
| content (required) | string | The file's content. Plain text for text files; base64 for binary assets when encoding is base64. |
| encoding (optional) | string | Either utf8 (default) or base64. Use base64 for binary assets such as images. |
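The path and entry constraints can be checked before a bundle is submitted. The helpers below are a hypothetical client-side sketch covering only the rules stated in the two tables above; rejecting empty paths is an extra assumption.

```python
def is_valid_bundle_path(path: str) -> bool:
    """A bundle path must be relative and must not contain '..' segments."""
    if not path or path.startswith("/"):
        return False
    return ".." not in path.split("/")


def is_valid_bundle(entry: str, files: list) -> bool:
    """At least one file, every path valid, and entry matching one supplied path."""
    paths = [f["path"] for f in files]
    return bool(paths) and all(is_valid_bundle_path(p) for p in paths) and entry in paths
```

Note that the check looks at whole segments, so a file name such as notes..txt passes while ../secret and css/../../x do not.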
Bundle Preview URL
Each persisted file in a bundle is reachable via:
GET /v1/mockups/{bundleId}/{path}
The bundleId is returned in the x_mockup.created
event (see the streaming docs). The route is intended for sandboxed-iframe
rendering only; responses include a strict Content-Security-Policy and
are not meant to be embedded in unsandboxed contexts.
Use Cases
- Demonstrating a multi-file UI design (HTML + CSS + JS) inside the chat
- Building interactive prototypes with multiple linked pages
- Sharing a small static site preview without leaving the chat
Example Tool Call
{
"name": "x_create_mockup",
"arguments": {
"identifier": "landing-page",
"title": "Landing Page Mockup",
"entry": "index.html",
"files": [
{
"path": "index.html",
"content": "<!doctype html>\n<html>\n<head>\n <link rel=\"stylesheet\" href=\"css/main.css\">\n</head>\n<body>\n <h1>Hello</h1>\n <script src=\"js/app.js\"></script>\n</body>\n</html>\n"
},
{
"path": "css/main.css",
"content": "body { font-family: system-ui; margin: 2rem; }\nh1 { color: #2b6cb0; }\n"
},
{
"path": "js/app.js",
"content": "console.log('mockup loaded');\n"
}
]
}
}
SSE Event
data: {"type":"x_mockup.created","bundleId":"mkp_8f3k2m","identifier":"landing-page","title":"Landing Page Mockup","entry":"index.html","files":[{"path":"index.html","contentType":"text/html","size":182},{"path":"css/main.css","contentType":"text/css","size":74},{"path":"js/app.js","contentType":"application/javascript","size":28}]}
x_add_mockup_file
Writes a single file into a mockup bundle keyed by
identifier. The first call for a new identifier
creates the bundle (title is required on that call);
subsequent calls with the same identifier add or replace files in
that bundle. Each call carries one file's worth of content, so the
tool-call arguments JSON stays small and survives
output-token cuts that routinely truncate create_mockup's
atomic "every file in one call" payload.
Availability: x_add_mockup_file is the
preferred mockup-authoring path in agentic mode. It is advertised
only when the workspace has agenticByDefault enabled or
the request explicitly opts into agentic mode, and only when the
caller leaves the corresponding chat-settings toggle on.
Parameters
| Parameter | Type | Description |
|---|---|---|
| identifierrequired | string | Stable slug for the bundle. Reuse the same value across calls so files accumulate into one bundle. |
| titleoptional | string | Human-readable display title. Required on the first call for a given identifier; ignored on subsequent calls. |
| pathrequired | string | Relative path within the bundle (e.g. index.html, css/main.css, js/app.js). Must not start with / or contain .. segments. |
| contentrequired | string | The file's content. Plain text for text files; base64 for binary assets when encoding is base64. |
| encodingoptional | string | Either utf8 (default) or base64. Use base64 for binary assets such as images. |
| entryoptional | string | Entry document loaded by the preview iframe. Defaults to index.html on the first call. Honored on the first call only. |
Use Cases
- Any mockup whose files contain HTML, CSS, JavaScript, or other quote-laden content (avoids parser truncation that breaks create_mockup).
- Multi-file bundles built up over several turns of a conversation.
- Adding a file to an existing bundle without rewriting the whole bundle.
Example Tool Calls
Two calls sharing the same identifier assemble one bundle.
The first call creates the bundle and supplies title; the
second call adds another file.
{
"name": "x_add_mockup_file",
"arguments": {
"identifier": "landing-page",
"title": "Landing Page Mockup",
"path": "index.html",
"content": "<!doctype html>\n<html>\n<head>\n <link rel=\"stylesheet\" href=\"css/main.css\">\n</head>\n<body>\n <h1>Hello</h1>\n</body>\n</html>\n"
}
}
{
"name": "x_add_mockup_file",
"arguments": {
"identifier": "landing-page",
"path": "css/main.css",
"content": "body { font-family: system-ui; margin: 2rem; }\nh1 { color: #2b6cb0; }\n"
}
}
SSE Events
The first call for a new identifier emits x_mockup.created;
subsequent calls against the same identifier emit
x_mockup.updated with the changed paths. Failures emit
x_mockup.error. Payload schemas are documented in the
streaming reference.
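The one-file-per-call pattern can be sketched as a small helper that turns a path-to-content mapping into a sequence of tool calls. The function name add_file_calls is hypothetical and this is an illustration of the call shape only, not how the router or model builds calls. title and entry are attached to the first call only, which is where the parameter table says they are honored.

```python
import json


def add_file_calls(identifier, title, files, entry=None):
    """Return one x_add_mockup_file tool call per file, in insertion order.

    files maps relative path -> text content. title (and entry, if given)
    go on the first call only.
    """
    calls = []
    for index, (path, content) in enumerate(files.items()):
        arguments = {"identifier": identifier, "path": path, "content": content}
        if index == 0:
            arguments["title"] = title
            if entry is not None:
                arguments["entry"] = entry
        calls.append({"name": "x_add_mockup_file", "arguments": arguments})
    return calls


calls = add_file_calls(
    "landing-page",
    "Landing Page Mockup",
    {
        "index.html": "<!doctype html>\n<h1>Hello</h1>\n",
        "css/main.css": "h1 { color: #2b6cb0; }\n",
    },
)
for call in calls:
    print(json.dumps(call))
```

Each serialized call carries a single file, so a truncated call loses at most that file rather than the whole bundle.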
x_update_mockup
Updates an existing mockup bundle. Files listed in files are
added or replaced; paths listed in delete are removed. Files
that are not mentioned remain unchanged. Each updated file gets its
version incremented. x_update_mockup is advertised only in
agentic mode, alongside x_add_mockup_file.
Parameters
| Parameter | Type | Description |
|---|---|---|
| identifieroptional | string | The identifier of a previously created bundle in this chat. One of identifier or bundle_id is required. |
| bundle_idoptional | string | The opaque bundle id returned in x_mockup.created. One of identifier or bundle_id is required. |
| titleoptional | string | New display title. Omit to leave unchanged. |
| entryoptional | string | New entry path. Must match an existing or newly added file. Omit to leave unchanged. |
| filesoptional | array | Files to add or replace. Same schema as create_mockup. |
| deleteoptional | array | Array of relative paths to remove from the bundle. The current entry path cannot be deleted unless entry is also reassigned. |
Use Cases
- Iterating on a mockup based on user feedback
- Adding a new page or asset to an existing bundle
- Removing files that are no longer part of the design
Example Tool Call
{
"name": "x_update_mockup",
"arguments": {
"identifier": "landing-page",
"files": [
{
"path": "css/main.css",
"content": "body { font-family: system-ui; margin: 2rem; background: #0b1220; color: #f0f4ff; }\nh1 { color: #7aa9ff; }\n"
}
],
"delete": ["js/app.js"]
}
}
SSE Event
data: {"type":"x_mockup.updated","bundleId":"mkp_8f3k2m","identifier":"landing-page","title":"Landing Page Mockup","entry":"index.html","changed":["css/main.css"],"deleted":["js/app.js"]}
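The update rules (add or replace, delete, untouched files unchanged, version bump per updated file, entry protected) can be modelled locally as below. The in-memory bundle shape and the apply_update name are hypothetical, and starting new files at version 1 is an assumption; only the rules themselves come from the description above.

```python
def apply_update(bundle, files=(), delete=(), entry=None):
    """Apply x_update_mockup-style changes to an in-memory bundle.

    bundle is {"entry": str, "files": {path: {"content": str, "version": int}}}.
    Returns a new bundle; the input is left untouched.
    """
    new_entry = entry if entry is not None else bundle["entry"]
    if new_entry in delete:
        raise ValueError("cannot delete the entry file without reassigning entry")
    updated = {path: dict(meta) for path, meta in bundle["files"].items()}
    for item in files:
        previous = updated.get(item["path"], {"version": 0})
        updated[item["path"]] = {
            "content": item["content"],
            "version": previous["version"] + 1,  # each updated file gets a bump
        }
    for path in delete:
        updated.pop(path, None)
    # The entry must match an existing or newly added file.
    if new_entry not in updated:
        raise ValueError(f"entry {new_entry!r} is not in the bundle")
    return {"entry": new_entry, "files": updated}
```

Applied to the example call above, css/main.css moves to version 2, js/app.js disappears, and index.html keeps its version.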
x_ask_user
Pauses the response and asks the user a clarifying question. The model uses this when ambiguity prevents a useful response. Supports multiple interaction styles: free-text questions, multiple-choice options, confirmation prompts, rating scales, and structured form fields. This tool is always injected in chat context.
Parameters
| Parameter | Type | Description |
|---|---|---|
| questionrequired | string | The clarifying question to ask the user. |
| optionsoptional | string[] | List of suggested answers for the user to pick from. |
| allow_free_textoptional | boolean | Whether the user can type a free-text answer in addition to picking an option. Default: false when the model omits it (see the Notes under SSE Events). |
| multi_selectoptional | boolean | Whether the user can select more than one option. Default: false. |
| styleoptional | string | Card style. One of: default, confirm, rating, form. Default: default. |
| fieldsoptional | object[] | Form fields for form style cards. Each field has label, key, and optional placeholder and required. |
Fields Sub-Parameters (for form style)
| Field | Type | Description |
|---|---|---|
| labelrequired | string | Display label for the input field. |
| keyrequired | string | Machine-readable key for the field value. |
| placeholderoptional | string | Placeholder text shown in the input. |
| requiredoptional | boolean | Whether the field must be filled. Default: false. |
Card Styles
Default Style
A question with optional multiple-choice options and free-text input.
{
"name": "x_ask_user",
"arguments": {
"question": "Which programming language would you like the example in?",
"options": ["Python", "JavaScript", "Go", "Rust"],
"allow_free_text": true
}
}
Confirm Style
A Yes/No confirmation prompt.
{
"name": "x_ask_user",
"arguments": {
"question": "This will overwrite the existing configuration. Proceed?",
"style": "confirm"
}
}
Rating Style
A 1-5 rating scale.
{
"name": "x_ask_user",
"arguments": {
"question": "How satisfied are you with this solution?",
"style": "rating"
}
}
Form Style
Labeled input fields for collecting structured data.
{
"name": "x_ask_user",
"arguments": {
"question": "Please provide the database connection details:",
"style": "form",
"fields": [
{"label": "Host", "key": "host", "placeholder": "localhost", "required": true},
{"label": "Port", "key": "port", "placeholder": "5432", "required": true},
{"label": "Database Name", "key": "db_name", "placeholder": "myapp_production", "required": true},
{"label": "Username", "key": "username", "placeholder": "admin"},
{"label": "Password", "key": "password", "placeholder": "********"}
]
}
}
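On the client side, a form card's answers can be checked against the fields definition before they are submitted. This is a sketch under assumptions: the missing_required helper and the shape of the answers dict are hypothetical, since how answers are returned is not covered here. Only the field keys (label, key, placeholder, required) come from the table above.

```python
def missing_required(fields, answers):
    """Return keys of required form fields that have no non-blank answer.

    fields follows the documented sub-parameters; answers is a
    hypothetical {key: value} dict collected by your UI.
    """
    missing = []
    for field in fields:
        required = field.get("required", False)  # documented default: false
        value = str(answers.get(field["key"], "")).strip()
        if required and not value:
            missing.append(field["key"])
    return missing
```

With the database form above, answers of {"host": "localhost", "port": " "} would report port and db_name as missing, while the optional username and password fields are never reported.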
Use Cases
- Disambiguating vague requirements before generating code
- Collecting structured configuration input
- Getting user confirmation before destructive operations
- Offering choices when multiple valid approaches exist
SSE Events
The chat-completions SSE hook converts an x_ask_user tool
call into the events below. All field names are camelCase, matching
the emitter in
ChatCompletionsArtifactPostSynthesisHook:
data: {"type":"x_ask_user.question","askUserId":"ask_abc123","toolCallId":"call_42","question":"Which programming language?","options":["Python","JavaScript","Go","Rust"],"allowFreeText":true,"multiSelect":false,"style":"default","fields":[]}
data: {"type":"x_ask_user.pending_state","askUserId":"ask_abc123","toolCallId":"call_42","assistantContent":"...partial assistant text...","toolCalls":[]}
Notes: x_ask_user has no server-tool
implementation file; dispatch is SSE-only via the chat-completions
hook. Argument defaults derive from
optionalBool/optionalString reads, so
allow_free_text defaults to false when the
model does not set it, despite the convenience example above.
x_save_memory
Saves a fact, preference, or instruction to the per-workspace memory store. The content is embedded as a vector and persisted to the database for future semantic recall. Near-duplicate content is automatically deduplicated via embedding similarity comparison. This tool is always injected in chat context and requires a chat ID.
Parameters
| Parameter | Type | Description |
|---|---|---|
| contentrequired | string | The fact, preference, or instruction to save to memory. |
| categoryoptional | string | Category of the memory entry. One of: preference, fact, instruction. When omitted, the system infers the category from the content. |
Deduplication
When saving, the system checks for existing memories with very high semantic similarity. If a near-duplicate is found, the save is skipped and the existing memory is returned instead. This prevents the memory store from accumulating redundant entries.
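The check can be pictured as a cosine-similarity comparison against stored embeddings. The sketch below is illustrative only: the 0.95 threshold, the list-based store, the "skipped" status label, and the function names are all assumptions, not the service's actual cutoff or storage layer.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def save_with_dedup(store, content, embedding, threshold=0.95):
    """Append a memory unless a stored one is at least `threshold` similar.

    store is a list of {"content", "embedding"} dicts. The threshold is a
    placeholder value chosen for illustration.
    """
    for existing in store:
        if cosine(existing["embedding"], embedding) >= threshold:
            return "skipped", existing  # near-duplicate: return the old entry
    entry = {"content": content, "embedding": embedding}
    store.append(entry)
    return "saved", entry
```

A second save whose embedding points in almost the same direction as an existing one returns the existing entry and leaves the store unchanged.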
Use Cases
- Storing user formatting preferences
- Remembering project details and deadlines
- Saving coding style instructions
- Persisting domain-specific knowledge for the conversation
Example Tool Call
{
"name": "x_save_memory",
"arguments": {
"content": "User prefers TypeScript with strict mode enabled for all code examples",
"category": "preference"
}
}
Example Response
{
"status": "saved",
"memory_id": "mem_x7k9p2",
"content": "User prefers TypeScript with strict mode enabled for all code examples",
"category": "preference"
}
For full details on memory architecture, passive injection, and the management API, see Chat Memory.
x_recall_memory
Searches the per-workspace memory store using semantic similarity. Returns the top matching memories ranked by relevance score. This tool is always injected in chat context and requires a chat ID.
Parameters
| Parameter | Type | Description |
|---|---|---|
| queryrequired | string | Natural language query to search memories against. |
| limitoptional | integer | Max memories to return. |
| offsetoptional | integer | Zero-based row offset for pagination. |
Return Value
Returns a ranked markdown list of matching memories.
Use Cases
- Retrieving user preferences before generating output
- Looking up previously mentioned facts and context
- Checking for saved instructions before starting a task
Example Tool Call
{
"name": "x_recall_memory",
"arguments": {
"query": "What programming language does the user prefer?"
}
}
Example Response
{
"memories": [
{
"id": "mem_x7k9p2",
"content": "User prefers TypeScript with strict mode enabled for all code examples",
"category": "preference",
"relevance": 0.94,
"created_at": "2026-03-09T14:22:00Z"
},
{
"id": "mem_r3m8q1",
"content": "Always use ESLint with the recommended ruleset",
"category": "instruction",
"relevance": 0.78,
"created_at": "2026-03-09T14:18:00Z"
}
]
}
For full details on memory architecture, passive injection, and the management API, see Chat Memory.
x_forget
Retracts a memory, decision, milestone, or artifact the model created earlier in the session. The delete is soft: the entity remains visible in the app and the CLI and can be restored there. Use this for model self-correction only. User-requested deletion is done in the app or CLI, not via this tool.
Parameters
| Parameter | Type | Description |
|---|---|---|
| typerequired | string | Entity type to retract. One of: memory, decision, milestone, artifact. |
| idrequired | string | The handle the original tool returned: the memory UUID for memory, the dec_ id for decision, the mst_ id for milestone, or the identifier slug for artifact. |
Example Tool Call
{
"name": "x_forget",
"arguments": {
"type": "decision",
"id": "dec_a1b2c3d4e5f6g7h8"
}
}
Agentic Loop
When server-side tools are enabled, Xerotier uses an agentic loop to iteratively call tools and feed results back to the model. This loop is the foundational mechanism that powers all server-side tool execution, including research workflows and Deep Think.
How the Loop Works
1. Model response, The model generates a response that may include one or more tool calls.
2. Tool execution, Xerotier executes the requested tools server-side (e.g., x_web_search, x_fetch_url, x_code_search).
3. Result injection, Tool results are appended to the conversation as tool-role messages.
4. Follow-up, The model generates another response based on the tool results. This response may include additional tool calls.
5. Termination, Steps 2-4 repeat until the model produces a final text response (no tool calls) or the iteration limit is reached.
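The steps above can be sketched as a plain loop. This is a simplified model for intuition, not the router's implementation: generate and execute_tool are hypothetical stand-ins for the model call and the server-side tool runner, and the message shapes are assumptions. The cap of 5 is the per-request limit documented under Limits.

```python
MAX_ITERATIONS = 5  # documented per-request cap


def run_agentic_loop(messages, generate, execute_tool, max_iterations=MAX_ITERATIONS):
    """Drive model -> tools -> model until a tool-free reply or the cap.

    generate(messages, allow_tools) returns {"content": str, "tool_calls": [...]}.
    execute_tool(call) returns a string result. Both are hypothetical.
    """
    messages = list(messages)
    for _ in range(max_iterations):
        reply = generate(messages, allow_tools=True)
        messages.append({"role": "assistant", **reply})
        if not reply["tool_calls"]:
            return reply["content"], messages  # final text response
        for call in reply["tool_calls"]:
            # Result injection: one tool-role message per tool call.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": execute_tool(call),
            })
    # Cap reached: ask for a final answer from what was gathered so far.
    final = generate(messages, allow_tools=False)
    messages.append({"role": "assistant", **final})
    return final["content"], messages
```

If the first reply contains no tool calls, the loop returns after a single iteration, which matches the single-iteration behavior described for requests where the model invokes no tools.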
Limits
| Limit | Value | Description |
|---|---|---|
| Maximum iterations per request | 5 | The loop runs at most 5 tool-call rounds before forcing a final response. |
| Per-tool execution timeout | 15 seconds | Each individual tool call must complete within 15 seconds or it is cancelled. |
| Tool call rate limit | 45 / minute | Maximum of 45 tool calls per minute per endpoint to prevent abuse. |
Automatic Behavior
The agentic loop runs automatically whenever tools such as
x_web_search, x_fetch_url,
x_code_search, or x_calculator are enabled
via web_search_options or the
web_search_preview tool type. No additional client-side
configuration is required; the router handles all tool execution,
result marshalling, and follow-up model calls transparently.
If the model does not invoke any tools, the loop completes in a single iteration and the response is returned directly. When the maximum iteration limit is reached, the model is prompted to produce a final answer using the information gathered so far.
Note: The Deep Think feature extends this agentic loop with a multi-phase planning and synthesis layer on top. Each Deep Think sub-task runs its own agentic loop independently.
Deep Think
Deep Think performs extended, multi-phase autonomous research on a query. The system decomposes the question into sub-tasks, executes each independently through the agentic loop, and synthesizes all findings into a comprehensive report. During execution the SSE stream emits progress events so clients can display real-time status for each sub-task.
How It Works
1. Planning, The model decomposes the user query into focused sub-tasks (up to 10 by default), each with a specific search question and tool set.
2. Execution, Each sub-task runs through the agentic loop sequentially. All research tools (x_web_search, x_fetch_url, x_code_search, x_repo_overview, x_repo_read, x_calculator) are available per sub-task.
3. Synthesis, All sub-task results are combined and the model produces a single comprehensive report streamed as normal SSE content chunks.
Enabling Deep Think
Add x_deep_think: true to the
web_search_options object. All research tools are
automatically enabled for deep think requests.
{
"model": "my-model",
"messages": [
{"role": "user", "content": "Comprehensive analysis of quantum computing advances in 2026"}
],
"stream": true,
"web_search_options": {
"search_context_size": "medium",
"x_deep_think": true,
"x_tools": ["x_web_search", "x_fetch_url", "x_code_search", "x_calculator"]
}
}
web_search_options Fields
| Field | Type | Default | Description |
|---|---|---|---|
| x_deep_thinkoptional | boolean | false | When true, activates the multi-phase deep think pipeline instead of the single agentic loop. |
Deep Think Lifecycle
The SSE stream for a deep think request follows this sequence.
Caveat: the x_deep_think.completed,
x_deep_think.error, and
x_deep_think.artifact_created event types are declared in
DeepThinkModels but are dashboard-synthesised or not
currently emitted on the public
/v1/chat/completions stream. Treat them as forward-looking
contract.
1. x_deep_think.plan_created , plan ready, includes title + sub-task count
2. x_deep_think.subtask_started , sub-task N begins (repeated per sub-task)
x_research.searching / x_research.reading , tool-level events within sub-task
3. x_deep_think.subtask_completed, sub-task N finished
... (repeat 2-3 for each sub-task) ...
4. x_deep_think.synthesizing , synthesis phase begins
data: {"choices":[...]} , normal SSE content chunks (final report)
5. x_deep_think.completed , pipeline finished, includes artifact info
6. data: [DONE] , stream terminates
Limits
| Limit | Value | Description |
|---|---|---|
| Max sub-tasks | 10 | Maximum sub-tasks the planner can create per request. |
| Sub-task iterations | 5 | Maximum agentic loop iterations per sub-task. |
Deep Think Events (SSE)
Deep think progress events are emitted as inline SSE data lines with a
type field prefixed by x_deep_think.. Clients
should check for this prefix and render progress UI accordingly.
| Event Type | Fields | Description |
|---|---|---|
| x_deep_think.plan_created | title, total_subtasks | Emitted after the planning phase. Contains the research title and number of sub-tasks. |
| x_deep_think.subtask_started | subtask_id, subtask_query, subtask_index, total_subtasks | Emitted when a sub-task begins execution. |
| x_deep_think.subtask_completed | subtask_id, subtask_index | Emitted when a sub-task finishes. |
| x_deep_think.synthesizing | message | Emitted when synthesis begins. Content chunks follow as normal SSE. |
| x_deep_think.completed | title, artifact_name | Emitted when the pipeline finishes. The report is auto-saved as an artifact. |
| x_deep_think.error | message | Emitted if a fatal error occurs during deep think. |
| x_deep_think.discovery_started | mode, message | Emitted when the discovery phase begins (target-focused mode only). |
| x_deep_think.discovery_completed | pages_fetched, site_map, message | Emitted when the discovery phase completes. Includes page count and site structure summary. |
| x_deep_think.artifact_created | artifact_type, artifact_title, message | Emitted when a structured artifact (table, matrix, findings list) is created from sub-task results. |
Example Event Stream
data: {"type":"x_deep_think.plan_created","title":"Quantum Computing Advances","total_subtasks":4}
data: {"type":"x_deep_think.subtask_started","subtask_id":"1","subtask_query":"Latest quantum error correction breakthroughs","subtask_index":0,"total_subtasks":4}
data: {"type":"x_research.searching","name":"x_web_search","arguments":"{\"query\":\"quantum error correction 2026\"}"}
data: {"type":"x_deep_think.subtask_completed","subtask_id":"1","subtask_index":0}
data: {"type":"x_deep_think.subtask_started","subtask_id":"2","subtask_query":"Major quantum hardware milestones","subtask_index":1,"total_subtasks":4}
...
data: {"type":"x_deep_think.synthesizing","message":"Synthesizing final report..."}
data: {"choices":[{"index":0,"delta":{"content":"# Quantum Computing Advances\n\n"},"finish_reason":null}]}
...
data: {"type":"x_deep_think.completed","title":"Quantum Computing Advances","artifact_name":"deep-think-20260227-143012.md"}
data: [DONE]
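A client can fold these events into a small progress summary for display. The sketch below assumes each SSE data line has already been decoded into a dict; the track_deep_think name and the summary shape are hypothetical. Because completed and error may not arrive on the public stream (see the caveat in Deep Think Lifecycle), the tracker does not depend on them.

```python
PREFIX = "x_deep_think."


def track_deep_think(events):
    """Fold x_deep_think.* event dicts into a progress summary."""
    progress = {"title": None, "total": 0, "done": 0, "phase": "planning"}
    for event in events:
        kind = event.get("type") or ""
        if not kind.startswith(PREFIX):
            continue  # content chunks and x_research.* events are skipped
        name = kind[len(PREFIX):]
        if name == "plan_created":
            progress["title"] = event["title"]
            progress["total"] = event["total_subtasks"]
            progress["phase"] = "executing"
        elif name == "subtask_completed":
            progress["done"] += 1
        elif name == "synthesizing":
            progress["phase"] = "synthesizing"
        elif name in ("completed", "error"):
            # Optional on the public stream; handled if present.
            progress["phase"] = name
    return progress
```

Treating the stream's [DONE] sentinel, rather than x_deep_think.completed, as the end of the run keeps the client correct whether or not the completion event is emitted.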
SSE Events Reference
When streaming, server-side tool execution emits inline SSE events so
clients can display progress indicators. All vendor-specific events use
the x_ prefix for OpenAI spec compliance.
Research Events
| Event Type | Tool | Description |
|---|---|---|
| x_research.searching | x_web_search | Emitted when a web search begins execution. |
| x_research.reading | x_fetch_url | Emitted when a URL fetch begins execution. |
| x_research.code_searching | x_code_search | Emitted when a code search begins execution. |
| x_research.calculating | x_calculator | Emitted when a calculator evaluation begins. |
| x_research.tool_call | Any | Emitted for tool invocations that do not match a specific research event type. Includes tool_name, message. |
| x_research.result | All research tools | Emitted when a tool call completes with its result. |
| x_research.complete | -- | Declared in the event vocabulary and consumed by the chat dashboard, but not currently emitted by any router-side code path on the public /v1/chat/completions stream. Documented for future parity; do not depend on it from public-API clients yet. |
Artifact Events
| Event Type | Description |
|---|---|
| x_artifact.created | Emitted when a new artifact is created. Includes identifier, title, version. |
| x_artifact.updated | Emitted when an existing artifact is updated. Includes identifier, version. |
Ask User Events
| Event Type | Description |
|---|---|
| x_ask_user.question | Emitted when the model asks a clarifying question. CamelCase fields: askUserId, toolCallId, question, options, allowFreeText, multiSelect, style, fields. |
| x_ask_user.pending_state | Emitted with the assistant's partial content and tool state while awaiting user response. Fields: askUserId, toolCallId, assistantContent, toolCalls. |
Deep Think Events
| Event Type | Description |
|---|---|
| x_deep_think.plan_created | Emitted after the planning phase. Includes title, total_subtasks. |
| x_deep_think.subtask_started | Emitted when a sub-task begins. Includes subtask_id, subtask_query, subtask_index, total_subtasks. |
| x_deep_think.subtask_completed | Emitted when a sub-task finishes. Includes subtask_id, subtask_index. |
| x_deep_think.synthesizing | Emitted when synthesis begins. Includes message. |
| x_deep_think.completed | Emitted when the pipeline finishes. Includes title, artifact_name. |
| x_deep_think.error | Emitted if a fatal error occurs. Includes message. |
| x_deep_think.discovery_started | Emitted when discovery phase begins (target-focused mode). Includes mode, message. |
| x_deep_think.discovery_completed | Emitted when discovery phase completes. Includes pages_fetched, site_map, message. |
| x_deep_think.artifact_created | Emitted when a structured artifact is created. Includes artifact_type, artifact_title, message. |
File Search Events (Responses API)
Note: the OpenAI-compatible response.file_search_call.*
event types below are declared in the router's event vocabulary but
are not currently emitted by any router-side code path on the public
/v1/chat/completions stream. They are reserved for future
parity with the Responses API and should not be relied upon by
client integrations yet.
| Event Type | Description |
|---|---|
| response.file_search_call.in_progress | Emitted when a server-side file search invocation begins. Includes item_id, output_index. |
| response.file_search_call.searching | Emitted while the file search is actively executing. |
| response.file_search_call.completed | Emitted when the file search finishes. Includes the completed output item with results. |
Chat Metadata Events
| Event Type | Description |
|---|---|
| x_chat.metadata | Emitted at the end of a response stream with aggregated usage metadata including research token counts. Includes usage (object with research token breakdown). |
Example SSE Stream
data: {"type":"x_research.searching","name":"x_web_search","arguments":"{\"query\":\"rust async patterns 2026\"}"}
data: {"type":"x_research.result","name":"x_web_search","tool_call_id":"call_1"}
data: {"type":"x_research.reading","name":"x_fetch_url","arguments":"{\"url\":\"https://blog.rust-lang.org/...\"}"}
data: {"type":"x_research.result","name":"x_fetch_url","tool_call_id":"call_2"}
data: {"type":"x_research.calculating","name":"x_calculator","arguments":"{\"expression\":\"2^32\"}"}
data: {"type":"x_research.result","name":"x_calculator","tool_call_id":"call_3"}
data: {"type":"x_research.complete","elapsed_ms":4200,"input_tokens":12500,"output_tokens":850,"iterations":3,"sources":5}
data: {"choices":[{"index":0,"delta":{"content":"Based on my research..."},"finish_reason":null}]}
...
data: [DONE]
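For clients that read the stream without an SDK, each data line can be sorted into a vendor event, a content chunk, or the terminator. A minimal sketch follows; the parse_sse_line name and its tuple return shape are hypothetical, and it keys on the x_ type prefix described at the top of this section.

```python
import json


def parse_sse_line(line):
    """Classify one raw SSE line.

    Returns ("event", dict) for x_-prefixed events, ("content", str) for
    text deltas, ("done", None) for the terminator, and None otherwise.
    """
    if not line.startswith("data:"):
        return None  # blank separators, comments, other SSE fields
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return ("done", None)
    data = json.loads(payload)
    if str(data.get("type", "")).startswith("x_"):
        return ("event", data)
    choices = data.get("choices") or []
    text = choices[0].get("delta", {}).get("content") if choices else None
    return ("content", text) if text else None
```

Note that the arguments field of research events is itself a JSON-encoded string, so it needs a second json.loads before its keys can be read.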
Rate Limiting & Caching
Per-Project Rate Limiting
Tool calls are rate limited per project (default 45 calls per minute). When the limit is exceeded, the tool returns an error response instead of executing.
Rate Limit Error Response
{
"error": "Research tool rate limit exceeded. Try again in 12 seconds."
}
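If a caller needs a wait time from this error, one option is to read it out of the message text. This is a brittle sketch by design: the message wording is not stated to be a stable contract, so the retry_after_seconds helper (a hypothetical name) falls back to an arbitrary default when no number is found.

```python
import re


def retry_after_seconds(error_body, default=5.0):
    """Pull the wait time out of a rate-limit error message.

    Falls back to `default` (an arbitrary choice) when no number is found.
    """
    message = error_body.get("error", "")
    match = re.search(r"in (\d+(?:\.\d+)?) seconds?", message)
    return float(match.group(1)) if match else default
```

For the example response above, the helper yields a 12-second wait.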
Result Caching
Identical tool calls (same function name and arguments) within a 5-minute
window return cached results without re-executing. This prevents redundant
network calls when the model re-invokes the same search or fetch.
Auto-fetched URLs from x_web_search are also cached, so
subsequent explicit x_fetch_url calls for the same URL are free.
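The caching rule (same function name and arguments within a 5-minute window) can be modelled with a small time-to-live cache. This sketch is an illustration, not the router's code: the ToolResultCache class is hypothetical, and canonicalizing arguments with sorted JSON keys is an assumption about what counts as identical.

```python
import json
import time


class ToolResultCache:
    """Cache tool results keyed by (name, canonical JSON arguments)."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds  # 5 minutes, per the documented window
        self.clock = clock
        self.entries = {}

    @staticmethod
    def key(name, arguments):
        # sort_keys makes differently ordered dicts hit the same entry.
        return name, json.dumps(arguments, sort_keys=True)

    def get_or_run(self, name, arguments, run):
        """Return a fresh cached result, or call run(name, arguments)."""
        key = self.key(name, arguments)
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        result = run(name, arguments)
        self.entries[key] = (now, result)
        return result
```

Injecting the clock keeps the expiry logic testable without sleeping; in normal use the default monotonic clock applies.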
Tool Limits
| Limit | Value | Description |
|---|---|---|
| Rate limit | 45/min | Maximum tool calls per minute per project. |
| Cache TTL | 5 minutes | How long identical tool results are cached. |
| Auto-fetch count | 2 | Number of top URLs auto-fetched from web_search results. |
| Max URLs per fetch | 5 | Maximum URLs per fetch_url call. |
| Max follow links | 3 | Maximum same-domain links to auto-follow in discovery mode. |
| Character budget | 24,000 | Character budget for multi-URL fetch results, distributed across pages. |
Full API Examples
Chat Completions with Research Tools
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"model": "my-model",
"messages": [
{"role": "user", "content": "What are the key differences between Rust and Go for building web servers?"}
],
"stream": true,
"web_search_options": {
"search_context_size": "medium",
"x_tools": ["x_web_search", "x_fetch_url", "x_code_search"]
}
}'
Chat Completions with All Tools Enabled
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"model": "my-model",
"messages": [
{"role": "user", "content": "Compare the async runtimes in Rust and explain the tradeoffs"}
],
"stream": true,
"web_search_options": {
"search_context_size": "high",
"max_iterations": 8,
"x_tools": [
"x_web_search", "x_fetch_url", "x_code_search",
"x_repo_overview", "x_calculator"
]
}
}'
from openai import OpenAI
client = OpenAI(
base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
api_key="xero_myproject_your_api_key",
)
stream = client.chat.completions.create(
model="my-model",
messages=[
{"role": "user", "content": "Compare the async runtimes in Rust and explain the tradeoffs"}
],
stream=True,
extra_body={
"web_search_options": {
"search_context_size": "high",
"max_iterations": 8,
"x_tools": [
"x_web_search", "x_fetch_url", "x_code_search",
"x_repo_overview", "x_calculator"
]
}
},
)
for chunk in stream:
if chunk.choices and chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="")
print()
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
apiKey: "xero_myproject_your_api_key",
});
const stream = await client.chat.completions.create({
model: "my-model",
messages: [
{ role: "user", content: "Compare the async runtimes in Rust and explain the tradeoffs" }
],
stream: true,
web_search_options: {
search_context_size: "high",
max_iterations: 8,
x_tools: [
"x_web_search", "x_fetch_url", "x_code_search",
"x_repo_overview", "x_calculator"
],
},
});
for await (const chunk of stream) {
const content = chunk.choices?.[0]?.delta?.content;
if (content) process.stdout.write(content);
}
console.log();
Responses API with Web Search and File Search
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/responses \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"model": "my-model",
"input": "Compare the uploaded design spec with current best practices for REST API design",
"stream": true,
"tools": [
{
"type": "web_search_preview",
"search_context_size": "medium",
"x_tools": ["x_web_search", "x_fetch_url"]
},
{
"type": "file_search",
"vector_store_ids": ["vs_abc123"]
}
]
}'
from openai import OpenAI
client = OpenAI(
base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
api_key="xero_myproject_your_api_key",
)
response = client.responses.create(
model="my-model",
input="Compare the uploaded design spec with current best practices for REST API design",
stream=True,
tools=[
{
"type": "web_search_preview",
"search_context_size": "medium",
"x_tools": ["x_web_search", "x_fetch_url"],
},
{
"type": "file_search",
"vector_store_ids": ["vs_abc123"],
},
],
)
for event in response:
if hasattr(event, "type"):
print(event)
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
apiKey: "xero_myproject_your_api_key",
});
const stream = await client.responses.create({
model: "my-model",
input: "Compare the uploaded design spec with current best practices for REST API design",
stream: true,
tools: [
{
type: "web_search_preview",
search_context_size: "medium",
x_tools: ["x_web_search", "x_fetch_url"],
},
{
type: "file_search",
vector_store_ids: ["vs_abc123"],
},
],
});
for await (const event of stream) {
console.log(event);
}
Multi-Tool Conversation Flow
This example shows a typical multi-tool chain where the model searches for information, fetches a page for detail, and uses the calculator to verify a number, all within a single agentic loop.
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
-H "Authorization: Bearer xero_myproject_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"model": "my-model",
"messages": [
{
"role": "user",
"content": "What is the current price of gold per ounce, and how much would 3.5 troy ounces cost in EUR at today'\''s exchange rate?"
}
],
"stream": true,
"web_search_options": {
"search_context_size": "medium",
"max_iterations": 6,
"x_tools": ["x_web_search", "x_fetch_url", "x_calculator"]
}
}'
from openai import OpenAI
client = OpenAI(
base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
api_key="xero_myproject_your_api_key",
)
# The model may chain: web_search -> fetch_url -> calculator
# all server-side in a single request
stream = client.chat.completions.create(
model="my-model",
messages=[
{
"role": "user",
"content": "What is the current price of gold per ounce, "
"and how much would 3.5 troy ounces cost in EUR "
"at today's exchange rate?"
}
],
stream=True,
extra_body={
"web_search_options": {
"search_context_size": "medium",
"max_iterations": 6,
"x_tools": ["x_web_search", "x_fetch_url", "x_calculator"]
}
},
)
for chunk in stream:
raw = chunk.model_dump()
# Check for research progress events
if "type" in raw and raw["type"].startswith("x_research."):
event_type = raw["type"]
if event_type == "x_research.searching":
print(f"[Searching] {raw.get('arguments', '')}")
elif event_type == "x_research.reading":
print(f"[Reading] {raw.get('arguments', '')}")
elif event_type == "x_research.calculating":
print(f"[Calculating] {raw.get('arguments', '')}")
elif event_type == "x_research.complete":
print(f"[Research complete] {raw.get('iterations', 0)} iterations, "
f"{raw.get('elapsed_ms', 0)}ms")
continue
# Normal content chunks
if chunk.choices and chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="")
print()
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});
// The model may chain: x_web_search -> x_fetch_url -> x_calculator,
// all server-side in a single request
const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    {
      role: "user",
      content: "What is the current price of gold per ounce, " +
        "and how much would 3.5 troy ounces cost in EUR " +
        "at today's exchange rate?",
    }
  ],
  stream: true,
  web_search_options: {
    search_context_size: "medium",
    max_iterations: 6,
    x_tools: ["x_web_search", "x_fetch_url", "x_calculator"],
  },
});
for await (const chunk of stream) {
  const raw = chunk;
  // Check for research progress events
  if (raw.type && raw.type.startsWith("x_research.")) {
    if (raw.type === "x_research.searching") {
      console.log(`[Searching] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.reading") {
      console.log(`[Reading] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.calculating") {
      console.log(`[Calculating] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.complete") {
      console.log(`[Research complete] ${raw.iterations || 0} iterations, ${raw.elapsed_ms || 0}ms`);
    }
    continue;
  }
  // Normal content chunks
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();
Deep Think (curl)
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {"role": "user", "content": "Comprehensive analysis of quantum computing advances in 2026"}
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "medium",
      "x_deep_think": true,
      "x_tools": ["x_web_search", "x_fetch_url", "x_code_search", "x_calculator"]
    }
  }'
Deep Think (Python)
import json
from openai import OpenAI
client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)
stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {"role": "user", "content": "Comprehensive analysis of quantum computing advances in 2026"}
    ],
    stream=True,
    extra_body={
        "web_search_options": {
            "search_context_size": "medium",
            "x_deep_think": True,
            "x_tools": ["x_web_search", "x_fetch_url", "x_code_search", "x_calculator"]
        }
    },
)
for chunk in stream:
    raw = chunk.model_dump()
    # Check for deep think progress events
    if "type" in raw and raw["type"].startswith("x_deep_think."):
        print(f"[{raw['type']}]", json.dumps(raw, indent=2))
        continue
    # Normal content chunks
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()
Deep Think (Node.js)
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});
const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    { role: "user", content: "Comprehensive analysis of quantum computing advances in 2026" }
  ],
  stream: true,
  web_search_options: {
    search_context_size: "medium",
    x_deep_think: true,
    x_tools: ["x_web_search", "x_fetch_url", "x_code_search", "x_calculator"],
  },
});
for await (const chunk of stream) {
  const raw = chunk;
  // Check for deep think progress events
  if (raw.type && raw.type.startsWith("x_deep_think.")) {
    console.log(`[${raw.type}]`, JSON.stringify(raw, null, 2));
    continue;
  }
  // Normal content chunks
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();
Deep Think requests use streaming and emit progress events inline. The final report is streamed as normal content chunks after all sub-tasks complete. In the chat interface, the report is also auto-saved as an artifact.
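The ordering described above (progress events inline, report text as normal content chunks) can be handled with a small classifier. The following is a minimal sketch, not a documented schema: it assumes each chunk has already been converted to a plain dict (for example with `chunk.model_dump()`, as in the Python example), that progress events carry a string `type` beginning with `x_deep_think.`, and that report text arrives in `choices[0].delta.content`. The helper name `split_deep_think_stream` is hypothetical.

```python
from typing import Any, Iterable


def split_deep_think_stream(
    chunks: Iterable[dict[str, Any]],
) -> tuple[list[dict[str, Any]], str]:
    """Separate Deep Think progress events from the final report text.

    Hypothetical helper. Assumes progress events have a string "type"
    starting with "x_deep_think." and that report text arrives in
    choices[0].delta.content, mirroring the streaming examples.
    """
    events: list[dict[str, Any]] = []
    report_parts: list[str] = []
    for raw in chunks:
        event_type = raw.get("type")
        if isinstance(event_type, str) and event_type.startswith("x_deep_think."):
            # Progress event: keep the whole payload for logging or UI updates.
            events.append(raw)
            continue
        # Anything else is treated as a normal completion chunk.
        choices = raw.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta") or {}
        content = delta.get("content")
        if content:
            report_parts.append(content)
    return events, "".join(report_parts)
```

Collecting the report into one string is convenient for saving or post-processing; an interactive client would more likely print each content chunk as it arrives, as the examples above do.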
Artifact Creation Flow
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {
        "role": "user",
        "content": "Create a Python script that implements a binary search tree with insert, search, and in-order traversal methods."
      }
    ],
    "stream": true
  }'
from openai import OpenAI
client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)
# In the chat context, x_create_artifact and x_update_artifact
# are always available. The model will use them when appropriate.
stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {
            "role": "user",
            "content": "Create a Python script that implements a "
                       "binary search tree with insert, search, and "
                       "in-order traversal methods."
        }
    ],
    stream=True,
)
for chunk in stream:
    raw = chunk.model_dump()
    # Check for artifact events
    if "type" in raw:
        if raw["type"] == "x_artifact.created":
            print(f"\n[Artifact created] {raw.get('identifier', '')} "
                  f"- {raw.get('title', '')} (v{raw.get('version', 1)})")
        elif raw["type"] == "x_artifact.updated":
            print(f"\n[Artifact updated] {raw.get('identifier', '')} "
                  f"(v{raw.get('version', '')})")
        continue
    # Normal content chunks
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});
// In the chat context, x_create_artifact and x_update_artifact
// are always available. The model will use them when appropriate.
const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    {
      role: "user",
      content: "Create a Python script that implements a " +
        "binary search tree with insert, search, and " +
        "in-order traversal methods.",
    }
  ],
  stream: true,
});
for await (const chunk of stream) {
  const raw = chunk;
  // Check for artifact events
  if (raw.type) {
    if (raw.type === "x_artifact.created") {
      console.log(`\n[Artifact created] ${raw.identifier || ""} ` +
        `- ${raw.title || ""} (v${raw.version || 1})`);
    } else if (raw.type === "x_artifact.updated") {
      console.log(`\n[Artifact updated] ${raw.identifier || ""} ` +
        `(v${raw.version || ""})`);
    }
    continue;
  }
  // Normal content chunks
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();
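A client that wants to know the current state of each artifact, rather than just log events, can fold the `x_artifact.created` and `x_artifact.updated` events into a per-identifier record. The sketch below is illustrative only: it assumes events are plain dicts with the `identifier`, `title`, and `version` fields read in the examples above, and the function name `latest_artifact_versions` is hypothetical.

```python
from typing import Any, Iterable

ARTIFACT_EVENTS = ("x_artifact.created", "x_artifact.updated")


def latest_artifact_versions(
    events: Iterable[dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Fold artifact events into the latest known state per identifier.

    Hypothetical helper. Field names (identifier, title, version) are
    taken from the streaming examples; other payload fields are ignored.
    """
    artifacts: dict[str, dict[str, Any]] = {}
    for raw in events:
        if raw.get("type") not in ARTIFACT_EVENTS:
            continue  # Not an artifact event (e.g. a normal content chunk).
        identifier = raw.get("identifier")
        if not identifier:
            continue  # Cannot track an artifact without its identifier.
        state = artifacts.setdefault(identifier, {"title": "", "version": None})
        # Update events may omit the title, so keep the last one seen.
        if raw.get("title"):
            state["title"] = raw["title"]
        if raw.get("version") is not None:
            state["version"] = raw["version"]
    return artifacts
```

Keying on `identifier` means a later `x_artifact.updated` event overwrites the version recorded at creation while preserving the title, which matches how the examples print the two event types.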