Web Search

Built-in server-side web search for chat completions and responses. The model searches the web, fetches top results, and synthesizes answers -- all within a single API request.

Overview

The web_search server-side tool searches the web using built-in search and returns structured results including titles, URLs, and snippets. The top 2 result URLs are automatically fetched in parallel and appended as fetched_pages, providing immediate full-page context enrichment without a second tool call.

Web search runs server-side inside the agentic loop. When the model decides to search, the router executes the tool, injects the results back into the conversation, and the model continues generating. Clients receive a single streaming response without managing the tool loop themselves.

When to Use Web Search

  • Current events and breaking news that the model may not know
  • Fact checking and real-time data verification
  • Product research, pricing, and comparisons
  • Documentation and API reference lookup
  • Any query that benefits from up-to-date information

Note: Web search is a research tool and requires explicit opt-in. See Enabling Web Search below. Other tools such as create_artifact and ask_user are injected automatically. See Server-Side Tools for the full tool reference.

Enabling Web Search

Web search must be opted into on each request. There are two ways to enable it depending on which API you are using.

Chat Completions API

Add a web_search_options object to your /v1/chat/completions request. Set x_tools to include "web_search":

```json
{
  "model": "my-model",
  "messages": [
    {"role": "user", "content": "What are the latest Rust async patterns?"}
  ],
  "stream": true,
  "web_search_options": {
    "search_context_size": "medium",
    "x_tools": ["web_search", "fetch_url"]
  }
}
```

Responses API

Include a tool object with "type": "web_search_preview" in the tools array:

```json
{
  "model": "my-model",
  "input": "Find the latest research on transformer architectures",
  "stream": true,
  "tools": [
    {
      "type": "web_search_preview",
      "search_context_size": "medium",
      "x_tools": ["web_search", "fetch_url"]
    }
  ]
}
```

web_search_options Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `search_context_size` | optional string | `"medium"` | Controls how much search context is used. One of: `low`, `medium`, `high`. |
| `max_iterations` | optional integer | `5` | Maximum agentic loop iterations. Range 1-10. |
| `x_tools` | optional string[] | `["web_search", "fetch_url"]` | Which research tools to enable. When omitted or empty, defaults to `["web_search", "fetch_url"]`. Including `web_search` auto-includes `fetch_url`. |
| `x_deep_think` | optional boolean | `false` | Enables the multi-phase Deep Think pipeline. See the Deep Think section. |
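Taken together, a fully specified options object might look like the sketch below. The values are purely illustrative, not recommendations:

```python
# Illustrative web_search_options payload exercising every documented field.
# The specific values chosen here are examples only.
web_search_options = {
    "search_context_size": "high",           # one of: low, medium, high
    "max_iterations": 8,                     # must fall in the 1-10 range
    "x_tools": ["web_search", "fetch_url"],  # web_search auto-includes fetch_url
    "x_deep_think": False,                   # opt into the Deep Think pipeline
}
```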

web_search Tool Parameters

When the model decides to invoke web search, it generates a tool call with the following argument:

| Parameter | Type | Description |
|---|---|---|
| `query` | required string | The search query to look up on the web. The model formulates this from the user message context. |

Example Tool Call (generated by model)

```json
{
  "name": "web_search",
  "arguments": {
    "query": "rust async trait stabilization 2026"
  }
}
```

Response Format

The tool returns a JSON object with the following fields. The router injects this as a tool-role message back to the model.

| Field | Type | Description |
|---|---|---|
| `answer` | string | Direct answer from the search engine, if available. May be empty. |
| `abstract` | string | Summary text from the search engine, if available. May be empty. |
| `results` | object[] | Array of search result objects, each with `title`, `url`, and `snippet`. |
| `fetched_pages` | object[] | Array of auto-fetched page content for the top 2 URLs. Each entry has `url` and `content`. |
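A minimal sketch of walking a result object with these fields (the sample values here are invented for illustration):

```python
import json

# A tool result shaped like the fields described above (illustrative values).
tool_result = json.loads("""
{
  "answer": "",
  "abstract": "",
  "results": [
    {"title": "Example", "url": "https://example.com", "snippet": "..."}
  ],
  "fetched_pages": [
    {"url": "https://example.com", "content": "Full page text..."}
  ]
}
""")

# results give breadth via snippets; fetched_pages give depth for the top URLs.
for r in tool_result["results"]:
    print(f"{r['title']} ({r['url']}): {r['snippet']}")
for page in tool_result["fetched_pages"]:
    print(f"fetched {page['url']}: {len(page['content'])} chars")
```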

Example Response

```json
{
  "answer": "",
  "abstract": "",
  "results": [
    {
      "title": "Async Trait Methods Stabilized in Rust 1.85",
      "url": "https://blog.rust-lang.org/2026/02/20/async-traits.html",
      "snippet": "Rust 1.85 stabilizes async fn in traits, enabling..."
    },
    {
      "title": "Understanding async traits in Rust",
      "url": "https://docs.rs/async-trait/latest/guide",
      "snippet": "A comprehensive guide to using async trait methods..."
    }
  ],
  "fetched_pages": [
    {
      "url": "https://blog.rust-lang.org/2026/02/20/async-traits.html",
      "content": "Announcing Rust 1.85. We are happy to announce that async fn in traits..."
    },
    {
      "url": "https://docs.rs/async-trait/latest/guide",
      "content": "Async Trait Guide. This guide covers the fundamentals of async trait..."
    }
  ]
}
```

Auto-Fetch Enrichment

After returning search results, the router automatically fetches the top 2 result URLs in parallel and appends the extracted text as fetched_pages. This gives the model both the search snippets and full page content in a single tool call, reducing the number of loop iterations needed.

Each auto-fetched page shares an equal portion of a 12,000-character budget. Pages exceeding their share are truncated.
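The equal-share truncation described above can be sketched as follows. This illustrates the stated behavior only; it is not the router's actual implementation:

```python
AUTO_FETCH_BUDGET = 12_000  # total characters shared across auto-fetched pages

def truncate_pages(pages: list[dict]) -> list[dict]:
    """Give each auto-fetched page an equal share of the character budget."""
    if not pages:
        return []
    share = AUTO_FETCH_BUDGET // len(pages)
    return [
        {"url": p["url"], "content": p["content"][:share]}
        for p in pages
    ]

# With the default of 2 auto-fetched pages, each page's share is 6,000 chars.
pages = [{"url": "https://a.example", "content": "x" * 10_000},
         {"url": "https://b.example", "content": "y" * 2_000}]
trimmed = truncate_pages(pages)
```

Pages shorter than their share (the second page above) pass through untouched.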

Auto-fetched URLs are also cached. If the model subsequently calls fetch_url for a URL that was already auto-fetched, the cached result is returned instantly without a network request.

SSE Events

When streaming, the router emits inline SSE events to indicate web search progress. All vendor-specific events carry the x_ prefix to keep them clearly separated from fields defined by the OpenAI spec.

| Event Type | Fields | Description |
|---|---|---|
| `x_research.searching` | `name`, `arguments` | Emitted when a web search begins. Contains the tool name and JSON-encoded arguments. |
| `x_research.result` | `name`, `tool_call_id` | Emitted when the web search completes and results are available. |
| `x_research.reading` | `name`, `arguments` | Emitted when auto-fetch or an explicit `fetch_url` call begins fetching a URL. |
| `x_research.complete` | `elapsed_ms`, `input_tokens`, `output_tokens`, `iterations`, `sources` | Emitted when the entire research phase finishes, before content chunks begin. |
| `x_chat.metadata` | `usage` | Emitted at the end of a response stream with aggregated usage metadata including research token counts. |
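A small dispatcher over these event types might look like the sketch below, assuming each `data:` line has already been JSON-decoded into a dict:

```python
def describe_event(event: dict) -> str:
    """Map a parsed SSE payload to a human-readable progress line (sketch)."""
    etype = event.get("type", "")
    if etype == "x_research.searching":
        return f"searching: {event.get('arguments')}"
    if etype == "x_research.reading":
        return f"reading: {event.get('arguments')}"
    if etype == "x_research.result":
        return f"tool finished: {event.get('name')} ({event.get('tool_call_id')})"
    if etype == "x_research.complete":
        return (f"research done in {event.get('elapsed_ms')}ms, "
                f"{event.get('iterations')} iterations")
    if etype == "x_chat.metadata":
        return f"usage: {event.get('usage')}"
    return "content chunk"  # anything without an x_ type is a normal delta
```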

Example SSE Stream

```text
data: {"type":"x_research.searching","name":"web_search","arguments":"{\"query\":\"rust async trait stabilization 2026\"}"}

data: {"type":"x_research.result","name":"web_search","tool_call_id":"call_1"}

data: {"type":"x_research.reading","name":"fetch_url","arguments":"{\"url\":\"https://blog.rust-lang.org/...\"}"}

data: {"type":"x_research.result","name":"fetch_url","tool_call_id":"call_2"}

data: {"type":"x_research.complete","elapsed_ms":3100,"input_tokens":8400,"output_tokens":620,"iterations":2,"sources":3}

data: {"choices":[{"index":0,"delta":{"content":"Based on my research..."},"finish_reason":null}]}

...

data: [DONE]
```

Code Examples

curl -- Chat Completions

```bash
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {"role": "user", "content": "What are the latest Rust async patterns?"}
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "medium",
      "x_tools": ["web_search", "fetch_url"]
    }
  }'
```

Python (OpenAI SDK)

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

stream = client.chat.completions.create(
    model="my-model",
    messages=[
        {"role": "user", "content": "What are the latest Rust async patterns?"}
    ],
    stream=True,
    extra_body={
        "web_search_options": {
            "search_context_size": "medium",
            "x_tools": ["web_search", "fetch_url"],
        }
    },
)

for chunk in stream:
    raw = chunk.model_dump()

    # Handle research progress events
    if "type" in raw and raw["type"].startswith("x_research."):
        if raw["type"] == "x_research.searching":
            print(f"[Searching] {raw.get('arguments', '')}")
        elif raw["type"] == "x_research.complete":
            print(f"[Done] {raw.get('iterations', 0)} iterations, "
                  f"{raw.get('elapsed_ms', 0)}ms")
        continue

    # Normal content chunks
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

print()
```

Node.js (OpenAI SDK)

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

const stream = await client.chat.completions.create({
  model: "my-model",
  messages: [
    { role: "user", content: "What are the latest Rust async patterns?" }
  ],
  stream: true,
  web_search_options: {
    search_context_size: "medium",
    x_tools: ["web_search", "fetch_url"],
  },
});

for await (const chunk of stream) {
  const raw = chunk;

  // Handle research progress events
  if (raw.type && raw.type.startsWith("x_research.")) {
    if (raw.type === "x_research.searching") {
      console.log(`[Searching] ${raw.arguments || ""}`);
    } else if (raw.type === "x_research.complete") {
      console.log(`[Done] ${raw.iterations || 0} iterations, ${raw.elapsed_ms || 0}ms`);
    }
    continue;
  }

  // Normal content chunks
  const content = chunk.choices?.[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();
```

curl -- Responses API

```bash
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/responses \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "input": "Find the latest research on transformer architectures",
    "stream": true,
    "tools": [
      {
        "type": "web_search_preview",
        "search_context_size": "medium",
        "x_tools": ["web_search", "fetch_url"]
      }
    ]
  }'
```

Python -- Responses API

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
    api_key="xero_myproject_your_api_key",
)

response = client.responses.create(
    model="my-model",
    input="Find the latest research on transformer architectures",
    stream=True,
    tools=[
        {
            "type": "web_search_preview",
            "search_context_size": "medium",
            "x_tools": ["web_search", "fetch_url"],
        }
    ],
)

for event in response:
    if hasattr(event, "type"):
        print(event)
```

Node.js -- Responses API

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.xerotier.ai/proj_ABC123/my-endpoint/v1",
  apiKey: "xero_myproject_your_api_key",
});

const stream = await client.responses.create({
  model: "my-model",
  input: "Find the latest research on transformer architectures",
  stream: true,
  tools: [
    {
      type: "web_search_preview",
      search_context_size: "medium",
      x_tools: ["web_search", "fetch_url"],
    }
  ],
});

for await (const event of stream) {
  console.log(event);
}
```

Multi-Tool Chain (web_search + fetch_url + calculator)

This example shows a query that may trigger a chain of tool calls within the same agentic loop: searching for a price, fetching a page for detail, and using the calculator to convert currencies.

```bash
curl -X POST https://api.xerotier.ai/proj_ABC123/my-endpoint/v1/chat/completions \
  -H "Authorization: Bearer xero_myproject_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [
      {
        "role": "user",
        "content": "What is the current price of gold per ounce, and how much would 3.5 troy ounces cost in EUR at today'\''s exchange rate?"
      }
    ],
    "stream": true,
    "web_search_options": {
      "search_context_size": "medium",
      "max_iterations": 6,
      "x_tools": ["web_search", "fetch_url", "calculator"]
    }
  }'
```

Error Handling

Rate Limiting

Tool calls are rate limited to 45 calls per minute per project. When the limit is exceeded, the tool returns an error object instead of search results. The model receives this as a tool result and may include a message about the rate limit in its response.

```json
{
  "error": "Research tool rate limit exceeded. Try again in 12 seconds."
}
```

Result Caching

Identical tool calls (same query) within a 5-minute window return cached results without re-executing the search. This prevents redundant network calls when the model re-invokes the same query.
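This query-keyed, time-bounded caching can be sketched with a simple TTL map. Again, this illustrates the documented behavior rather than the router's actual code; `run_search` stands in for the real search execution:

```python
import time

CACHE_TTL = 300  # seconds, i.e. the documented 5-minute window
_result_cache: dict[str, tuple[float, dict]] = {}

def cached_search(query: str, run_search) -> dict:
    """Return the cached result for an identical query within the TTL window."""
    now = time.monotonic()
    hit = _result_cache.get(query)
    if hit is not None and now - hit[0] < CACHE_TTL:
        return hit[1]  # re-invoked query: no network call
    result = run_search(query)
    _result_cache[query] = (now, result)
    return result
```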

Limits Reference

| Limit | Value | Description |
|---|---|---|
| Rate limit | 45 calls/min | Maximum tool calls per minute per project. |
| Cache TTL | 5 minutes | How long identical tool results are cached. |
| Max iterations | 5 (default) | Maximum agentic loop iterations per request. Configurable up to 10. |
| Auto-fetch count | 2 | Number of top URLs auto-fetched from web search results. |
| Auto-fetch budget | 12,000 chars | Total character budget shared across auto-fetched pages. |
| Tool execution timeout | 15 seconds | Each tool call must complete within 15 seconds. |