xeroctl chat
Send a chat completion from the terminal, with streaming on stdout, retries handled by the router, and JSON for pipeline composition. Or seed a new workspace chat from a saved template in one call. Same OpenAI shape your scripts already speak.
Overview
The chat command group covers two flows. The
send subcommand drives the OpenAI-compatible
/v1/chat/completions route against a configured
endpoint, single-shot, streaming, or an interactive
multi-turn session. The new subcommand applies
a chat template to a workspace, seeding a fresh chat with
the template's system prompt, tool allowlist, and approval
policy, and (optionally) posts an initial user message to
that chat.
send talks to /v1/chat/completions
on the configured endpoint. new talks to
/v1/workspaces/:wid/chat-templates/:tid/apply
and (when --message is supplied)
/v1/chats/:cid/messages.
For template authoring and lifecycle, see xeroctl templates.
Usage Pattern
xeroctl chat send --endpoint <slug> --message <text> [flags]
xeroctl chat send --endpoint <slug> --interactive [flags]
xeroctl chat new --workspace <wid> --template <id-or-name> [--message <text>]
Backward Compatibility
Bare xeroctl chat still works.
send is the default subcommand, so
xeroctl chat --endpoint ... --message ...
routes to chat send unchanged. New scripts
should prefer the explicit form.
send
Sends a chat completion request to a Xerotier endpoint.
Supports single-shot, streaming, and interactive multi-turn
modes. An endpoint slug (--endpoint) is always
required; --message is required unless
--interactive is set.
xeroctl chat send --endpoint my-endpoint --message "Hello, how are you?"
With a System Message
xeroctl chat send --endpoint my-endpoint \
--system "You are a helpful coding assistant." \
--message "Write a Python function to reverse a string"
System Prompt From a File
Load the system prompt from a file using --from-file. --from-file takes precedence over --system when both are passed.
xeroctl chat send --endpoint my-endpoint \
--from-file ./system-prompt.txt \
--message "What is the policy on refunds?"
Streaming
Add --stream to receive tokens as they are
generated. A spinner appears while waiting for the first
token; a throughput summary is printed to stderr after the
response completes.
xeroctl chat send --endpoint my-endpoint \
--message "Tell me a story about a robot learning to paint" \
--stream
Example streaming footer:
-- 312 tokens in 4.2s (74.3 tok/s)
Show Token Usage
Append --show-usage to print prompt, completion, total, cached, and reasoning token counts after the response.
xeroctl chat send --endpoint my-endpoint \
--message "Summarize quantum entanglement" \
--show-usage
Store the Completion
Pass --store to request server-side storage so the completion can be retrieved later via xeroctl completions.
xeroctl chat send --endpoint my-endpoint \
--message "Draft a release announcement" \
--store
JSON Output
Use -o json for the raw API response object.
xeroctl chat send --endpoint my-endpoint --message "Hello" -o json
send Options
| Option | Type | Description |
|---|---|---|
--endpoint <slug>required |
string | Endpoint slug to send the request to. |
-m, --message <text> |
string | User message to send. Required unless --interactive is specified. |
--system <text> |
string | System message to set context for the model. |
--from-file <path> |
string | Read the system prompt from a file. Takes precedence over --system when both are specified. |
--model <name> |
string | Model name passed through to the API. Informational only, the endpoint configuration determines the actual model used. |
--max-tokens <n> |
integer | Maximum number of tokens to generate. Defaults to 4096 in interactive mode when not specified. |
--temperature <f> |
float | Sampling temperature between 0.0 and 2.0. Higher values produce more varied output. |
--top-p <f> |
float | Nucleus sampling threshold between 0.0 and 1.0. |
--frequency-penalty <f> |
float | Frequency penalty between -2.0 and 2.0. Reduces repetition of already-used tokens. |
--presence-penalty <f> |
float | Presence penalty between -2.0 and 2.0. Encourages the model to discuss new topics. |
--seed <n> |
integer | Random seed for reproducible output. |
--stream |
flag | Stream the response token by token. A throughput summary is printed after completion. |
--store |
flag | Request server-side storage of the completion for later retrieval. |
--show-usage |
flag | Print detailed token usage (prompt, completion, total, cached, reasoning) after the response. |
--interactive |
flag | Start an interactive multi-turn chat session. Conversation history is maintained client-side. |
--agentic |
flag | Route the request through the router's agentic tool-calling loop by setting web_search_options.x_agentic = true on the request body. Prints a -> agentic mode breadcrumb to stderr unless --quiet is set. |
--enrich |
flag | Opt into router-side enrichment by sending X-Xerotier-Vendor-Events: 1. The router then emits vendor SSE events (x_artifact.*, x_mockup.*, x_research.*) and runs enrichment hooks such as gap-analysis follow-ups. Omitted by default so the wire stays OpenAI spec-faithful. |
--quiet |
flag | Suppress informational breadcrumbs on stderr, including the -> agentic mode line printed by --agentic and the post-stream throughput summary. Errors still print. |
Agentic Tool-Calling Loop
Pass --agentic to route the request through the
router's server-side agentic loop. The CLI sets
web_search_options.x_agentic = true on the chat
completion body; the router then drives tool calls on the
user's behalf and returns the final assistant turn. A
-> agentic mode breadcrumb is written to
stderr (suppressed under --quiet) so scripted
callers can confirm the mode without parsing the response.
Vendor Enrichment Events
Pass --enrich to receive router-side enrichment.
The CLI adds the X-Xerotier-Vendor-Events: 1
header to the request; the router uses that header to gate
vendor SSE events (the x_artifact.*,
x_mockup.*, and x_research.*
families) and to run enrichment hooks such as gap-analysis
follow-ups. Strict OpenAI clients should leave this flag
off so the response stream stays spec-faithful.
send Interactive Mode
Use --interactive to start a multi-turn chat
session. The full conversation history is maintained
client-side and sent with each request. Interactive mode
always streams responses. Press Ctrl+D or type
/quit to end the session.
xeroctl chat send --endpoint my-endpoint --interactive
With a Preset System Message
xeroctl chat send --endpoint my-endpoint \
--interactive \
--system "You are a SQL expert. Keep answers concise."
With a System Prompt File
xeroctl chat send --endpoint my-endpoint \
--interactive \
--from-file ./persona.txt
A status bar at the prompt shows the endpoint slug, optional model name, cumulative input/output token counts, and message count:
endpoint: my-endpoint | model: llama-3 | tokens: 148/532 | msgs: 6
Slash Commands
| Command | Description |
|---|---|
/help |
Show available commands and keyboard shortcuts. |
/clear |
Clear conversation history while keeping the system message. |
/system <msg> |
Set or replace the system message mid-session. |
/save <file> |
Save the current conversation history to a JSON file. |
/load <file> |
Load a previously saved conversation from a JSON file. |
/tokens |
Show cumulative token usage and message count for the session. |
/quit |
End the session and print total token usage. Also triggered by Ctrl+D. |
Keyboard Shortcuts
GNU readline chords. Ctrl is literal on macOS Terminal and iTerm, not Cmd.
| Shortcut | Action |
|---|---|
| Up / Down | Recall previous inputs from history. |
| Tab | Auto-complete slash commands. |
| Ctrl+A / Ctrl+E | Jump to start or end of line. |
| Ctrl+U / Ctrl+K | Clear text before or after the cursor. |
| Ctrl+W | Delete the previous word. |
| Ctrl+D | End the session (EOF). |
new
Creates a new chat in a workspace by applying a chat
template. The new chat is seeded with the template's system
prompt, tool allowlist, and approval policy, and pins the
template's current version. When --message is
supplied, the command additionally posts that text as the
first user message on the new chat.
The --template argument accepts either a
ctpl_ external id or a kebab-case template
name. Names are resolved against the workspace's visible
templates before the apply call. See
xeroctl templates
for how to author and manage these templates.
xeroctl chat new --workspace ws_kube_prod --template drain-kubernetes-node
xeroctl chat new --workspace ws_kube_prod --template drain-kubernetes-node \
--message "Drain node worker-07 for maintenance"
Options
| Option | Required | Description |
|---|---|---|
--workspace <wid> |
Yes | Target workspace external id. |
--template <id-or-name> |
Yes | Template external id (ctpl_...) or kebab-case name. |
-m, --message <text> |
No | Optional initial user message posted to the new chat after apply. |
Output
Default (table) output:
chat created from template
chat_id: chat_01HXXXX
url: /chats/chat_01HXXXX
template_id: ctpl_01HXXXX
template_version: 3
With -o json, the apply response is rendered verbatim.
Combine --dry-run (global flag) with chat new to preview the apply call without hitting the router.
Examples
Quick Endpoint Smoke Test
xeroctl chat send --endpoint my-endpoint --message "What is 2+2?"
Deterministic Output With a Seed
xeroctl chat send --endpoint my-endpoint \
--message "Pick a random number between 1 and 10" \
--seed 42 \
--temperature 0.0
Stream a Long Generation
xeroctl chat send --endpoint my-endpoint \
--message "Write a 500-word essay on renewable energy" \
--max-tokens 700 \
--stream
Batch-Test Multiple Endpoints
#!/bin/bash
for ep in endpoint-prod endpoint-staging endpoint-dev; do
echo "=== $ep ==="
xeroctl chat send --endpoint "$ep" \
--message "Respond with the word OK only." \
--show-usage
done
Use a Persona File for Interactive Sessions
echo "You are a terse senior engineer. Answer only in bullet points." > persona.txt
xeroctl chat send --endpoint my-endpoint --interactive --from-file persona.txt
Apply a Template and Post the First Message
xeroctl chat new --workspace ws_kube_prod \
--template drain-kubernetes-node \
--message "Drain node worker-07 for maintenance"
Legacy Form (Still Supported)
The pre-Plan-C invocation continues to work because send is the default subcommand:
xeroctl chat --endpoint my-endpoint --message "Hello, how are you?"