# xeroctl chat
Send chat completion requests to a Xerotier endpoint. Supports single-shot, streaming, and interactive multi-turn modes. See the xeroctl CLI hub for installation and global options.
## Overview
The `chat` command tests chat completions against any endpoint configured
in your project. It wraps the OpenAI-compatible `/v1/chat/completions` API
and supports all standard sampling parameters.
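Because the API is OpenAI-compatible, the request body the command constructs is a standard chat-completions payload. A sketch of that body (the exact field set xeroctl sends is an assumption), inspected with `jq`:

```shell
# Sketch of the request body sent to /v1/chat/completions. The schema is
# the standard OpenAI-compatible one; the exact fields xeroctl includes
# are an assumption, not verified against its source.
payload='{
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Hello, how are you?"}
  ],
  "max_tokens": 200
}'

# Validate the JSON and pull out the user message:
echo "$payload" | jq -r '.messages[-1].content'
```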
An endpoint slug (`--endpoint`) is always required. A user message
(`--message` or `-m`) is required unless `--interactive` is specified.
**Note:** The `--model` flag is an informational pass-through. The actual model
used is determined by the endpoint configuration on the server.
## Single-Shot Mode

Send a single message and receive a complete response:

```shell
xeroctl chat --endpoint my-endpoint --message "Hello, how are you?"
```
### With a System Message

```shell
xeroctl chat --endpoint my-endpoint \
  --system "You are a helpful coding assistant." \
  --message "Write a Python function to reverse a string"
```
### System Prompt From a File

Load the system prompt from a file using `--from-file`:

```shell
xeroctl chat --endpoint my-endpoint \
  --from-file ./system-prompt.txt \
  --message "What is the policy on refunds?"
```
### With Sampling Parameters

```shell
xeroctl chat --endpoint my-endpoint \
  --message "Generate three creative product names" \
  --max-tokens 200 \
  --temperature 0.9 \
  --top-p 0.95
```
### Show Token Usage

Append `--show-usage` to print token counts and the response ID after the output:

```shell
xeroctl chat --endpoint my-endpoint \
  --message "Summarize quantum entanglement" \
  --show-usage
```
### Store the Completion

Pass `--store` to request server-side storage so the completion can be retrieved later via `xeroctl completions`:

```shell
xeroctl chat --endpoint my-endpoint \
  --message "Draft a release announcement" \
  --store
```
### JSON Output

Use `-o json` to receive the raw API response object:

```shell
xeroctl chat --endpoint my-endpoint --message "Hello" -o json
```
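Since the response follows the OpenAI-compatible schema, the assistant's text can be extracted from the raw object with `jq`; the `choices[0].message.content` path is the standard layout and assumed here:

```shell
# Extract just the assistant text from the raw JSON response.
# Assumes the standard OpenAI-compatible layout: choices[0].message.content.
xeroctl chat --endpoint my-endpoint --message "Hello" -o json \
  | jq -r '.choices[0].message.content'
```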
## Streaming Mode

Add `--stream` to receive tokens as they are generated. A spinner
appears while waiting for the first token, and a throughput summary is printed
after the response completes.

```shell
xeroctl chat --endpoint my-endpoint \
  --message "Tell me a story about a robot learning to paint" \
  --stream
```
Example output footer (printed to stderr, dim):
```
-- 312 tokens in 4.2s (74.3 tok/s)
```
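The tok/s figure is simply tokens divided by wall-clock time; for the footer above:

```shell
# 312 tokens over 4.2 seconds:
awk 'BEGIN { printf "%.1f tok/s\n", 312 / 4.2 }'
# → 74.3 tok/s
```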
Combine `--stream` with `--show-usage` to additionally print prompt/completion token breakdowns:

```shell
xeroctl chat --endpoint my-endpoint \
  --message "Explain transformers" \
  --stream \
  --show-usage
```
## Interactive Mode

Use `--interactive` to start a multi-turn chat session. The full
conversation history is maintained client-side and sent with each request.
All sampling parameters apply. Press Ctrl+D or type
`/quit` to end the session.

```shell
xeroctl chat --endpoint my-endpoint --interactive
```
### With a Preset System Message

```shell
xeroctl chat --endpoint my-endpoint \
  --interactive \
  --system "You are a SQL expert. Keep answers concise."
```
### With a System Prompt File

```shell
xeroctl chat --endpoint my-endpoint \
  --interactive \
  --from-file ./persona.txt
```
Interactive mode always streams responses. A status bar at the prompt shows the endpoint slug, optional model name, cumulative token counts, and message count:
```
endpoint: my-endpoint | model: llama-3 | tokens: 148/532 | msgs: 6
```
The line editor supports history (Up/Down arrows), tab-completion for slash commands, and standard readline shortcuts.
## All Options

| Option | Type | Description |
|---|---|---|
| `--endpoint <slug>` (required) | string | Endpoint slug to send the request to. |
| `-m, --message <text>` | string | User message to send. Required unless `--interactive` is specified. |
| `--system <text>` | string | System message to set context for the model. |
| `--from-file <path>` | string | Read the system prompt from a file. Takes precedence over `--system` when both are specified. |
| `--model <name>` | string | Model name passed through to the API. Informational only; the endpoint configuration determines the actual model used. |
| `--max-tokens <n>` | integer | Maximum number of tokens to generate. Defaults to 4096 in interactive mode when not specified. |
| `--temperature <f>` | float | Sampling temperature between 0.0 and 2.0. Higher values produce more varied output. |
| `--top-p <f>` | float | Nucleus sampling threshold between 0.0 and 1.0. |
| `--frequency-penalty <f>` | float | Frequency penalty between -2.0 and 2.0. Reduces repetition of already-used tokens. |
| `--presence-penalty <f>` | float | Presence penalty between -2.0 and 2.0. Encourages the model to discuss new topics. |
| `--seed <n>` | integer | Random seed for reproducible output. |
| `--stream` | flag | Stream the response token by token. A throughput summary is printed after completion. |
| `--store` | flag | Request server-side storage of the completion for later retrieval. |
| `--show-usage` | flag | Print detailed token usage (prompt, completion, total, cached, reasoning) after the response. |
| `--interactive` | flag | Start an interactive multi-turn chat session. Conversation history is maintained client-side. |
## Interactive Commands
The following slash commands are available during an interactive session:
| Command | Description |
|---|---|
| `/help` | Show available commands and keyboard shortcuts. |
| `/clear` | Clear conversation history while keeping the system message. |
| `/system <msg>` | Set or replace the system message mid-session. |
| `/save <file>` | Save the current conversation history to a JSON file. |
| `/load <file>` | Load a previously saved conversation from a JSON file. |
| `/tokens` | Show cumulative token usage and message count for the session. |
| `/quit` | End the session and print total token usage. Also triggered by Ctrl+D. |
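The on-disk format written by `/save` is not specified in this reference. Assuming it is a plain JSON array of role/content messages (an assumption, not documented behavior), a saved conversation can be inspected with `jq`:

```shell
# A sample conversation in the assumed /save format -- an array of
# role/content messages. This shape is an assumption, not documented.
cat > conversation.json <<'EOF'
[
  {"role": "system", "content": "You are a SQL expert."},
  {"role": "user", "content": "Explain JOIN types."},
  {"role": "assistant", "content": "INNER, LEFT, RIGHT, FULL OUTER."}
]
EOF

# Print the transcript:
jq -r '.[] | "\(.role): \(.content)"' conversation.json

# Count user turns:
jq '[.[] | select(.role == "user")] | length' conversation.json
```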
## Keyboard Shortcuts
| Shortcut | Action |
|---|---|
| Up / Down | Recall previous inputs from history. |
| Tab | Auto-complete slash commands. |
| Ctrl+A / Ctrl+E | Jump to start or end of line. |
| Ctrl+U / Ctrl+K | Clear text before or after the cursor. |
| Ctrl+W | Delete the previous word. |
| Ctrl+D | End the session (EOF). |
## Examples

### Quick Endpoint Smoke Test

```shell
xeroctl chat --endpoint my-endpoint --message "What is 2+2?"
```
### Deterministic Output With a Seed

```shell
xeroctl chat --endpoint my-endpoint \
  --message "Pick a random number between 1 and 10" \
  --seed 42 \
  --temperature 0.0
```
### Stream a Long Generation

```shell
xeroctl chat --endpoint my-endpoint \
  --message "Write a 500-word essay on renewable energy" \
  --max-tokens 700 \
  --stream
```
### Batch-Test Multiple Endpoints

```shell
#!/bin/bash
for ep in endpoint-prod endpoint-staging endpoint-dev; do
  echo "=== $ep ==="
  xeroctl chat --endpoint "$ep" \
    --message "Respond with the word OK only." \
    --show-usage
done
```
### Use a Persona File for Interactive Sessions

```shell
echo "You are a terse senior engineer. Answer only in bullet points." > persona.txt
xeroctl chat --endpoint my-endpoint --interactive --from-file persona.txt
```