xeroctl responses

Drive the OpenAI Responses API from a shell. Create, list, inspect, cancel, and delete stored responses; chain follow-ups with --previous-response-id; stream or stay interactive across turns without holding your own message history.

Overview

Two subcommands target /v1/responses:

  • The default subcommand (no extra keyword) lists, inspects, deletes, cancels, and fetches input items for stored responses.
  • The create subcommand generates a new response in single-shot, streaming, or interactive mode.

An endpoint slug (--endpoint) is always required.

Responses vs. Chat Completions: The Responses API uses previous_response_id for server-side conversation chaining. The Chat Completions API (xeroctl chat) uses client-side message arrays. Both target the same models but use different storage and retrieval paths.

List Responses

Omit a response ID to list all stored responses for the endpoint:

bash
xeroctl responses --endpoint my-endpoint

Filter by Status

bash
xeroctl responses --endpoint my-endpoint --status completed
xeroctl responses --endpoint my-endpoint --status in_progress
xeroctl responses --endpoint my-endpoint --status cancelled

Filter coverage: The router-side response.status enumerates six values: queued, in_progress, completed, failed, cancelled, and incomplete. The CLI --status filter currently accepts only completed, in_progress, and cancelled. To inspect a response in any of the other three states, list without --status and locate the row by ID.
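A wrapper script can encode the accepted filter values so that the three unsupported states fall back to an unfiltered listing. The following is a minimal sketch under that assumption; list_by_status is a hypothetical helper name, not an xeroctl command.

```shell
# list_by_status <endpoint> <status>
# Hypothetical helper: uses --status when the CLI accepts the value,
# otherwise lists everything so the row can be located by ID.
list_by_status() {
  endpoint=$1
  status=$2
  case $status in
    completed|in_progress|cancelled)
      xeroctl responses --endpoint "$endpoint" --status "$status"
      ;;
    queued|failed|incomplete)
      echo "note: --status does not accept '$status'; listing all responses" >&2
      xeroctl responses --endpoint "$endpoint"
      ;;
    *)
      echo "error: '$status' is not a response status" >&2
      return 2
      ;;
  esac
}
```

Calling list_by_status my-endpoint failed would then print the full table plus a note on stderr, rather than passing a value the filter does not accept.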

Limit Results

bash
xeroctl responses --endpoint my-endpoint --limit 20

The list output is a table with columns: ID, Model, Status, Input tokens, Output tokens, and Created (relative time). Status values are colour-coded.

Get Response

Provide a response ID to fetch full details:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint

Displays: ID, model, status, created/completed timestamps, input/output token counts, service tier, previous response ID, store flag, and any attached metadata.

JSON Output

bash
xeroctl responses resp_abc123 --endpoint my-endpoint -o json

Add -o json to a list, get, or input-items invocation of the default subcommand to emit the raw router payload to stdout. The shape follows the OpenAI Responses API:

  • list: a { "object": "list", "data": [ <response>, ... ], "first_id", "last_id", "has_more" } envelope.
  • get: a single response object including id, model, status, created_at, completed_at, previous_response_id, store, usage, metadata, and output[].
  • input-items: a { "object": "list", "data": [ <item>, ... ] } envelope where each item carries type, id, role, and content[] with untruncated text.

Delete Response

Delete a stored response. A confirmation prompt is shown by default:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --delete

Skip Confirmation

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force

Dry Run

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --dry-run
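When several stored responses need removing, the dry-run and force variants above can be combined into a preview-first loop. This is a sketch only; delete_ids and its --apply argument are hypothetical names introduced here, and the IDs are assumed to arrive one per line on stdin.

```shell
# delete_ids <endpoint> [--apply]
# Hypothetical helper: reads response IDs from stdin, one per line.
# Without --apply every deletion is previewed with --dry-run; with
# --apply the deletions run with --force so no prompt interrupts the loop.
delete_ids() {
  endpoint=$1
  mode=${2:---dry-run}
  while IFS= read -r id; do
    [ -n "$id" ] || continue   # skip blank lines
    if [ "$mode" = "--apply" ]; then
      # </dev/null keeps the CLI from consuming the ID list on stdin
      xeroctl responses "$id" --endpoint "$endpoint" --delete --force </dev/null
    else
      xeroctl responses "$id" --endpoint "$endpoint" --delete --dry-run </dev/null
    fi
  done
}
```

A typical run would be delete_ids my-endpoint < ids.txt to review the plan, followed by delete_ids my-endpoint --apply < ids.txt once the preview looks right.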

Cancel Response

Cancel a response that is currently in progress:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --cancel

Dry Run

Pair --cancel with the global --dry-run flag to preview the operation without contacting the router:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --cancel --dry-run

What happens to an in-flight stream: Cancellation aborts server-side generation and closes the HTTP connection; the Responses API does not guarantee a dedicated terminal SSE event. A reader of the --stream output may observe the stream simply end without a response.completed frame. See the Responses API reference for the full cancel-semantics contract.

Input Items

Retrieve the input items (messages) associated with a stored response:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items

Each input item shows type, ID, role, text (truncated to 200 characters), and content part count. Use -o json to retrieve the full untruncated content:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items -o json

Create Response

Use the create subcommand to generate a new response:

bash
xeroctl responses create "What is the capital of France?" --endpoint my-endpoint

The first line of stdout is the response identifier, which can be passed back as --previous-response-id on a follow-up turn:

Output
ID: resp_abc123

With Instructions and Parameters

bash
xeroctl responses create "Explain quantum entanglement" \
  --endpoint my-endpoint \
  --instructions "Be concise. Use plain language." \
  --max-output-tokens 300 \
  --temperature 0.7

Chain From a Previous Response

Use --previous-response-id to continue an existing conversation server-side:

bash
xeroctl responses create "Now explain it in simple terms" \
  --endpoint my-endpoint \
  --previous-response-id resp_abc123

Control Storage

Use --store to explicitly request storage, or --no-store to opt out. When neither flag is set, the endpoint default applies:

bash
xeroctl responses create "Draft a cover letter" --endpoint my-endpoint --store
xeroctl responses create "Hello" --endpoint my-endpoint --no-store

Reasoning Effort

For models that support extended reasoning, set the effort level with --reasoning-effort:

bash
xeroctl responses create "Solve: if x^2 - 5x + 6 = 0, what is x?" \
  --endpoint my-endpoint \
  --reasoning-effort high

Valid values: low, medium, high.

Streaming Mode

Add --stream to the create subcommand to request a server-sent-events response. Token throughput statistics are printed to stderr after completion.

bash
xeroctl responses create "Write a haiku about the ocean" \
  --endpoint my-endpoint \
  --stream

Current behaviour (incremental rendering disabled): The CLI consumes the SSE stream but, in the current release, does not print intermediate text deltas. The assembled response is rendered once the terminal response.completed frame arrives, then the throughput summary is written to stderr. From the user's perspective, --stream currently behaves the same as the non-streaming single-shot path for stdout output; only the wire transport differs. The throughput summary remains accurate.

Example throughput summary (printed to stderr):

Output
-- 38 tokens in 0.9s (42.2 tok/s)
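For benchmarking scripts, the summary line can be turned into plain numbers. The sketch below assumes the line always follows the format of the sample above; parse_throughput is a hypothetical helper, and any line that does not match is silently dropped.

```shell
# parse_throughput: reads lines on stdin and, for each line in the
# documented summary format, prints "<tokens> <seconds> <tok/s>".
# Lines that do not match (spinner output, warnings) are ignored.
parse_throughput() {
  sed -n 's|^-- \([0-9][0-9]*\) tokens in \([0-9.][0-9.]*\)s (\([0-9.][0-9.]*\) tok/s)$|\1 \2 \3|p'
}

# Demo on the sample line shown above.
printf '%s\n' '-- 38 tokens in 0.9s (42.2 tok/s)' | parse_throughput
# prints: 38 0.9 42.2
```

Against the real CLI, the helper would read stderr rather than stdout, for example by appending 2>&1 >/dev/null | parse_throughput to a create --stream invocation.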

Interactive Mode

Use --interactive with the create subcommand to start a multi-turn session. Unlike xeroctl chat --interactive, conversation context is maintained server-side via previous_response_id chaining. Each turn issues a streaming request implicitly; the --stream flag is redundant when --interactive is supplied and is neither required nor rejected.

bash
xeroctl responses create --interactive --endpoint my-endpoint

With Preset Instructions

bash
xeroctl responses create --interactive \
  --endpoint my-endpoint \
  --instructions "You are a senior database architect. Keep answers brief."

Resume a Previous Session

Pass a known response ID to resume an existing conversation chain:

bash
xeroctl responses create --interactive \
  --endpoint my-endpoint \
  --previous-response-id resp_abc123

The status bar at the prompt shows the endpoint slug, optional model name, cumulative token counts formatted as tokens: <input>/<output>, and the turn count:

Status bar example
endpoint: my-endpoint | tokens: 212/480 | turns: 4

In the example above, 212 is the cumulative input-token count and 480 is the cumulative output-token count across every turn of the session.

Output Streams

The responses command splits its output across stdout and stderr so that machine-readable content can be piped while progress indicators stay on the terminal:

  • stdout: the assembled response text, key/value summaries (including the ID: line from create), and any -o json payload.
  • stderr: the spinner, the post-completion throughput summary (for example, -- 38 tokens in 0.9s (42.2 tok/s)), and any warning lines.

To suppress the spinner and throughput summary, pass the global --quiet flag. The flag does not affect stdout content. For ad-hoc invocations, redirecting stderr with 2>/dev/null achieves the same effect. See the xeroctl CLI hub for the full list of global flags.
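Because the two streams are separate, a wrapper can keep the stderr diagnostics in a log while the answer continues down a pipeline. The following is a minimal sketch; ask_logged and the STATS_LOG variable are illustrative names, not part of xeroctl.

```shell
# ask_logged <endpoint> <message>
# Hypothetical helper: stdout (ID: line and response text) passes through
# untouched, while everything the CLI writes to stderr is appended to the
# file named by STATS_LOG (default: xeroctl-stats.log).
ask_logged() {
  xeroctl responses create "$2" --endpoint "$1" --stream \
    2>>"${STATS_LOG:-xeroctl-stats.log}"
}
```

Since the spinner also writes to stderr, the log may contain its output alongside the throughput lines; filtering for lines that begin with "-- " is one way to isolate the summaries.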

All Options

Management Options (default subcommand)

Option Type Description
--endpoint <slug> (required) string Endpoint slug for all operations.
[response-id] string Positional response ID. Omit to list all responses.
--delete flag Delete the identified response. Requires a response ID.
--cancel flag Cancel a running response. Requires a response ID.
--input-items flag Retrieve input items for a response. Requires a response ID.
--force flag Skip the confirmation prompt for delete operations.
--limit <n> integer Maximum number of results when listing.
--status <s> string Filter list by status. Accepted values: completed, in_progress, cancelled. The router-side enum also includes queued, failed, and incomplete, but these cannot currently be selected via this flag.
--dry-run flag Global flag inherited from xeroctl. Applied to this command, it gates --delete and --cancel: the CLI prints the operation it would perform and exits without contacting the router.

--delete, --cancel, and --input-items are mutually exclusive; at most one may be supplied to a single invocation of the default subcommand.
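A script that assembles arguments dynamically could check this rule before invoking the CLI. The sketch below is illustrative only: check_exclusive is a hypothetical helper, and it says nothing about how xeroctl itself reports conflicting flags.

```shell
# check_exclusive <args...>
# Hypothetical pre-flight check: succeeds when at most one of
# --delete, --cancel, --input-items appears in the argument list.
check_exclusive() {
  count=0
  for arg in "$@"; do
    case $arg in
      --delete|--cancel|--input-items) count=$((count + 1)) ;;
    esac
  done
  [ "$count" -le 1 ]
}
```

A caller might guard the real command with check_exclusive "$@" && xeroctl responses "$@".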

Create Options (create subcommand)

Option Type Description
--endpoint <slug> (required) string Endpoint slug to send the request to.
[message] string User message. Required unless --interactive is specified.
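--stream flag Request a server-sent-events response. In the current release the assembled text is printed once generation completes rather than token by token.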
--interactive flag Start an interactive multi-turn session using server-side chaining.
--instructions <text> string System instructions for the model.
--model <name> string Model name passed through to the API. The endpoint configuration determines the actual model.
--temperature <f> float Sampling temperature between 0.0 and 2.0.
--top-p <f> float Nucleus sampling threshold between 0.0 and 1.0.
--max-output-tokens <n> integer Maximum output tokens to generate.
--previous-response-id <id> string ID of a previous response to chain from. Provides server-side conversation context.
--store flag Explicitly request server-side storage of the response.
--no-store flag Explicitly opt out of server-side storage. Cannot be combined with --store.
--reasoning-effort <level> string Reasoning effort for models that support it: low, medium, or high. The CLI rejects any other value. Models that do not implement reasoning silently ignore the field on the wire, matching OpenAI spec behaviour.
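Scripts that accept these values from users may want to validate them before spending a request. The helpers below are a sketch, not CLI behaviour: in_range and valid_effort are hypothetical names, and treating the documented ranges as inclusive at both ends is an assumption.

```shell
# in_range <value> <min> <max>
# Succeeds when <value> is a plain non-negative decimal (e.g. 0.7 or 1)
# lying within [min, max]. Inclusive bounds are an assumption.
in_range() {
  awk -v v="$1" -v lo="$2" -v hi="$3" 'BEGIN {
    ok = (v ~ /^[0-9]+(\.[0-9]+)?$/) && (v + 0 >= lo + 0) && (v + 0 <= hi + 0)
    exit ok ? 0 : 1
  }'
}

# valid_effort <level>
# Succeeds only for the three documented --reasoning-effort values.
valid_effort() {
  case $1 in
    low|medium|high) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, in_range "$TEMP" 0.0 2.0 would guard --temperature and in_range "$TOP_P" 0.0 1.0 would guard --top-p, where TEMP and TOP_P are whatever variables the calling script uses.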

Interactive Commands

The following slash commands are available during an interactive session:

Command Description
/help Show available commands and keyboard shortcuts.
/clear Reset the conversation by clearing the previous_response_id chain.
/instructions <msg> Set or replace system instructions mid-session. Without an argument, displays the current instructions.
/tokens Show cumulative token usage, turn count, and current chain ID.
/quit End the session and print total token usage. Also triggered by Control+D.

Keyboard Shortcuts

Shortcut Action
Up / Down Recall previous inputs from history.
Tab Auto-complete slash commands.
Control+A / Control+E Jump to start or end of line.
Control+U / Control+K Clear text before or after the cursor.
Control+W Delete the previous word.
Control+D End the session (EOF).

Examples

Full Workflow: Create, List, Inspect, Delete

bash
# Create and store a response
xeroctl responses create "What is recursion?" \
  --endpoint my-endpoint --store
# List stored responses to find the ID
xeroctl responses --endpoint my-endpoint
# Inspect the response details
xeroctl responses resp_abc123 --endpoint my-endpoint
# View what was sent to the model
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items
# Delete when done
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force

Multi-Turn Conversation via CLI Flags

bash
# Turn 1: capture the ID printed on the first line of stdout
RESP_ID=$(xeroctl responses create "Explain REST APIs briefly" \
  --endpoint my-endpoint --store | awk '/^ID:/ {print $2; exit}')
# Turn 2: chain from the first response
xeroctl responses create "Give me a curl example" \
  --endpoint my-endpoint \
  --previous-response-id "$RESP_ID"
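The same two-turn pattern generalises to any number of prompts. The function below is a sketch under the assumption that every create call prints an ID: line as described earlier; chain_prompts is a hypothetical helper name.

```shell
# chain_prompts <endpoint> <prompt>...
# Hypothetical helper: sends each prompt in order, chaining every turn
# after the first to the ID reported by the previous turn.
chain_prompts() {
  endpoint=$1
  shift
  prev=""
  for prompt in "$@"; do
    if [ -n "$prev" ]; then
      out=$(xeroctl responses create "$prompt" --endpoint "$endpoint" \
        --store --previous-response-id "$prev") || return 1
    else
      out=$(xeroctl responses create "$prompt" --endpoint "$endpoint" \
        --store) || return 1
    fi
    printf '%s\n' "$out"
    # Pick up the ID for the next turn from the ID: line of this turn.
    prev=$(printf '%s\n' "$out" | awk '/^ID:/ {print $2; exit}')
    if [ -z "$prev" ]; then
      echo "error: no ID: line found; cannot continue the chain" >&2
      return 1
    fi
  done
}
```

An invocation such as chain_prompts my-endpoint "Explain REST APIs briefly" "Give me a curl example" would reproduce the two turns above. Stopping when no ID is found avoids silently starting a fresh conversation mid-chain.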

Stream a Reasoning Model Response

bash
xeroctl responses create \
  "Prove that there are infinitely many prime numbers" \
  --endpoint my-reasoning-endpoint \
  --stream \
  --reasoning-effort high \
  --max-output-tokens 1024