xeroctl responses

Overview

Two subcommands target /v1/responses:

The default subcommand (no extra keyword) lists, inspects, deletes, cancels, and fetches input items for stored responses.
The create subcommand generates a new response in single-shot, streaming, or interactive mode.

An endpoint slug (--endpoint) is always required.

Responses vs. Chat Completions: The Responses API uses previous_response_id for server-side conversation chaining. The Chat Completions API (xeroctl chat) uses client-side message arrays. Both target the same models but use different storage and retrieval paths.

List Responses

Omit a response ID to list all stored responses for the endpoint:

bash

xeroctl responses --endpoint my-endpoint

Filter by Status

bash

xeroctl responses --endpoint my-endpoint --status completed
xeroctl responses --endpoint my-endpoint --status in_progress
xeroctl responses --endpoint my-endpoint --status cancelled

Filter coverage: The router-side response.status enumerates six values: queued, in_progress, completed, failed, cancelled, and incomplete. The CLI --status filter currently accepts only completed, in_progress, and cancelled. To inspect a response in any of the other three states, list without --status and locate the row by ID.
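The split between filterable and non-filterable states can be captured in a small wrapper. The following is a minimal sketch, assuming a hypothetical helper named status_filter_args (it is not part of xeroctl): it emits a --status argument only for the three values the CLI accepts and emits nothing for the other three router-side states, so the caller falls back to an unfiltered listing.

```bash
# Hypothetical helper (not part of xeroctl): decide whether a status can be
# passed to --status or whether the caller must list without a filter.
status_filter_args() {
  case "$1" in
    completed|in_progress|cancelled)
      # Accepted by the CLI filter.
      printf '%s %s\n' '--status' "$1"
      ;;
    queued|failed|incomplete)
      # Valid router-side states, but not selectable via --status:
      # emit nothing so the caller lists every response.
      ;;
    *)
      echo "unknown status: $1" >&2
      return 1
      ;;
  esac
}

status_filter_args completed   # prints: --status completed
status_filter_args failed      # prints nothing
```

The output is meant to be spliced into a list call, for example xeroctl responses --endpoint my-endpoint $(status_filter_args "$wanted"), after which rows in an unfilterable state still have to be located by ID.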

Limit Results

bash

xeroctl responses --endpoint my-endpoint --limit 20

The list output is a table with columns: ID, Model, Status, Input tokens, Output tokens, and Created (relative time). Status values are colour-coded.

Get Response

Provide a response ID to fetch full details:

bash

xeroctl responses resp_abc123 --endpoint my-endpoint

Displays: ID, model, status, created/completed timestamps, input/output token counts, service tier, previous response ID, store flag, and any attached metadata.

JSON Output

bash

xeroctl responses resp_abc123 --endpoint my-endpoint -o json

Add -o json to a list, get, or input-items invocation of the default subcommand to emit the raw router payload to stdout. The shape follows the OpenAI Responses API:

list: a { "object": "list", "data": [ <response>, ... ], "first_id", "last_id", "has_more" } envelope.
get: a single response object including id, model, status, created_at, completed_at, previous_response_id, store, usage, metadata, and output[].
input-items: a { "object": "list", "data": [ <item>, ... ] } envelope where each item carries type, id, role, and content[] with untruncated text.

Delete Response

Delete a stored response. A confirmation prompt is shown by default:

bash

xeroctl responses resp_abc123 --endpoint my-endpoint --delete

Skip Confirmation

bash

xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force

Dry Run

bash

xeroctl responses resp_abc123 --endpoint my-endpoint --delete --dry-run

Cancel Response

Cancel a response that is currently in progress:

bash

xeroctl responses resp_abc123 --endpoint my-endpoint --cancel

Dry Run

Pair --cancel with the global --dry-run flag to preview the operation without contacting the router:

bash

xeroctl responses resp_abc123 --endpoint my-endpoint --cancel --dry-run

What happens to an in-flight stream: Cancellation aborts server-side generation and closes the HTTP connection; the Responses API does not guarantee a dedicated terminal SSE event. A reader of the --stream output may observe the stream simply end without a response.completed frame. See the Responses API reference for the full cancel-semantics contract.
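Because no terminal event is guaranteed, a consumer of the raw SSE stream has to treat a missing response.completed frame as a possible cancellation. The sketch below assumes a transcript captured directly from the router in standard SSE framing (event: and data: lines), not the CLI's own stdout, which is already assembled. The function name stream_completed and both sample transcripts are illustrative only.

```bash
# Returns 0 if an SSE transcript on stdin contains a response.completed
# event line, 1 otherwise.
stream_completed() {
  grep -q '^event: response\.completed[[:space:]]*$'
}

# Illustrative transcripts (not real router output).
sample_finished() {
  printf 'event: response.created\ndata: {}\n\n'
  printf 'event: response.completed\ndata: {}\n\n'
}
sample_cancelled() {
  printf 'event: response.created\ndata: {}\n\n'
  printf 'event: response.output_text.delta\ndata: {}\n\n'
}

if sample_finished | stream_completed; then
  echo "finished: saw response.completed"
fi
if ! sample_cancelled | stream_completed; then
  echo "cancelled: stream ended without response.completed"
fi
```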

Input Items

Retrieve the input items (messages) associated with a stored response:

bash

xeroctl responses resp_abc123 --endpoint my-endpoint --input-items

Each input item shows type, ID, role, text (truncated to 200 characters), and content part count. Use -o json to retrieve the full untruncated content:

bash

xeroctl responses resp_abc123 --endpoint my-endpoint --input-items -o json

Create Response

Use the create subcommand to generate a new response:

bash

xeroctl responses create "What is the capital of France?" --endpoint my-endpoint

The first line of stdout is an ID: line carrying the response identifier. Pass that value to --previous-response-id on a follow-up turn:

Output

ID: resp_abc123

With Instructions and Parameters

bash

xeroctl responses create "Explain quantum entanglement" \
  --endpoint my-endpoint \
  --instructions "Be concise. Use plain language." \
  --max-output-tokens 300 \
  --temperature 0.7

Chain From a Previous Response

Use --previous-response-id to continue an existing conversation server-side:

bash

xeroctl responses create "Now explain it in simple terms" \
  --endpoint my-endpoint \
  --previous-response-id resp_abc123

Control Storage

Use --store to explicitly request storage, or --no-store to opt out. When neither flag is set, the endpoint default applies:

bash

xeroctl responses create "Draft a cover letter" --endpoint my-endpoint --store
xeroctl responses create "Hello" --endpoint my-endpoint --no-store

Reasoning Effort

For models that support extended reasoning, set the effort level with --reasoning-effort:

bash

xeroctl responses create "Solve: if x^2 - 5x + 6 = 0, what is x?" \
  --endpoint my-endpoint \
  --reasoning-effort high

Valid values: low, medium, high.

Streaming Mode

Add --stream to the create subcommand to request a server-sent-events response. Token throughput statistics are printed to stderr after completion.

bash

xeroctl responses create "Write a haiku about the ocean" \
  --endpoint my-endpoint \
  --stream

Current behaviour (incremental rendering disabled): The CLI consumes the SSE stream but, in the current release, does not print intermediate text deltas. The assembled response is rendered once the terminal response.completed frame arrives, then the throughput summary is written to stderr. From the user's perspective, --stream currently behaves the same as the non-streaming single-shot path for stdout output; only the wire transport differs. The throughput summary remains accurate.

Example throughput summary (printed to stderr):

Output

-- 38 tokens in 0.9s (42.2 tok/s)
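The rate in the summary is the token count divided by the elapsed seconds (38 / 0.9 ≈ 42.2). The sketch below recomputes it from a captured summary line; it assumes the line always follows the form shown above, and the function name recompute_rate is hypothetical.

```bash
# Parse a summary line of the form "-- <N> tokens in <S>s (<R> tok/s)"
# and recompute the rate from N and S.
recompute_rate() {
  LC_ALL=C awk '{
    tokens = $2              # second field: token count
    secs = $5                # fifth field: duration with a trailing "s"
    sub(/s$/, "", secs)      # strip the unit suffix
    if (secs + 0 > 0) printf "%.1f\n", tokens / secs
  }'
}

echo "-- 38 tokens in 0.9s (42.2 tok/s)" | recompute_rate   # prints 42.2
```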

Interactive Mode

Use --interactive with the create subcommand to start a multi-turn session. Unlike xeroctl chat --interactive, conversation context is maintained server-side via previous_response_id chaining. Each turn issues a streaming request implicitly; the --stream flag is redundant when --interactive is supplied and is neither required nor rejected.

bash

xeroctl responses create --interactive --endpoint my-endpoint

With Preset Instructions

bash

xeroctl responses create --interactive \
  --endpoint my-endpoint \
  --instructions "You are a senior database architect. Keep answers brief."

Resume a Previous Session

Pass a known response ID to resume an existing conversation chain:

bash

xeroctl responses create --interactive \
  --endpoint my-endpoint \
  --previous-response-id resp_abc123

The status bar at the prompt shows the endpoint slug, optional model name, cumulative token counts formatted as tokens: <input>/<output>, and the turn count:

Status bar example

endpoint: my-endpoint | tokens: 212/480 | turns: 4

In the example above, 212 is the cumulative input-token count and 480 is the cumulative output-token count across every turn of the session.
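To make the tokens: <input>/<output> format concrete, the sketch below splits a status-bar line that has been copied out of the terminal and adds the two counts. It assumes the layout shown in the example; the function name session_tokens is hypothetical.

```bash
# Extract "<input>/<output>" from a status-bar line and print both counts
# plus their sum.
session_tokens() {
  sed -n 's/.*tokens: \([0-9][0-9]*\)\/\([0-9][0-9]*\).*/\1 \2/p' |
    awk '{ printf "input=%d output=%d total=%d\n", $1, $2, $1 + $2 }'
}

echo "endpoint: my-endpoint | tokens: 212/480 | turns: 4" | session_tokens
# prints: input=212 output=480 total=692
```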

Output Streams

The responses command splits its output across stdout and stderr so that machine-readable content can be piped while progress indicators stay on the terminal:

stdout: the assembled response text, key/value summaries (including the ID: line from create), and any -o json payload.
stderr: the spinner, the post-completion throughput summary (for example, -- 38 tokens in 0.9s (42.2 tok/s)), and any warning lines.

To suppress the spinner and throughput summary, pass the global --quiet flag. The flag does not affect stdout content. For ad-hoc invocations, redirecting stderr with 2>/dev/null has a similar effect, although it also discards any warning lines. See the xeroctl CLI hub for the full list of global flags.
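The stdout/stderr split can be tried without a router. In the sketch below, fake_create is a hypothetical stand-in that mimics only the split described in this section (an ID: line and response text on stdout, the throughput summary on stderr); the response text is made up.

```bash
# Stand-in for `xeroctl responses create` (hypothetical; it reproduces only
# the stream split so the sketch runs without a router).
fake_create() {
  echo "ID: resp_abc123"                          # stdout
  echo "Paris."                                   # stdout
  echo "-- 38 tokens in 0.9s (42.2 tok/s)" >&2    # stderr
}

# Capture stdout only; progress output on stderr is discarded.
out=$(fake_create 2>/dev/null)
resp_id=$(printf '%s\n' "$out" | awk '/^ID:/ {print $2; exit}')
echo "$resp_id"   # prints resp_abc123
```

Since command substitution captures stdout only, the summary line never reaches $out even without the redirect; 2>/dev/null just keeps it off the terminal.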

All Options

Management Options (default subcommand)

Option	Type	Description
`--endpoint <slug>` (required)	string	Endpoint slug for all operations.
`[response-id]`	string	Positional response ID. Omit to list all responses.
`--delete`	flag	Delete the identified response. Requires a response ID.
`--cancel`	flag	Cancel a running response. Requires a response ID.
`--input-items`	flag	Retrieve input items for a response. Requires a response ID.
`--force`	flag	Skip the confirmation prompt for delete operations.
`--limit <n>`	integer	Maximum number of results when listing.
`--status <s>`	string	Filter list by status. Accepted values: `completed`, `in_progress`, `cancelled`. The router-side enum also includes `queued`, `failed`, and `incomplete`, but these cannot currently be selected via this flag.
`--dry-run`	flag	Global flag inherited from xeroctl. Applied to this command, it gates `--delete` and `--cancel`: the CLI prints the operation it would perform and exits without contacting the router.

--delete, --cancel, and --input-items are mutually exclusive; at most one may be supplied to a single invocation of the default subcommand.
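A wrapper script can enforce the same rule before invoking the CLI. The following is a minimal sketch of such a pre-flight check; check_action_flags is a hypothetical name, and how xeroctl itself reports a conflict is not covered here.

```bash
# Hypothetical pre-flight check (not part of xeroctl): succeed only when at
# most one of --delete, --cancel, --input-items appears in the arguments.
check_action_flags() {
  count=0
  for arg in "$@"; do
    case "$arg" in
      --delete|--cancel|--input-items) count=$((count + 1)) ;;
    esac
  done
  [ "$count" -le 1 ]
}

check_action_flags resp_abc123 --endpoint my-endpoint --delete --force && echo "ok"
check_action_flags resp_abc123 --endpoint my-endpoint --delete --cancel || echo "conflict"
```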

Create Options (`create` subcommand)

Option	Type	Description
`--endpoint <slug>` (required)	string	Endpoint slug to send the request to.
`[message]`	string	User message. Required unless `--interactive` is specified.
`--stream`	flag	Request a server-sent-events response. In the current release the text is printed once the stream completes, not token by token.
`--interactive`	flag	Start an interactive multi-turn session using server-side chaining.
`--instructions <text>`	string	System instructions for the model.
`--model <name>`	string	Model name passed through to the API. The endpoint configuration determines the actual model.
`--temperature <f>`	float	Sampling temperature between 0.0 and 2.0.
`--top-p <f>`	float	Nucleus sampling threshold between 0.0 and 1.0.
`--max-output-tokens <n>`	integer	Maximum output tokens to generate.
`--previous-response-id <id>`	string	ID of a previous response to chain from. Provides server-side conversation context.
`--store`	flag	Explicitly request server-side storage of the response.
`--no-store`	flag	Explicitly opt out of server-side storage. Cannot be combined with `--store`.
`--reasoning-effort <level>`	string	Reasoning effort for models that support it: `low`, `medium`, or `high`. The CLI rejects any other value. Models that do not implement reasoning silently ignore the field on the wire, matching OpenAI spec behaviour.

Interactive Commands

The following slash commands are available during an interactive session:

Command	Description
`/help`	Show available commands and keyboard shortcuts.
`/clear`	Reset the conversation by clearing the `previous_response_id` chain.
`/instructions <msg>`	Set or replace system instructions mid-session. Without an argument, displays the current instructions.
`/tokens`	Show cumulative token usage, turn count, and current chain ID.
`/quit`	End the session and print total token usage. Also triggered by `Control`+`D`.

Keyboard Shortcuts

Shortcut	Action
`Up` / `Down`	Recall previous inputs from history.
`Tab`	Auto-complete slash commands.
`Control`+`A` / `Control`+`E`	Jump to start or end of line.
`Control`+`U` / `Control`+`K`	Clear text before or after the cursor.
`Control`+`W`	Delete the previous word.
`Control`+`D`	End the session (EOF).

Examples

Full Workflow: Create, List, Inspect, Delete

bash

# Create and store a response
xeroctl responses create "What is recursion?" \
  --endpoint my-endpoint --store

# List stored responses to find the ID
xeroctl responses --endpoint my-endpoint

# Inspect the response details
xeroctl responses resp_abc123 --endpoint my-endpoint

# View what was sent to the model
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items

# Delete when done
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force

Multi-Turn Conversation via CLI Flags

bash

# Turn 1: capture the ID printed on the first line of stdout
RESP_ID=$(xeroctl responses create "Explain REST APIs briefly" \
  --endpoint my-endpoint --store | awk '/^ID:/ {print $2; exit}')

# Turn 2: chain from the first response
xeroctl responses create "Give me a curl example" \
  --endpoint my-endpoint \
  --previous-response-id "$RESP_ID"

Stream a Reasoning Model Response

bash

xeroctl responses create \
  "Prove that there are infinitely many prime numbers" \
  --endpoint my-reasoning-endpoint \
  --stream \
  --reasoning-effort high \
  --max-output-tokens 1024
