xeroctl responses

Manage stored responses and create new ones using the OpenAI Responses API. Supports single-shot, streaming, and interactive multi-turn sessions via server-side chaining. See the xeroctl CLI hub for installation and global options.

Overview

The responses command wraps the OpenAI-compatible /v1/responses endpoint. It provides two subcommands:

  • The default subcommand (no extra keyword) lists, inspects, deletes, cancels, and fetches input items for stored responses.
  • The create subcommand generates a new response in single-shot, streaming, or interactive mode.

An endpoint slug (--endpoint) is required for every operation.

Responses vs. Chat Completions: The Responses API uses previous_response_id for server-side conversation chaining. The Chat Completions API (xeroctl chat) uses client-side message arrays. Both target the same models but use different storage and retrieval paths.

List Responses

Omit a response ID to list all stored responses for the endpoint:

bash
xeroctl responses --endpoint my-endpoint

Filter by Status

bash
xeroctl responses --endpoint my-endpoint --status completed
xeroctl responses --endpoint my-endpoint --status in_progress
xeroctl responses --endpoint my-endpoint --status cancelled

Limit Results

bash
xeroctl responses --endpoint my-endpoint --limit 20

The list output is a table with columns: ID, Model, Status, Input tokens, Output tokens, and Created (relative time). Status values are colour-coded.
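An illustrative listing (IDs, model name, and counts below are hypothetical):

```text
ID            Model     Status       Input  Output  Created
resp_abc123   my-model  completed    12     34      2m ago
resp_def456   my-model  in_progress  8      0       10s ago
```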

Get Response

Provide a response ID to fetch full details:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint

Displays: ID, model, status, created/completed timestamps, input/output token counts, service tier, previous response ID, store flag, and any attached metadata.

JSON Output

bash
xeroctl responses resp_abc123 --endpoint my-endpoint -o json
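Scripts can slice individual fields out of the JSON view with jq. A minimal sketch, assuming the JSON mirrors the OpenAI response object (`id`, `status`, `usage.input_tokens`, `usage.output_tokens`); check your xeroctl version's actual output shape:

```shell
# Sample payload standing in for:
#   json=$(xeroctl responses resp_abc123 --endpoint my-endpoint -o json)
# Field names are assumed to follow the OpenAI response object.
json='{"id":"resp_abc123","status":"completed","usage":{"input_tokens":12,"output_tokens":34}}'
printf '%s' "$json" | jq -r '"\(.id)  \(.status)  tokens: \(.usage.input_tokens)/\(.usage.output_tokens)"'
```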

Delete Response

Delete a stored response. A confirmation prompt is shown by default:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --delete

Skip Confirmation

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force

Dry Run

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --dry-run

Cancel Response

Cancel a response that is currently in progress:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --cancel

Input Items

Retrieve the input items (messages) associated with a stored response:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items

Each input item shows type, ID, role, text (truncated to 200 characters), and content part count. Use -o json to retrieve the full untruncated content:

bash
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items -o json
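To post-process the untruncated text in scripts, jq works well here too. A minimal sketch, assuming the input-items list follows the OpenAI shape (`data[].role`, `data[].content[].text`); xeroctl's exact JSON may differ:

```shell
# Sample payload standing in for:
#   items=$(xeroctl responses resp_abc123 --endpoint my-endpoint --input-items -o json)
# List shape is assumed to follow the OpenAI input-items format.
items='{"object":"list","data":[{"type":"message","role":"user","content":[{"type":"input_text","text":"What is recursion?"}]}]}'
printf '%s' "$items" | jq -r '.data[] | "\(.role): \(.content[].text)"'
```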

Create Response

Use the create subcommand to generate a new response:

bash
xeroctl responses create "What is the capital of France?" --endpoint my-endpoint

With Instructions and Parameters

bash
xeroctl responses create "Explain quantum entanglement" \
  --endpoint my-endpoint \
  --instructions "Be concise. Use plain language." \
  --max-output-tokens 300 \
  --temperature 0.7

Chain From a Previous Response

Use --previous-response-id to continue an existing conversation server-side:

bash
xeroctl responses create "Now explain it in simple terms" \
  --endpoint my-endpoint \
  --previous-response-id resp_abc123

Control Storage

Use --store to explicitly request storage, or --no-store to opt out. When neither flag is set, the endpoint default applies:

bash
xeroctl responses create "Draft a cover letter" --endpoint my-endpoint --store
xeroctl responses create "Hello" --endpoint my-endpoint --no-store

Reasoning Effort

For models that support extended reasoning, set the effort level with --reasoning-effort:

bash
xeroctl responses create "Solve: if x^2 - 5x + 6 = 0, what is x?" \
  --endpoint my-endpoint \
  --reasoning-effort high

Valid values: low, medium, high.

Streaming Mode

Add --stream to the create subcommand to receive tokens as they are generated. A spinner is displayed until the first token arrives. Token throughput statistics are printed to stderr after completion.

bash
xeroctl responses create "Write a haiku about the ocean" \
  --endpoint my-endpoint \
  --stream

Example throughput summary (printed to stderr):

Output
-- 38 tokens in 0.9s (42.2 tok/s)

Interactive Mode

Use --interactive with the create subcommand to start a multi-turn session. Unlike xeroctl chat --interactive, conversation context is maintained server-side via previous_response_id chaining. Each turn streams the response.

bash
xeroctl responses create --interactive --endpoint my-endpoint

With Preset Instructions

bash
xeroctl responses create --interactive \
  --endpoint my-endpoint \
  --instructions "You are a senior database architect. Keep answers brief."

Resume a Previous Session

Pass a known response ID to resume an existing conversation chain:

bash
xeroctl responses create --interactive \
  --endpoint my-endpoint \
  --previous-response-id resp_abc123

The status bar at the prompt shows endpoint slug, optional model name, cumulative token counts, and turn count:

Status bar example
endpoint: my-endpoint | tokens: 212/480 | turns: 4

All Options

Management Options (default subcommand)

Option Type Description
--endpoint <slug> (required) string Endpoint slug for all operations.
[response-id] string Positional response ID. Omit to list all responses.
--delete flag Delete the identified response. Requires a response ID.
--cancel flag Cancel a running response. Requires a response ID.
--input-items flag Retrieve input items for a response. Requires a response ID.
--force flag Skip the confirmation prompt for delete operations.
--limit <n> integer Maximum number of results when listing.
--status <s> string Filter list by status: completed, in_progress, or cancelled.

Create Options (create subcommand)

Option Type Description
--endpoint <slug> (required) string Endpoint slug to send the request to.
[message] string User message. Required unless --interactive is specified.
--stream flag Stream the response token by token.
--interactive flag Start an interactive multi-turn session using server-side chaining.
--instructions <text> string System instructions for the model.
--model <name> string Model name passed through to the API. The endpoint configuration determines the actual model.
--temperature <f> float Sampling temperature between 0.0 and 2.0.
--top-p <f> float Nucleus sampling threshold between 0.0 and 1.0.
--max-output-tokens <n> integer Maximum output tokens to generate.
--previous-response-id <id> string ID of a previous response to chain from. Provides server-side conversation context.
--store flag Explicitly request server-side storage of the response.
--no-store flag Explicitly opt out of server-side storage. Cannot be combined with --store.
--reasoning-effort <level> string Reasoning effort for models that support it: low, medium, or high.

Interactive Commands

The following slash commands are available during an interactive session:

Command Description
/help Show available commands and keyboard shortcuts.
/clear Reset the conversation by clearing the previous_response_id chain.
/instructions <msg> Set or replace system instructions mid-session. Without an argument, displays the current instructions.
/tokens Show cumulative token usage, turn count, and current chain ID.
/quit End the session and print total token usage. Also triggered by Ctrl+D.

Keyboard Shortcuts

Shortcut Action
Up / Down Recall previous inputs from history.
Tab Auto-complete slash commands.
Ctrl+A / Ctrl+E Jump to start or end of line.
Ctrl+U / Ctrl+K Clear text before or after the cursor.
Ctrl+W Delete the previous word.
Ctrl+D End the session (EOF).

Examples

Full Workflow: Create, List, Inspect, Delete

bash
# Create and store a response
xeroctl responses create "What is recursion?" \
  --endpoint my-endpoint --store

# List stored responses to find the ID
xeroctl responses --endpoint my-endpoint

# Inspect the response details
xeroctl responses resp_abc123 --endpoint my-endpoint

# View what was sent to the model
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items

# Delete when done
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force

Multi-Turn Conversation via CLI Flags

bash
# Turn 1
xeroctl responses create "Explain REST APIs briefly" \
  --endpoint my-endpoint --store

# Note the response ID from the output, e.g. resp_turn1

# Turn 2: chain from the first response
xeroctl responses create "Give me a curl example" \
  --endpoint my-endpoint \
  --previous-response-id resp_turn1
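The same chaining can be scripted end to end by capturing each response ID and feeding it into the next turn. A hedged sketch: it assumes `create` accepts `-o json` and that the response JSON carries an `.id` field, neither of which is confirmed above, and the stub function exists only so the loop can be dry-run without a live endpoint:

```shell
# Dry-run stub, illustration only; real runs use the installed xeroctl.
if ! command -v xeroctl >/dev/null 2>&1; then
  xeroctl() { echo '{"id":"resp_stub"}'; }
fi

prev=""
for msg in "Explain REST APIs briefly" "Give me a curl example"; do
  if [ -n "$prev" ]; then
    out=$(xeroctl responses create "$msg" --endpoint my-endpoint \
      --store --previous-response-id "$prev" -o json)  # -o json is assumed
  else
    out=$(xeroctl responses create "$msg" --endpoint my-endpoint --store -o json)
  fi
  prev=$(printf '%s' "$out" | jq -r '.id')  # chain this ID into the next turn
done
```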

Stream a Reasoning Model Response

bash
xeroctl responses create \
  "Prove that there are infinitely many prime numbers" \
  --endpoint my-reasoning-endpoint \
  --stream \
  --reasoning-effort high \
  --max-output-tokens 1024