# xeroctl responses

Manage stored responses and create new ones using the OpenAI Responses API. Supports single-shot, streaming, and interactive multi-turn sessions via server-side chaining. See the xeroctl CLI hub for installation and global options.
## Overview

The `responses` command wraps the OpenAI-compatible `/v1/responses` endpoint. It provides two subcommands:

- The default subcommand (no extra keyword) lists, inspects, deletes, cancels, and fetches input items for stored responses.
- The `create` subcommand generates a new response in single-shot, streaming, or interactive mode.

An endpoint slug (`--endpoint`) is required for every operation.

**Responses vs. Chat Completions:** The Responses API uses `previous_response_id` for server-side conversation chaining; the Chat Completions API (`xeroctl chat`) uses client-side message arrays. Both target the same models but use different storage and retrieval paths.
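The difference in chaining styles is easiest to see in the request bodies. A minimal sketch, assuming standard OpenAI-shaped payloads (field names come from the public API, not from xeroctl itself; the model name and placeholder content are illustrative):

```python
# Responses API: the server holds the history, so each turn only needs
# the new input plus a pointer to the previous response.
responses_turn_2 = {
    "model": "gpt-4o",
    "input": "Now explain it in simple terms",
    "previous_response_id": "resp_abc123",  # server-side chaining
}

# Chat Completions API: the client resends the full message array
# on every turn.
chat_turn_2 = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "Explain quantum entanglement"},
        {"role": "assistant", "content": "…previous answer…"},
        {"role": "user", "content": "Now explain it in simple terms"},
    ],
}
```

xeroctl builds these payloads for you; the sketch only shows where the conversation state lives in each case.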
## List Responses

Omit a response ID to list all stored responses for the endpoint:

```shell
xeroctl responses --endpoint my-endpoint
```
### Filter by Status

```shell
xeroctl responses --endpoint my-endpoint --status completed
xeroctl responses --endpoint my-endpoint --status in_progress
xeroctl responses --endpoint my-endpoint --status cancelled
```
### Limit Results

```shell
xeroctl responses --endpoint my-endpoint --limit 20
```

The list output is a table with columns: ID, Model, Status, Input tokens, Output tokens, and Created (relative time). Status values are colour-coded.
## Get Response

Provide a response ID to fetch full details:

```shell
xeroctl responses resp_abc123 --endpoint my-endpoint
```

Displays: ID, model, status, created/completed timestamps, input/output token counts, service tier, previous response ID, store flag, and any attached metadata.

### JSON Output

```shell
xeroctl responses resp_abc123 --endpoint my-endpoint -o json
```
## Delete Response

Delete a stored response. A confirmation prompt is shown by default:

```shell
xeroctl responses resp_abc123 --endpoint my-endpoint --delete
```

### Skip Confirmation

```shell
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force
```

### Dry Run

```shell
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --dry-run
```
## Cancel Response

Cancel a response that is currently in progress:

```shell
xeroctl responses resp_abc123 --endpoint my-endpoint --cancel
```
## Input Items

Retrieve the input items (messages) associated with a stored response:

```shell
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items
```

Each input item shows type, ID, role, text (truncated to 200 characters), and content part count. Use `-o json` to retrieve the full untruncated content:

```shell
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items -o json
```
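If you only need the message text from the JSON output, a small filter is enough. A sketch, assuming the `-o json` output mirrors the OpenAI input-item list shape (a `data` array of items whose `content` parts carry a `text` field — the sample below is hypothetical, not captured output):

```python
import json

# Hypothetical sample in the OpenAI input-items list shape; the real
# payload would come from `--input-items -o json`.
raw = """{
  "data": [
    {"id": "msg_1", "type": "message", "role": "user",
     "content": [{"type": "input_text", "text": "What is recursion?"}]}
  ]
}"""

items = json.loads(raw)["data"]
# Collect the full, untruncated text of every content part.
texts = [
    part["text"]
    for item in items
    for part in item.get("content", [])
    if "text" in part
]
print(texts)
```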
## Create Response

Use the `create` subcommand to generate a new response:

```shell
xeroctl responses create "What is the capital of France?" --endpoint my-endpoint
```
### With Instructions and Parameters

```shell
xeroctl responses create "Explain quantum entanglement" \
  --endpoint my-endpoint \
  --instructions "Be concise. Use plain language." \
  --max-output-tokens 300 \
  --temperature 0.7
```
### Chain From a Previous Response

Use `--previous-response-id` to continue an existing conversation server-side:

```shell
xeroctl responses create "Now explain it in simple terms" \
  --endpoint my-endpoint \
  --previous-response-id resp_abc123
```
### Control Storage

Use `--store` to explicitly request storage, or `--no-store` to opt out. When neither flag is set, the endpoint default applies:

```shell
xeroctl responses create "Draft a cover letter" --endpoint my-endpoint --store
xeroctl responses create "Hello" --endpoint my-endpoint --no-store
```
### Reasoning Effort

For models that support extended reasoning, set the effort level with `--reasoning-effort`:

```shell
xeroctl responses create "Solve: if x^2 - 5x + 6 = 0, what is x?" \
  --endpoint my-endpoint \
  --reasoning-effort high
```

Valid values: `low`, `medium`, `high`.
## Streaming Mode

Add `--stream` to the `create` subcommand to receive tokens as they are generated. A spinner is displayed until the first token arrives. Token throughput statistics are printed to stderr after completion.

```shell
xeroctl responses create "Write a haiku about the ocean" \
  --endpoint my-endpoint \
  --stream
```

Example throughput summary (printed to stderr):

```
-- 38 tokens in 0.9s (42.2 tok/s)
```
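The summary figures follow directly from the token count and elapsed time. A quick sketch of the arithmetic, using the example values above:

```python
# Throughput = output tokens divided by elapsed generation time.
tokens = 38
elapsed_s = 0.9

rate = tokens / elapsed_s  # ~42.2 tok/s
print(f"-- {tokens} tokens in {elapsed_s}s ({rate:.1f} tok/s)")
```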
## Interactive Mode

Use `--interactive` with the `create` subcommand to start a multi-turn session. Unlike `xeroctl chat --interactive`, conversation context is maintained server-side via `previous_response_id` chaining. Each turn streams the response.

```shell
xeroctl responses create --interactive --endpoint my-endpoint
```
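Under the hood, each turn chains from the ID returned by the previous one. A minimal sketch of that loop, with a stubbed `send` function standing in for the real `/v1/responses` call (the stub and its fake IDs are hypothetical):

```python
def send(user_input, previous_response_id=None):
    """Stub for POST /v1/responses; returns a fake response record."""
    if previous_response_id is None:
        n = 1
    else:
        n = int(previous_response_id.split("_")[1]) + 1
    return {"id": f"resp_{n}", "previous_response_id": previous_response_id}

chain_id = None  # the /clear command would reset this to None
for user_input in ["Explain REST APIs briefly", "Give me a curl example"]:
    resp = send(user_input, previous_response_id=chain_id)
    chain_id = resp["id"]  # the next turn chains from this response

print(chain_id)  # → resp_2
```

Because the server reconstructs the conversation from `previous_response_id`, the client never resends earlier messages, unlike the message-array loop in `xeroctl chat --interactive`.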
### With Preset Instructions

```shell
xeroctl responses create --interactive \
  --endpoint my-endpoint \
  --instructions "You are a senior database architect. Keep answers brief."
```
### Resume a Previous Session

Pass a known response ID to resume an existing conversation chain:

```shell
xeroctl responses create --interactive \
  --endpoint my-endpoint \
  --previous-response-id resp_abc123
```

The status bar at the prompt shows the endpoint slug, optional model name, cumulative token counts, and turn count:

```
endpoint: my-endpoint | tokens: 212/480 | turns: 4
```
## All Options

### Management Options (default subcommand)

| Option | Type | Description |
|---|---|---|
| `--endpoint <slug>` (required) | string | Endpoint slug for all operations. |
| `[response-id]` | string | Positional response ID. Omit to list all responses. |
| `--delete` | flag | Delete the identified response. Requires a response ID. |
| `--cancel` | flag | Cancel a running response. Requires a response ID. |
| `--input-items` | flag | Retrieve input items for a response. Requires a response ID. |
| `--force` | flag | Skip the confirmation prompt for delete operations. |
| `--limit <n>` | integer | Maximum number of results when listing. |
| `--status <s>` | string | Filter list by status: `completed`, `in_progress`, or `cancelled`. |
### Create Options (create subcommand)

| Option | Type | Description |
|---|---|---|
| `--endpoint <slug>` (required) | string | Endpoint slug to send the request to. |
| `[message]` | string | User message. Required unless `--interactive` is specified. |
| `--stream` | flag | Stream the response token by token. |
| `--interactive` | flag | Start an interactive multi-turn session using server-side chaining. |
| `--instructions <text>` | string | System instructions for the model. |
| `--model <name>` | string | Model name passed through to the API. The endpoint configuration determines the actual model. |
| `--temperature <f>` | float | Sampling temperature between 0.0 and 2.0. |
| `--top-p <f>` | float | Nucleus sampling threshold between 0.0 and 1.0. |
| `--max-output-tokens <n>` | integer | Maximum output tokens to generate. |
| `--previous-response-id <id>` | string | ID of a previous response to chain from. Provides server-side conversation context. |
| `--store` | flag | Explicitly request server-side storage of the response. |
| `--no-store` | flag | Explicitly opt out of server-side storage. Cannot be combined with `--store`. |
| `--reasoning-effort <level>` | string | Reasoning effort for models that support it: `low`, `medium`, or `high`. |
## Interactive Commands

The following slash commands are available during an interactive session:

| Command | Description |
|---|---|
| `/help` | Show available commands and keyboard shortcuts. |
| `/clear` | Reset the conversation by clearing the `previous_response_id` chain. |
| `/instructions <msg>` | Set or replace system instructions mid-session. Without an argument, displays the current instructions. |
| `/tokens` | Show cumulative token usage, turn count, and current chain ID. |
| `/quit` | End the session and print total token usage. Also triggered by Ctrl+D. |
### Keyboard Shortcuts

| Shortcut | Action |
|---|---|
| Up / Down | Recall previous inputs from history. |
| Tab | Auto-complete slash commands. |
| Ctrl+A / Ctrl+E | Jump to start or end of line. |
| Ctrl+U / Ctrl+K | Clear text before or after the cursor. |
| Ctrl+W | Delete the previous word. |
| Ctrl+D | End the session (EOF). |
## Examples

### Full Workflow: Create, List, Inspect, Delete

```shell
# Create and store a response
xeroctl responses create "What is recursion?" \
  --endpoint my-endpoint --store

# List stored responses to find the ID
xeroctl responses --endpoint my-endpoint

# Inspect the response details
xeroctl responses resp_abc123 --endpoint my-endpoint

# View what was sent to the model
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items

# Delete when done
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force
```
### Multi-Turn Conversation via CLI Flags

```shell
# Turn 1
xeroctl responses create "Explain REST APIs briefly" \
  --endpoint my-endpoint --store
# Note the response ID from the output, e.g. resp_turn1

# Turn 2: chain from the first response
xeroctl responses create "Give me a curl example" \
  --endpoint my-endpoint \
  --previous-response-id resp_turn1
```
### Stream a Reasoning Model Response

```shell
xeroctl responses create \
  "Prove that there are infinitely many prime numbers" \
  --endpoint my-reasoning-endpoint \
  --stream \
  --reasoning-effort high \
  --max-output-tokens 1024
```