xeroctl responses
Drive the OpenAI Responses API from a shell. Create, list, inspect, cancel, and delete stored responses; chain follow-ups with --previous-response-id; stream or stay interactive across turns without holding your own message history.
Overview
Two subcommands target /v1/responses:
- The default subcommand (no extra keyword) lists, inspects, deletes, cancels, and fetches input items for stored responses.
- The create subcommand generates a new response in single-shot, streaming, or interactive mode.
An endpoint slug (--endpoint) is always required.
Responses vs. Chat Completions: The Responses API uses previous_response_id for server-side conversation chaining. The Chat Completions API (xeroctl chat) uses client-side message arrays. Both target the same models but use different storage and retrieval paths.
List Responses
Omit a response ID to list all stored responses for the endpoint:
xeroctl responses --endpoint my-endpoint
Filter by Status
xeroctl responses --endpoint my-endpoint --status completed
xeroctl responses --endpoint my-endpoint --status in_progress
xeroctl responses --endpoint my-endpoint --status cancelled
Filter coverage: The router-side response.status enumerates six values: queued, in_progress, completed, failed, cancelled, and incomplete. The CLI --status filter currently accepts only completed, in_progress, and cancelled. To inspect a response in any of the other three states, list without --status and locate the row by ID.
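One possible workaround for the filter gap is to list with -o json and filter client-side. The sketch below assumes jq is installed and that each entry in the data array carries id and status fields, as described under JSON Output. The helper name select_status is hypothetical and not part of xeroctl.

```shell
# select_status: read a list envelope on stdin and print the ID of every
# response whose status equals $1. Hypothetical helper; requires jq.
select_status() {
  jq -r --arg s "$1" '.data[] | select(.status == $s) | .id'
}

# Usage (assumes xeroctl is on PATH):
#   xeroctl responses --endpoint my-endpoint -o json | select_status failed
```

Filtering on the JSON payload rather than the table avoids depending on column layout or colour codes.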
Limit Results
xeroctl responses --endpoint my-endpoint --limit 20
The list output is a table with columns: ID, Model, Status, Input tokens, Output tokens, and Created (relative time). Status values are colour-coded.
Get Response
Provide a response ID to fetch full details:
xeroctl responses resp_abc123 --endpoint my-endpoint
Displays: ID, model, status, created/completed timestamps, input/output token counts, service tier, previous response ID, store flag, and any attached metadata.
JSON Output
xeroctl responses resp_abc123 --endpoint my-endpoint -o json
Add -o json to a list, get, or input-items invocation to emit the raw router payload to stdout. The shape follows the OpenAI Responses API:
- list: a { "object": "list", "data": [ <response>, ... ], "first_id", "last_id", "has_more" } envelope.
- get: a single response object including id, model, status, created_at, completed_at, previous_response_id, store, usage, metadata, and output[].
- input-items: a { "object": "list", "data": [ <item>, ... ] } envelope where each item carries type, id, role, and content[] with untruncated text.
Delete Response
Delete a stored response. A confirmation prompt is shown by default:
xeroctl responses resp_abc123 --endpoint my-endpoint --delete
Skip Confirmation
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force
Dry Run
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --dry-run
Cancel Response
Cancel a response that is currently in progress:
xeroctl responses resp_abc123 --endpoint my-endpoint --cancel
Dry Run
Pair --cancel with the global --dry-run flag to preview the operation without contacting the router:
xeroctl responses resp_abc123 --endpoint my-endpoint --cancel --dry-run
What happens to an in-flight stream: Cancellation aborts server-side generation and closes the HTTP connection; the Responses API does not guarantee a dedicated terminal SSE event. A reader of the --stream output may observe the stream simply end without a response.completed frame. See the Responses API reference for the full cancel-semantics contract.
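Because a cancelled stream may end without a terminal frame, a script may prefer to confirm the outcome by fetching the stored response. The following is a minimal sketch, assuming bash, jq, and a top-level status field in the get payload. The helper names is_terminal and wait_terminal, the one-second polling interval, and the treatment of queued and in_progress as the only non-terminal states are all assumptions.

```shell
# is_terminal: succeed when $1 is a status that is assumed not to change again.
# The values come from the six-value status enum documented for the router.
is_terminal() {
  case "$1" in
    completed|failed|cancelled|incomplete) return 0 ;;
    *) return 1 ;;
  esac
}

# wait_terminal: poll a stored response until it reaches a terminal status,
# then print that status. Gives up after $3 attempts (default 10).
wait_terminal() {
  local id=$1 endpoint=$2 tries=${3:-10} status
  while [ "$tries" -gt 0 ]; do
    status=$(xeroctl responses "$id" --endpoint "$endpoint" -o json | jq -r '.status')
    if is_terminal "$status"; then
      printf '%s\n' "$status"
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Usage:
#   xeroctl responses resp_abc123 --endpoint my-endpoint --cancel
#   wait_terminal resp_abc123 my-endpoint
```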
Input Items
Retrieve the input items (messages) associated with a stored response:
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items
Each input item shows type, ID, role, text (truncated to 200 characters), and content part count. Use -o json to retrieve the full untruncated content:
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items -o json
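As an illustration of working with the untruncated payload, the sketch below prints each item's role followed by its full text. It assumes jq is installed and that each content part exposes its text under a text key, which matches the OpenAI input_text part shape but is not confirmed by this page. The helper name input_text is hypothetical.

```shell
# input_text: read an input-items envelope on stdin and print, for each item,
# its role followed by the concatenation of its text parts.
# Parts without a "text" key are skipped. Hypothetical helper; requires jq.
input_text() {
  jq -r '.data[] | "\(.role): \([.content[] | .text // empty] | join(""))"'
}

# Usage (assumes xeroctl is on PATH):
#   xeroctl responses resp_abc123 --endpoint my-endpoint --input-items -o json | input_text
```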
Create Response
Use the create subcommand to generate a new response:
xeroctl responses create "What is the capital of France?" --endpoint my-endpoint
The first line of stdout is an ID: line carrying the response identifier, which can be passed back as --previous-response-id on a follow-up turn:
ID: resp_abc123
With Instructions and Parameters
xeroctl responses create "Explain quantum entanglement" \
--endpoint my-endpoint \
--instructions "Be concise. Use plain language." \
--max-output-tokens 300 \
--temperature 0.7
Chain From a Previous Response
Use --previous-response-id to continue an existing conversation server-side:
xeroctl responses create "Now explain it in simple terms" \
--endpoint my-endpoint \
--previous-response-id resp_abc123
Control Storage
Use --store to explicitly request storage, or --no-store to opt out. When neither flag is set, the endpoint default applies:
xeroctl responses create "Draft a cover letter" --endpoint my-endpoint --store
xeroctl responses create "Hello" --endpoint my-endpoint --no-store
Reasoning Effort
For models that support extended reasoning, set the effort level with --reasoning-effort:
xeroctl responses create "Solve: if x^2 - 5x + 6 = 0, what is x?" \
--endpoint my-endpoint \
--reasoning-effort high
Valid values: low, medium, high.
Streaming Mode
Add --stream to the create subcommand to request a
server-sent-events response. Token throughput statistics are printed to stderr
after completion.
xeroctl responses create "Write a haiku about the ocean" \
--endpoint my-endpoint \
--stream
Current behaviour (incremental rendering disabled):
The CLI consumes the SSE stream but, in the current release, does not
print intermediate text deltas. The assembled response is rendered once
the terminal response.completed frame arrives, then the
throughput summary is written to stderr. From the user's perspective,
--stream currently behaves the same as the non-streaming
single-shot path for stdout output; only the wire transport differs.
The throughput summary remains accurate.
Example throughput summary (printed to stderr):
-- 38 tokens in 0.9s (42.2 tok/s)
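For scripts that record throughput, the summary line can be picked out of stderr. The sketch below assumes the line always follows the shape of the example above; the helper name parse_throughput is hypothetical, and the pattern may need adjusting if the format differs in other releases.

```shell
# parse_throughput: read stderr text on stdin, find the first line shaped like
# "-- <N> tokens in <S>s (<R> tok/s)" and print "N S R" separated by spaces.
# Prints nothing when no such line is present.
parse_throughput() {
  sed -n 's/.*-- \([0-9][0-9]*\) tokens in \([0-9.][0-9.]*\)s (\([0-9.][0-9.]*\) tok\/s).*/\1 \2 \3/p' | head -n 1
}

# Usage: route stderr into the pipe and discard stdout.
#   xeroctl responses create "Write a haiku about the ocean" \
#     --endpoint my-endpoint --stream 2>&1 >/dev/null | parse_throughput
```

In the usage line, the order of the redirections matters: 2>&1 must precede >/dev/null so that stderr reaches the pipe while stdout is dropped.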
Interactive Mode
Use --interactive with the create subcommand to start a
multi-turn session. Unlike xeroctl chat --interactive, conversation
context is maintained server-side via previous_response_id chaining.
Each turn issues a streaming request implicitly; the --stream flag
is redundant when --interactive is supplied and is neither required
nor rejected.
xeroctl responses create --interactive --endpoint my-endpoint
With Preset Instructions
xeroctl responses create --interactive \
--endpoint my-endpoint \
--instructions "You are a senior database architect. Keep answers brief."
Resume a Previous Session
Pass a known response ID to resume an existing conversation chain:
xeroctl responses create --interactive \
--endpoint my-endpoint \
--previous-response-id resp_abc123
The status bar at the prompt shows the endpoint slug, optional model name, cumulative token counts formatted as tokens: <input>/<output>, and the turn count:
endpoint: my-endpoint | tokens: 212/480 | turns: 4
In the example above, 212 is the cumulative input-token count and 480 is the cumulative output-token count across every turn of the session.
Output Streams
The responses command splits its output across stdout and stderr so that machine-readable content can be piped while progress indicators stay on the terminal:
- stdout: the assembled response text, key/value summaries (including the ID: line from create), and any -o json payload.
- stderr: the spinner, the post-completion throughput summary (for example, -- 38 tokens in 0.9s (42.2 tok/s)), and any warning lines.
To suppress the spinner and throughput summary, pass the global --quiet flag. The flag does not affect stdout content. For ad-hoc invocations, redirecting stderr with 2>/dev/null has a similar effect, but it also discards any warning lines. See the xeroctl CLI hub for the full list of global flags.
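The split can be used to keep response content and diagnostics in separate files. The following is a minimal sketch; the helper name ask and its argument order are hypothetical.

```shell
# ask: send one prompt and keep the two streams apart.
#   $1 prompt, $2 endpoint slug, $3 file for stdout, $4 file for stderr
ask() {
  xeroctl responses create "$1" --endpoint "$2" >"$3" 2>"$4"
}

# Usage (assumes xeroctl is on PATH):
#   ask "What is the capital of France?" my-endpoint answer.txt diagnostics.log
#   head -n 1 answer.txt    # the ID: line
```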
All Options
Management Options (default subcommand)
| Option | Type | Description |
|---|---|---|
| --endpoint <slug> (required) | string | Endpoint slug for all operations. |
| [response-id] | string | Positional response ID. Omit to list all responses. |
| --delete | flag | Delete the identified response. Requires a response ID. |
| --cancel | flag | Cancel a running response. Requires a response ID. |
| --input-items | flag | Retrieve input items for a response. Requires a response ID. |
| --force | flag | Skip the confirmation prompt for delete operations. |
| --limit <n> | integer | Maximum number of results when listing. |
| --status <s> | string | Filter list by status. Accepted values: completed, in_progress, cancelled. The router-side enum also includes queued, failed, and incomplete, but these cannot currently be selected via this flag. |
| --dry-run | flag | Global flag inherited from xeroctl. Applied to this command, it gates --delete and --cancel: the CLI prints the operation it would perform and exits without contacting the router. |
--delete, --cancel, and --input-items are mutually exclusive; at most one may be supplied to a single invocation of the default subcommand.
Create Options (create subcommand)
| Option | Type | Description |
|---|---|---|
| --endpoint <slug> (required) | string | Endpoint slug to send the request to. |
| [message] | string | User message. Required unless --interactive is specified. |
| --stream | flag | Request the response as server-sent events. In the current release the text is rendered once the stream completes, not token by token. |
| --interactive | flag | Start an interactive multi-turn session using server-side chaining. |
| --instructions <text> | string | System instructions for the model. |
| --model <name> | string | Model name passed through to the API. The endpoint configuration determines the actual model. |
| --temperature <f> | float | Sampling temperature between 0.0 and 2.0. |
| --top-p <f> | float | Nucleus sampling threshold between 0.0 and 1.0. |
| --max-output-tokens <n> | integer | Maximum output tokens to generate. |
| --previous-response-id <id> | string | ID of a previous response to chain from. Provides server-side conversation context. |
| --store | flag | Explicitly request server-side storage of the response. |
| --no-store | flag | Explicitly opt out of server-side storage. Cannot be combined with --store. |
| --reasoning-effort <level> | string | Reasoning effort for models that support it: low, medium, or high. The CLI rejects any other value. Models that do not implement reasoning silently ignore the field on the wire, matching OpenAI spec behaviour. |
Interactive Commands
The following slash commands are available during an interactive session:
| Command | Description |
|---|---|
| /help | Show available commands and keyboard shortcuts. |
| /clear | Reset the conversation by clearing the previous_response_id chain. |
| /instructions <msg> | Set or replace system instructions mid-session. Without an argument, displays the current instructions. |
| /tokens | Show cumulative token usage, turn count, and current chain ID. |
| /quit | End the session and print total token usage. Also triggered by Control+D. |
Keyboard Shortcuts
| Shortcut | Action |
|---|---|
| Up / Down | Recall previous inputs from history. |
| Tab | Auto-complete slash commands. |
| Control+A / Control+E | Jump to start or end of line. |
| Control+U / Control+K | Clear text before or after the cursor. |
| Control+W | Delete the previous word. |
| Control+D | End the session (EOF). |
Examples
Full Workflow: Create, List, Inspect, Delete
# Create and store a response
xeroctl responses create "What is recursion?" \
--endpoint my-endpoint --store
# List stored responses to find the ID
xeroctl responses --endpoint my-endpoint
# Inspect the response details
xeroctl responses resp_abc123 --endpoint my-endpoint
# View what was sent to the model
xeroctl responses resp_abc123 --endpoint my-endpoint --input-items
# Delete when done
xeroctl responses resp_abc123 --endpoint my-endpoint --delete --force
Multi-Turn Conversation via CLI Flags
# Turn 1: capture the ID printed on the first line of stdout
RESP_ID=$(xeroctl responses create "Explain REST APIs briefly" \
--endpoint my-endpoint --store | awk '/^ID:/ {print $2; exit}')
# Turn 2: chain from the first response
xeroctl responses create "Give me a curl example" \
--endpoint my-endpoint \
--previous-response-id "$RESP_ID"
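The two-turn pattern generalises to any number of turns. The sketch below assumes bash and that every create call prints an ID: line on stdout, as described under Create Response. The helper name chain_prompts is hypothetical.

```shell
# chain_prompts: send each argument after the endpoint slug as one turn,
# chaining every turn to the ID printed by the previous one.
# Prints the stdout of every turn; stops with status 1 on the first failure.
chain_prompts() {
  local endpoint=$1 prev="" out prompt
  shift
  for prompt in "$@"; do
    if [ -n "$prev" ]; then
      out=$(xeroctl responses create "$prompt" --endpoint "$endpoint" --store \
        --previous-response-id "$prev") || return 1
    else
      out=$(xeroctl responses create "$prompt" --endpoint "$endpoint" --store) || return 1
    fi
    printf '%s\n' "$out"
    # Extract the identifier from the ID: line for the next turn.
    prev=$(printf '%s\n' "$out" | awk '/^ID:/ {print $2; exit}')
    [ -n "$prev" ] || return 1
  done
}

# Usage (assumes xeroctl is on PATH):
#   chain_prompts my-endpoint "Explain REST APIs briefly" "Give me a curl example"
```

Passing --store on every turn is a deliberate choice here: a turn that is not stored cannot be referenced by the next one when the endpoint default is to skip storage.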
Stream a Reasoning Model Response
xeroctl responses create \
"Prove that there are infinitely many prime numbers" \
--endpoint my-reasoning-endpoint \
--stream \
--reasoning-effort high \
--max-output-tokens 1024