Chat Memory

Persistent per-chat knowledge with semantic recall and automatic context injection.

Overview

Chat memory allows conversations to accumulate knowledge over time. The model can save and recall facts, preferences, and instructions as the conversation progresses. Users can also manually create memories from any message in the chat interface.

Memories are scoped per-chat -- each conversation maintains its own isolated memory store. When memories are saved, they are embedded as vectors and stored for relevance-based semantic search. This means the system retrieves the most contextually relevant memories rather than relying on keyword matching or recency alone.

Key characteristics:

  • Per-chat scoping -- memories do not leak between conversations.
  • Semantic vector search for relevance-based recall.
  • Automatic deduplication of near-duplicate memories.
  • Soft-delete support for safe memory removal.
  • Both model-driven and user-driven memory creation.
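The scoping and soft-delete characteristics above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ChatMemoryStore`, `MemoryRecord`, and their methods are hypothetical names, and the real system persists to a database rather than a dictionary.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class MemoryRecord:
    content: str
    source_type: str                       # "model" or "user"
    deleted_at: Optional[datetime] = None  # set when the record is soft-deleted


class ChatMemoryStore:
    """Hypothetical store: one isolated list of records per chat."""

    def __init__(self) -> None:
        self._by_chat: dict[str, list[MemoryRecord]] = {}

    def save(self, chat_id: str, content: str, source_type: str) -> MemoryRecord:
        record = MemoryRecord(content, source_type)
        self._by_chat.setdefault(chat_id, []).append(record)
        return record

    def soft_delete(self, record: MemoryRecord) -> None:
        # The record stays in the store; it is only marked as deleted.
        record.deleted_at = datetime.now(timezone.utc)

    def active(self, chat_id: str) -> list[MemoryRecord]:
        # Only this chat's records, excluding soft-deleted ones.
        return [m for m in self._by_chat.get(chat_id, []) if m.deleted_at is None]


store = ChatMemoryStore()
kept = store.save("chat-a", "User prefers bullet points", "model")
gone = store.save("chat-a", "Outdated note", "user")
store.save("chat-b", "Deadline is March 15, 2026", "user")
store.soft_delete(gone)

print([m.content for m in store.active("chat-a")])  # ['User prefers bullet points']
print([m.content for m in store.active("chat-b")])  # ['Deadline is March 15, 2026']
```

Queries for one chat never see another chat's records, and a soft-deleted record remains stored but is filtered out of every read.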

How It Works

Chat memory operates through two complementary mechanisms: automatic (model-driven) and manual (user-driven).

Automatic Memory (Model-Driven)

During a conversation, the model can call two built-in tools to interact with the memory system:

  • save_memory -- The model stores a piece of knowledge (a fact, preference, or instruction) into the chat's memory. The content is embedded as a vector and persisted to the database.
  • recall_memory -- The model queries the memory store with a natural language query and receives the top matching memories ranked by semantic relevance.

The model decides autonomously when to save or recall memories based on the conversation context.
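On the host side, a tool call emitted by the model has to be routed to the matching handler. The sketch below assumes the tool-call shape shown under Built-in Tools (`name` plus `arguments`). The handler functions and `dispatch_tool_call` are hypothetical placeholders, not the platform's implementation.

```python
import json
from typing import Any, Callable


# Hypothetical handlers; real ones would embed and persist or search memories.
def handle_save_memory(arguments: dict[str, Any]) -> dict[str, Any]:
    return {"saved": arguments["content"], "category": arguments.get("category")}


def handle_recall_memory(arguments: dict[str, Any]) -> dict[str, Any]:
    return {"query": arguments["query"], "memories": []}


HANDLERS: dict[str, Callable[[dict[str, Any]], dict[str, Any]]] = {
    "save_memory": handle_save_memory,
    "recall_memory": handle_recall_memory,
}


def dispatch_tool_call(raw: str) -> dict[str, Any]:
    """Parse a JSON tool call and route it to the handler for its name."""
    call = json.loads(raw)
    handler = HANDLERS.get(call["name"])
    if handler is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return handler(call["arguments"])


result = dispatch_tool_call(
    '{"name": "save_memory", "arguments": '
    '{"content": "User prefers responses in bullet-point format", '
    '"category": "preference"}}'
)
print(result)
```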

Manual Memory (User-Driven)

Users can click the "Remember this" button on any message in the chat interface. This creates a memory from the message content, embedding it and storing it in the same memory store that the model accesses. Manual memories are tagged with source_type: user to distinguish them from model-created memories.

Passive Injection

At the start of each turn, the system automatically performs a semantic search against the chat's memory store using the current user message as the query. The top relevant memories are injected into the system prompt as contextual knowledge. This happens transparently -- no user or model action is required. See the Passive Injection section for details.

The full Passive Injection section later in this document describes the selection steps and the injected format.

Built-in Tools

save_memory

Stores a piece of knowledge in the chat's memory. The content is embedded as a vector and persisted. Near-duplicate content is automatically deduplicated -- if a memory with very high semantic similarity already exists, the save is skipped and the existing memory is returned.

| Parameter | Type   | Required | Description |
|-----------|--------|----------|-------------|
| content   | string | Yes      | The text content to store as a memory. |
| category  | string | No       | Category for the memory. One of: preference, fact, instruction. When omitted, the system infers the category from the content. |

Example Tool Call

JSON
{
  "name": "save_memory",
  "arguments": {
    "content": "User prefers responses in bullet-point format",
    "category": "preference"
  }
}
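
The deduplication behavior described for save_memory could look roughly like the following. Everything here is an assumption made for illustration: the 0.95 threshold is not a documented value, and `embed` is a toy word-count stand-in for a real embedding model.

```python
import math
from collections import Counter

DEDUP_THRESHOLD = 0.95  # assumed value; the actual threshold is not documented


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def save_memory(store: list[dict], content: str) -> dict:
    """Return the existing memory for a near-duplicate, else store a new one."""
    vector = embed(content)
    for existing in store:
        if cosine(vector, existing["vector"]) >= DEDUP_THRESHOLD:
            return existing  # near-duplicate: the save is skipped
    memory = {"content": content, "vector": vector}
    store.append(memory)
    return memory


store: list[dict] = []
first = save_memory(store, "User prefers responses in bullet-point format")
again = save_memory(store, "user prefers responses in bullet-point format")
other = save_memory(store, "The project deadline is March 15, 2026")

print(again is first)  # True: the second save returned the existing memory
print(len(store))      # 2
```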

recall_memory

Queries the chat's memory store using semantic search. Returns the top 5 matching memories ranked by relevance score.

| Parameter | Type   | Required | Description |
|-----------|--------|----------|-------------|
| query     | string | Yes      | Natural language query to search memories against. |

Example Tool Call

JSON
{
  "name": "recall_memory",
  "arguments": {
    "query": "What formatting preferences does the user have?"
  }
}

Example Response

JSON
{
  "memories": [
    {
      "id": "mem_abc123",
      "content": "User prefers responses in bullet-point format",
      "category": "preference",
      "relevance": 0.92,
      "created_at": "2026-03-07T10:30:00Z"
    },
    {
      "id": "mem_def456",
      "content": "User wants concise answers, no longer than 3 paragraphs",
      "category": "instruction",
      "relevance": 0.85,
      "created_at": "2026-03-07T10:25:00Z"
    }
  ]
}
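
To make the ranking concrete, here is a minimal sketch of top-5 retrieval by cosine similarity. The vectors are hand-written three-dimensional placeholders rather than real embeddings, and `recall` is a hypothetical helper, not the tool's actual implementation.

```python
import heapq
import math

TOP_K = 5  # recall_memory returns the top 5 matches


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vector: list[float], memories: list[dict], k: int = TOP_K) -> list[dict]:
    """Score every memory against the query and keep the k most relevant."""
    scored = [
        {
            "id": m["id"],
            "content": m["content"],
            "relevance": round(cosine(query_vector, m["vector"]), 2),
        }
        for m in memories
    ]
    return heapq.nlargest(k, scored, key=lambda m: m["relevance"])


# Placeholder vectors standing in for stored embeddings.
memories = [
    {"id": "mem_a", "content": "User prefers bullet points", "vector": [1.0, 0.0, 0.0]},
    {"id": "mem_b", "content": "Deadline is March 15, 2026", "vector": [0.0, 1.0, 0.0]},
    {"id": "mem_c", "content": "User wants concise answers", "vector": [0.8, 0.6, 0.0]},
]

results = recall([1.0, 0.0, 0.0], memories)
print([m["id"] for m in results])  # ['mem_a', 'mem_c', 'mem_b']
```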

Memory Management API

REST endpoints for managing memories programmatically. All endpoints require authentication and operate within the scope of a single chat.

Create Memory from Message

POST /chat/api/{chatId}/memory

Creates a new memory from an existing message in the chat.

Request Body

| Field     | Type   | Required | Description |
|-----------|--------|----------|-------------|
| messageId | string | Yes      | The ID of the message to create a memory from. |
| content   | string | Yes      | The text content to store as the memory. |

curl

curl
curl -X POST https://api.xerotier.ai/chat/api/550e8400-e29b-41d4-a716-446655440000/memory \
  -H "Cookie: session=your_session_token" \
  -H "Content-Type: application/json" \
  -d '{
    "messageId": "msg_abc123",
    "content": "The project deadline is March 15, 2026"
  }'

List Memories

GET /chat/api/{chatId}/memories

Returns all active (non-deleted) memories for the specified chat, ordered by creation date descending.

curl

curl
curl https://api.xerotier.ai/chat/api/550e8400-e29b-41d4-a716-446655440000/memories \
  -H "Cookie: session=your_session_token"

Response

JSON
{
  "memories": [
    {
      "id": "mem_abc123",
      "content": "The project deadline is March 15, 2026",
      "category": "fact",
      "source_type": "user",
      "created_at": "2026-03-07T10:30:00Z",
      "updated_at": "2026-03-07T10:30:00Z"
    }
  ]
}

Update Memory

PATCH /chat/api/{chatId}/memories/{memoryId}

Updates the content of an existing memory. The embedding is regenerated automatically to reflect the new content.

Request Body

| Field   | Type   | Required | Description |
|---------|--------|----------|-------------|
| content | string | Yes      | The new text content for the memory. |

curl

curl
curl -X PATCH https://api.xerotier.ai/chat/api/550e8400-e29b-41d4-a716-446655440000/memories/mem_abc123 \
  -H "Cookie: session=your_session_token" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "The project deadline has been extended to April 1, 2026"
  }'

Delete Memory

DELETE /chat/api/{chatId}/memories/{memoryId}

Soft-deletes a memory. The record remains in the database with a deleted_at timestamp but is excluded from all queries and semantic searches.

curl

curl
curl -X DELETE https://api.xerotier.ai/chat/api/550e8400-e29b-41d4-a716-446655440000/memories/mem_abc123 \
  -H "Cookie: session=your_session_token"

Returns HTTP 204 No Content on success.

Python (requests)

Python
import requests

base = "https://api.xerotier.ai/chat/api"
chat_id = "550e8400-e29b-41d4-a716-446655440000"
cookies = {"session": "your_session_token"}

# Create a memory from a message
requests.post(f"{base}/{chat_id}/memory", cookies=cookies, json={
    "messageId": "msg_abc123",
    "content": "The project deadline is March 15, 2026"
})

# List all memories
memories = requests.get(f"{base}/{chat_id}/memories", cookies=cookies).json()
for mem in memories["memories"]:
    print(f"[{mem['category']}] {mem['content']}")

# Update a memory
requests.patch(
    f"{base}/{chat_id}/memories/mem_abc123",
    cookies=cookies,
    json={"content": "Deadline extended to April 1, 2026"}
)

# Delete a memory
requests.delete(f"{base}/{chat_id}/memories/mem_abc123", cookies=cookies)

Node.js (fetch)

JavaScript
const base = "https://api.xerotier.ai/chat/api";
const chatId = "550e8400-e29b-41d4-a716-446655440000";
const headers = {
  "Cookie": "session=your_session_token",
  "Content-Type": "application/json"
};

// Create a memory from a message
await fetch(`${base}/${chatId}/memory`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    messageId: "msg_abc123",
    content: "The project deadline is March 15, 2026"
  })
});

// List all memories
const memRes = await fetch(`${base}/${chatId}/memories`, { headers });
const memories = await memRes.json();
memories.memories.forEach(mem =>
  console.log(`[${mem.category}] ${mem.content}`)
);

// Update a memory
await fetch(`${base}/${chatId}/memories/mem_abc123`, {
  method: "PATCH",
  headers,
  body: JSON.stringify({ content: "Deadline extended to April 1, 2026" })
});

// Delete a memory
await fetch(`${base}/${chatId}/memories/mem_abc123`, {
  method: "DELETE",
  headers
});

Passive Injection

Passive injection is the automatic process by which relevant memories are included in the model's context at the start of each turn. This requires no action from the user or the model -- it happens transparently during context assembly.

How It Works

  1. The user sends a message in the chat.
  2. During context assembly, the system takes the current user message and runs a semantic search (cosine similarity) against the chat_memories table for that chat.
  3. The top 3-5 most relevant memories (above a minimum similarity threshold) are selected.
  4. These memories are injected into the system prompt as a "Known facts about this conversation" section, formatted as a numbered list.
  5. The model receives the enriched context and can use the injected memories to produce more informed responses.
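
Step 3 above amounts to a threshold filter followed by a cap. The sketch below assumes similarity scores have already been computed. The 0.75 threshold is an invented value, `select_for_injection` is a hypothetical name, and what happens when fewer than three memories pass is not documented, so this version simply returns whatever passes.

```python
MIN_SIMILARITY = 0.75  # assumed; the actual threshold is not documented
MAX_INJECTED = 5       # upper bound taken from step 3


def select_for_injection(
    scored: list[tuple[str, float]],
    min_similarity: float = MIN_SIMILARITY,
    limit: int = MAX_INJECTED,
) -> list[str]:
    """scored holds (content, similarity) pairs for a single chat."""
    relevant = [pair for pair in scored if pair[1] >= min_similarity]
    relevant.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
    return [content for content, _ in relevant[:limit]]


scored = [
    ("a", 0.92), ("b", 0.60), ("c", 0.85), ("d", 0.78),
    ("e", 0.99), ("f", 0.80), ("g", 0.76),
]
print(select_for_injection(scored))         # ['e', 'a', 'c', 'f', 'd']
print(select_for_injection([("x", 0.10)]))  # []
```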

Injected Context Example

The following shows how injected memories appear in the system prompt that the model receives:

System Prompt Fragment
Known facts about this conversation:
1. User prefers responses in bullet-point format (preference)
2. The project deadline is March 15, 2026 (fact)
3. Always include code examples in Python (instruction)

The number of injected memories varies between 3 and 5 depending on how many memories exceed the relevance threshold. If no memories are sufficiently relevant, the section is omitted entirely.
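
One way to produce a fragment in this format is sketched below. `format_known_facts` is a hypothetical helper, and returning an empty string is just one assumed way to represent the section being omitted.

```python
def format_known_facts(memories: list[dict]) -> str:
    """Render selected memories as a numbered list, or '' when there are none."""
    if not memories:
        return ""  # nothing relevant: the section is omitted entirely
    lines = ["Known facts about this conversation:"]
    for index, memory in enumerate(memories, start=1):
        category = memory.get("category")
        suffix = f" ({category})" if category else ""
        lines.append(f"{index}. {memory['content']}{suffix}")
    return "\n".join(lines)


selected = [
    {"content": "User prefers responses in bullet-point format", "category": "preference"},
    {"content": "The project deadline is March 15, 2026", "category": "fact"},
    {"content": "Always include code examples in Python", "category": "instruction"},
]
print(format_known_facts(selected))
```

Because category is nullable in the data model, the sketch drops the parenthesized suffix when no category is set.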

Chat UI

The chat interface provides several UI elements for interacting with the memory system.

Remember This Button

Each message in the chat displays a "Remember this" button in its action bar. Clicking it creates a memory from the message content. The button provides visual feedback (a brief highlight) to confirm the memory was saved.

Memories Sidebar Panel

The toolbar includes a toggle button to open the Memories sidebar panel. This panel displays all active memories for the current chat in a scrollable list.

  • Each memory entry shows its content, category, source type, and creation date.
  • Memories can be edited inline -- click the edit icon, modify the text, and save.
  • Memories can be deleted via the delete icon on each entry.
  • A memory count badge on the toolbar toggle shows the total number of active memories.

Memory Count Badge

The toolbar toggle button displays a small badge showing the current count of active memories for the chat. This updates in real time as memories are created or deleted, whether by the model or the user.

Data Model

Each memory is associated with a single chat and optionally linked to the message that created it.

Memory Object

| Field       | Type              | Description |
|-------------|-------------------|-------------|
| id          | string            | Memory identifier in mem_xxx format. |
| content     | string            | The text content of the memory. |
| category    | string (nullable) | Optional category: preference, fact, or instruction. |
| source_type | string            | How the memory was created: model (via save_memory tool) or user (via "Remember this" button). |
| created_at  | string            | ISO 8601 timestamp when the memory was created. |
| updated_at  | string            | ISO 8601 timestamp when the memory was last updated. |
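
For client code, the memory object maps naturally onto a typed record. The following is a minimal sketch: the `Memory` class and `parse_timestamp` helper are hypothetical, while the field names follow the table above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass(frozen=True)
class Memory:
    id: str
    content: str
    category: Optional[str]  # nullable: preference, fact, or instruction
    source_type: str         # "model" or "user"
    created_at: datetime
    updated_at: datetime

    @classmethod
    def from_dict(cls, data: dict) -> "Memory":
        return cls(
            id=data["id"],
            content=data["content"],
            category=data.get("category"),
            source_type=data["source_type"],
            created_at=parse_timestamp(data["created_at"]),
            updated_at=parse_timestamp(data["updated_at"]),
        )


memory = Memory.from_dict({
    "id": "mem_abc123",
    "content": "The project deadline is March 15, 2026",
    "category": "fact",
    "source_type": "user",
    "created_at": "2026-03-07T10:30:00Z",
    "updated_at": "2026-03-07T10:30:00Z",
})
print(memory.id, memory.category, memory.created_at.isoformat())
```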