
Embedding/RAG Add-on

5 min read · Addons · Last updated February 10, 2026

What the Embedding Add-on Does

The embedding add-on enables Retrieval-Augmented Generation (RAG) for your OpenClaw instance. With RAG, your instance can search through uploaded documents, knowledge bases, and custom data to provide answers grounded in your specific content rather than relying solely on the LLM's general knowledge.

Without this add-on, your instance can only respond based on the LLM's built-in training data.

How RAG Works

RAG combines document retrieval with LLM generation in two steps:

  1. Retrieval — When a user asks a question, the system searches your uploaded documents for relevant passages using vector similarity (embeddings)
  2. Generation — The relevant passages are included as context when the LLM generates its response, producing answers grounded in your actual data

This means the LLM can answer questions about your company's specific products, policies, internal documentation, or any other content you upload.
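The two steps above can be sketched in a few lines. This is a toy illustration only: it uses bag-of-words vectors and a plain cosine similarity, whereas the add-on uses a learned embedding model (MiniLM or Qwen3) to produce dense vectors. The passages, query, and function names are invented for the example.

```python
from collections import Counter
from math import sqrt

def embed(text):
    # Toy embedding: a sparse bag-of-words vector. The real add-on uses a
    # neural model (MiniLM or Qwen3) to produce dense vectors instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Vector similarity between two embeddings.
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, passages, k=1):
    # Step 1 (Retrieval): rank stored passages by similarity to the query.
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

passages = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is open Monday to Friday.",
]
context = retrieve("How long do refunds take?", passages)
# Step 2 (Generation): the retrieved passage is included in the LLM prompt,
# e.g. f"Context: {context[0]}\n\nQuestion: How long do refunds take?"
print(context[0])
```

The key point is that generation never sees the whole knowledge base, only the few passages that scored highest in the retrieval step.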

Embedding Models

ClawHosters offers two embedding models with different trade-offs:

| Model  | Best For                                   | Speed    | Accuracy |
|--------|--------------------------------------------|----------|----------|
| MiniLM | General-purpose retrieval, fast processing | Fast     | Good     |
| Qwen3  | Higher accuracy, multilingual content      | Moderate | Better   |

You choose the embedding model when subscribing. The model determines how your documents are indexed and searched.

MiniLM

A lightweight model optimized for speed. Works well for English content and general knowledge bases. Lower cost per document processed.

Qwen3

A larger model with stronger multilingual support and higher retrieval accuracy. Better suited for technical documentation, multi-language content, or cases where retrieval precision matters.

Pricing

Pricing depends on the embedding model and pack size:

MiniLM Pricing

| Pack     | Monthly Price | Best For                       |
|----------|---------------|--------------------------------|
| Starter  | €2            | Small knowledge bases, testing |
| Standard | €6            | Medium document collections    |
| Pro      | €20           | Large knowledge bases          |

Qwen3 Pricing

| Pack     | Monthly Price | Best For                                     |
|----------|---------------|----------------------------------------------|
| Starter  | €3            | Small multilingual collections               |
| Standard | €10           | Medium document sets with high accuracy needs |
| Pro      | €35           | Large-scale retrieval workloads              |

Usage is tracked by the number of documents processed and queries made.

Setting Up the Embedding Add-on

  1. Open your instance in the ClawHosters dashboard
  2. Go to Add-ons > Embedding
  3. Choose an embedding model (MiniLM or Qwen3)
  4. Choose a pack size (Starter, Standard, or Pro)
  5. Confirm your subscription

After subscribing, you can start uploading documents through the instance's knowledge base interface.

Uploading Documents

Once the add-on is active, upload documents through your instance dashboard:

  • Supported formats — PDF, TXT, Markdown, HTML, CSV
  • Upload limit — Depends on your pack size
  • Processing time — Documents are indexed automatically after upload. MiniLM processes faster; Qwen3 takes slightly longer for higher accuracy.

Each document is split into chunks, converted to vector embeddings, and stored for retrieval. The chunking strategy is handled automatically.
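A common default chunking strategy is a fixed-size sliding window with overlap, so that a sentence cut at a chunk boundary still appears intact in the neighbouring chunk. The sketch below illustrates the idea; the actual chunk size and overlap used by the add-on are not documented here, and the numbers are illustrative only.

```python
def chunk(text, size=200, overlap=50):
    # Split `text` into windows of `size` characters, each starting
    # `size - overlap` characters after the previous one, so adjacent
    # chunks share `overlap` characters of context.
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece:
            chunks.append(piece)
        if start + size >= len(text):
            break
    return chunks
```

Each resulting chunk is then converted to an embedding vector and stored in the index; at query time, retrieval operates on chunks rather than whole documents.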

Requirements

The embedding add-on requires:

  • An active LLM subscription (BYOK or managed pack) — retrieved documents are passed to the LLM for response generation
  • An active instance in "Running" status

Usage Tracking

Embedding usage is tracked on the add-ons page:

  • Documents indexed — How many documents are stored in your knowledge base
  • Queries this period — How many RAG queries were processed
  • Storage used — How much vector storage your knowledge base occupies

What Happens When You Run Out

If your embedding pack's query or document limit is reached:

  • New document uploads are paused
  • RAG queries may be rate-limited or return errors
  • Your instance continues working normally for non-RAG conversations
  • Limits reset at the start of the next billing period, or you can upgrade your pack
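If your integration calls the instance programmatically, a standard client-side pattern for tolerating rate-limited queries is retry with exponential backoff. This is a generic sketch, not a documented ClawHosters API: `send` is a placeholder for whatever client call your integration uses, and `RuntimeError` stands in for whatever rate-limit error that client raises.

```python
import time

def query_with_backoff(send, prompt, retries=3, delay=1.0):
    # Retry a rate-limited call, doubling the wait between attempts.
    # `send` is a hypothetical callable representing your client's
    # query function; replace the exception type with your client's own.
    for attempt in range(retries):
        try:
            return send(prompt)
        except RuntimeError:  # stand-in for a rate-limit error
            if attempt == retries - 1:
                raise
            time.sleep(delay)
            delay *= 2
```

Backoff only papers over short bursts; if you hit the limit consistently, upgrading the pack is the real fix.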

Choosing the Right Model

| Consideration | Choose MiniLM               | Choose Qwen3                            |
|---------------|-----------------------------|------------------------------------------|
| Budget        | Lower cost                  | Higher cost                              |
| Language      | Primarily English           | Multilingual content                     |
| Speed         | Faster indexing and queries | Slightly slower                          |
| Accuracy      | Good for most use cases     | Better for precision-critical retrieval  |
| Document size | Any                         | Any                                      |

If you are unsure, start with MiniLM. You can switch models later, though re-indexing your documents is required.

Managing Your Subscription

Changing Models

Switching from MiniLM to Qwen3 (or vice versa) requires re-indexing all documents. The switch takes effect immediately, and documents are re-processed in the background.

Upgrading or Downgrading Packs

Pack upgrades take effect immediately. Downgrades take effect at the start of the next billing period.

Cancelling

Cancel the embedding add-on from the add-ons page. Your knowledge base and indexed documents are retained for 30 days. If you resubscribe within that period, your data is restored. After 30 days, all indexed data is permanently deleted.

Troubleshooting

RAG responses do not reference uploaded documents

  • Verify the embedding add-on is active on the add-ons page
  • Check that documents have finished indexing (processing indicator on the knowledge base page)
  • Ensure the user's question is relevant to the uploaded content — RAG only retrieves contextually similar passages
  • Confirm the LLM add-on is active — RAG without an LLM cannot generate responses

Document upload fails

  • Check that the file format is supported (PDF, TXT, Markdown, HTML, CSV)
  • Verify your pack has not reached its document limit
  • Large files may take longer to process — wait for the indexing to complete before uploading more

"Embedding add-on not available" error

  • The add-on requires an active instance in "Running" status
  • Instances in error, stopped, or paused states cannot process embeddings
