Embedding Models
Embedding models convert text into vector representations, enabling semantic search and knowledge base operations in Xagent.

What are Embeddings?
Embeddings are numerical representations of text that capture semantic meaning. Similar concepts have similar embeddings, allowing Xagent to:
- Find relevant information in knowledge bases
- Compare document similarity
- Power semantic search
- Enable RAG (Retrieval-Augmented Generation)
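
To make "similar concepts have similar embeddings" concrete, here is a minimal sketch comparing toy vectors with cosine similarity. The numbers are made up; real embeddings come from a model, but the principle (related concepts score higher) is exactly what they provide:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 = same direction (very similar), near 0 = unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional vectors standing in for real model output.
cat    = np.array([0.9, 0.1, 0.0, 0.2])
kitten = np.array([0.8, 0.2, 0.1, 0.3])
car    = np.array([0.1, 0.9, 0.8, 0.0])

print(cosine_similarity(cat, kitten))  # high: related concepts (~0.98)
print(cosine_similarity(cat, car))     # low: unrelated concepts (~0.16)
```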
Purpose
Knowledge Base Operations:
- Index uploaded documents
- Enable semantic search across documents
- Retrieve relevant context for tasks
How it works:
1. Documents are chunked and converted to embeddings
2. The embeddings are stored in a vector database
3. When a task queries the knowledge base, Xagent searches for similar embeddings
4. Retrieved content is provided as context to the LLM
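
The sketch below walks through that pipeline end to end. The `embed()` function is a deterministic fake standing in for whatever provider you configure; this is illustrative, not Xagent's internal implementation:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for the configured embedding provider; returns a
    # deterministic fake vector so the sketch runs end to end.
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    v = rng.random(16)
    return v / np.linalg.norm(v)

def chunk(document: str, size: int = 200) -> list[str]:
    # Naive fixed-size character chunking; real systems split on structure.
    return [document[i:i + size] for i in range(0, len(document), size)]

documents = ["...your uploaded document text...", "...another document..."]

# Steps 1-2: chunk each document, embed each chunk, store text + vector.
index = [(c, embed(c)) for doc in documents for c in chunk(doc)]

# Step 3: at query time, embed the query and rank chunks by similarity.
def search(query: str, top_k: int = 3) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: -float(np.dot(q, item[1])))
    return [text for text, _ in ranked[:top_k]]

# Step 4: the top chunks become context in the LLM prompt.
context = "\n\n".join(search("How do I reset my password?"))
```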
When to Configure
Required for:
- Using knowledge base functionality
- Uploading and searching documents
- Building RAG systems
- Semantic document retrieval
Common use cases:
- Customer support (search FAQs, documentation)
- Research assistant (query papers, databases)
- Legal analysis (search case law, contracts)
- Technical documentation (search manuals, guides)
Supported Providers
OpenAI & OpenAI-compatible
Models:
- text-embedding-3-small
- text-embedding-3-large
- text-embedding-ada-002
Setup:
- Get an API key from the OpenAI Platform
- For compatible services, provide the base URL and API key
- Select an embedding model
Strengths:
- General-purpose embeddings
- Wide availability
- Strong ecosystem
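
For reference, here is a minimal call using the official openai Python package (v1+). The same client works against OpenAI-compatible services if you supply their base URL:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-...")  # key from the OpenAI Platform
# For an OpenAI-compatible service, add its endpoint:
# client = OpenAI(api_key="...", base_url="https://your-provider.example/v1")

resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=["How do I reset my password?"],
)
vector = resp.data[0].embedding  # list[float]; 1536 dimensions by default
print(len(vector))
```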
DashScope (Alibaba Cloud)
Models:
- text-embedding-v4
- text-embedding-v3
- text-embedding-v2
Setup:
- Get an API key from the DashScope Console
- Configure the API key in Xagent
- Select an embedding model
Strengths:
- Chinese language optimization
- Cost-effective for Asian languages
- Alibaba Cloud integration
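
DashScope also exposes an OpenAI-compatible endpoint, so the same client pattern applies. The base URL below is DashScope's documented compatible-mode endpoint at the time of writing; verify it against the current Alibaba Cloud docs:

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",  # key from the DashScope Console
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
resp = client.embeddings.create(
    model="text-embedding-v3",
    input=["你好，世界"],
)
print(len(resp.data[0].embedding))
```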
Xinference
Models:
- bge-large-en-v1.5
- bge-base-en
- all-MiniLM-L6-v2
- Other HuggingFace embedding models
Setup:
- Deploy an Xinference server
- Launch an embedding model (e.g., bge-base-en-v1.5)
- Configure the base URL in Xagent
- Select the model from Xinference
Strengths:
- Data privacy
- Self-hosted deployment
- Custom model support
- Access to HuggingFace open-source models
To use HuggingFace models, deploy them via Xinference.
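
A minimal self-hosted flow might look like the following. The commands and default port 9997 come from Xinference's standard local setup, so adjust them to your deployment; Xinference serves an OpenAI-compatible API, which is why the usual client works:

```python
# Shell setup (run once; defaults shown):
#   pip install "xinference[all]"
#   xinference-local                                   # serves on port 9997
#   xinference launch --model-name bge-base-en-v1.5 --model-type embedding

from openai import OpenAI

client = OpenAI(api_key="not-needed", base_url="http://localhost:9997/v1")
resp = client.embeddings.create(model="bge-base-en-v1.5", input=["hello world"])
print(len(resp.data[0].embedding))  # 768 for bge-base-en-v1.5
```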
Configuration
Step 1: Add Embedding Provider
- Go to Models in the left sidebar
- Click Add Model or Add Provider
- Select embedding provider
Step 2: Configure Model
- Select the embedding model
- Enter API credentials
- Configure parameters (dimensions, etc.)
- Test the connection
Step 3: Set as Default
Set as the default embedding model for:
- Knowledge base indexing
- Semantic search operations
- Vector database operations
Model Parameters
Dimensions
The size of the embedding vector. Considerations:
- Lower dimensions (384-768): Faster, less storage, good for simple tasks
- Medium dimensions (768-1536): Balanced performance
- Higher dimensions (3072+): Better accuracy, more storage/cost
In general, higher dimensions give better semantic understanding but increase storage and computation; choose based on your requirements.
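
For example, OpenAI's text-embedding-3-* models accept a dimensions parameter that truncates vectors to a smaller size, trading some accuracy for storage and speed:

```python
from openai import OpenAI

client = OpenAI(api_key="sk-...")
resp = client.embeddings.create(
    model="text-embedding-3-large",
    input="example text",
    dimensions=1024,  # truncate from the default 3072
)
print(len(resp.data[0].embedding))  # 1024
```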
Chunk Size
When documents are indexed, they’re split into chunks before embedding. Typical sizes:
- 512-1000 tokens for general documents
- Larger for technical documentation
- Smaller for precise retrieval
Trade-offs:
- Smaller chunks: More precise retrieval, more chunks to manage
- Larger chunks: More context per chunk, less precise
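
A minimal word-based chunker with overlap might look like this; production systems typically count tokens and respect document structure instead:

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    # Chunks of `size` words, each sharing `overlap` words with its
    # predecessor so context is not cut off mid-thought.
    words = text.split()
    chunks = []
    step = size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return chunks

doc = "lorem ipsum " * 300       # ~600 words of filler
print(len(chunk_words(doc)))     # 4 overlapping chunks
```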
Best Practices
Choose the Right Model
For general use:
- OpenAI text-embedding-3-small (balances cost and performance)

For higher accuracy:
- OpenAI text-embedding-3-large
- BAAI/bge-large-en-v1.5

For data privacy:
- Open-source models (HuggingFace)
- Self-hosted options via Xinference
Optimize Chunking
- Match chunk size to typical query length
- Consider document structure (paragraphs, sections)
- Test different chunk sizes for your use case
- Use overlap to maintain context
Monitor Performance
- Track retrieval quality
- Monitor search latency
- Evaluate storage requirements
- Adjust based on usage patterns
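
One lightweight way to track retrieval quality is a hit-rate check over a small hand-labeled query set. The `search` function here is a hypothetical stand-in for your knowledge base's retrieval call:

```python
def hit_rate(search, test_cases: list[tuple[str, str]], k: int = 3) -> float:
    # Fraction of queries whose expected snippet appears in the top-k results.
    hits = 0
    for query, expected_snippet in test_cases:
        results = search(query, top_k=k)
        hits += any(expected_snippet in r for r in results)
    return hits / len(test_cases)

# cases = [("how do I reset my password", "Settings > Security"), ...]
# print(hit_rate(search, cases, k=3))
```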
Troubleshooting
Poor Search Results
Check:
- Embedding model quality
- Chunk size and strategy
- Query formulation
- Document content quality
Try:
- Using a higher-quality model
- Adjusting chunk size
- Improving document formatting
- Adding more relevant documents
High Latency
Optimize:
- Use smaller/faster models
- Reduce embedding dimensions
- Implement caching
- Consider vector database optimization
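
Caching is often the cheapest win, since repeated texts (common queries, re-indexed documents) pay the provider round trip only once. A minimal sketch, with `embed_remote` standing in for the real provider call:

```python
import hashlib

_cache: dict[str, list[float]] = {}

def embed_remote(text: str) -> list[float]:
    # Placeholder for the real provider call (e.g., client.embeddings.create).
    return [float(len(text))]

def embed_cached(text: str) -> list[float]:
    # Identical texts hit the provider only once; repeats are free.
    key = hashlib.sha256(text.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = embed_remote(text)
    return _cache[key]
```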
Storage Issues
Solutions:
- Use lower-dimensional embeddings
- Implement efficient chunking
- Clean up unused embeddings
- Consider storage compression
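
A quick back-of-the-envelope estimate shows why dimensions dominate storage: vectors are typically stored as float32, so size scales linearly with dimension count:

```python
def storage_gb(num_chunks: int, dimensions: int, bytes_per_value: int = 4) -> float:
    # float32 vectors: 4 bytes per dimension, before index overhead.
    return num_chunks * dimensions * bytes_per_value / 1024**3

print(storage_gb(1_000_000, 3072))  # ~11.4 GB
print(storage_gb(1_000_000, 768))   # ~2.9 GB
```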
Integration with Knowledge Base
Embedding models work seamlessly with Xagent’s knowledge base:
1. Upload documents to the knowledge base
2. Documents are automatically indexed using the embedding model
3. Semantic search is powered by the embeddings
4. Relevant context is retrieved during task execution
Embedding models are required for knowledge base functionality. Configure one before uploading documents.
Next Steps
- Knowledge Base Overview - Learn about knowledge features
- LLM Models - Configure language models
- Vision LLMs - Configure vision models
- Model Overview - Understanding all model types