Image Generation Models
Image generation models enable Xagent to create and modify images through text prompts.
Purpose
Enable Image Tools:
- generate_image - Create images from text descriptions
- edit_image - Modify existing images
- Visual content creation and editing
How it works:
- Accept text prompts describing the desired image
- Generate new images or modify existing ones
- Return image files saved to the workspace
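As an illustration of the final step, saving a returned image to the workspace, here is a minimal sketch. It assumes the provider answers in the OpenAI-style shape {"data": [{"b64_json": "..."}]}. The function name save_b64_images, the file naming scheme, and the .png extension are hypothetical and are not taken from Xagent itself.

```python
import base64
import tempfile
from pathlib import Path


def save_b64_images(response, workspace, stem="image"):
    """Decode each b64_json entry of an OpenAI-style response into a file.

    Hypothetical helper: the ".png" extension is an assumption, since the
    real format depends on what the provider returns.
    """
    out_dir = Path(workspace)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, item in enumerate(response.get("data", [])):
        encoded = item.get("b64_json")
        if encoded is None:
            continue  # this entry carries a URL instead of inline data
        path = out_dir / f"{stem}_{index}.png"
        path.write_bytes(base64.b64decode(encoded))
        saved.append(path)
    return saved


# Stand-in payload; a real response would contain encoded image bytes.
fake_response = {
    "data": [{"b64_json": base64.b64encode(b"not-a-real-png").decode("ascii")}]
}
workspace = tempfile.mkdtemp()  # stands in for the workspace directory
saved_paths = save_b64_images(fake_response, workspace)
print(saved_paths[0].name)
```

Entries that carry a URL rather than inline data are skipped here; downloading them would be a separate step.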
When to Use
Required for:
- Image generation tasks
- Image editing and modification
- Visual content creation
- Design and marketing materials
Not required for:
- Image understanding or analysis (uses Vision LLMs)
- Image OCR or text extraction (uses Vision LLMs)
- Chart/graph analysis (uses Vision LLMs)
Supported Providers
OpenAI & OpenAI-compatible
Models: gpt-image-1 and compatible models
Setup:
- Get an API key from the OpenAI Platform
- For compatible services, provide the base URL and API key
- Select an image model
Best for:
- High-quality image generation
- Creative tasks
- Marketing materials
- Wide compatibility
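To show where the base URL, API key, and model name end up, the sketch below assembles a request for the OpenAI Images API endpoint (POST /v1/images/generations) without sending it. The helper build_generation_request and the placeholder key are hypothetical. How Xagent builds its own requests is not described here, and compatible services may accept a different set of fields.

```python
import json


def build_generation_request(base_url, api_key, model, prompt,
                             size="1024x1024", n=1):
    """Return (url, headers, body) for an OpenAI-style image generation call.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if n < 1:
        raise ValueError("n must be at least 1")
    url = base_url.rstrip("/") + "/images/generations"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "prompt": prompt, "size": size, "n": n})
    return url, headers, body


url, headers, body = build_generation_request(
    base_url="https://api.openai.com/v1",
    api_key="sk-placeholder",  # placeholder, not a real key
    model="gpt-image-1",
    prompt="A watercolor lighthouse at dawn",
)
print(url)
```

For an OpenAI-compatible service, only base_url and api_key would change, assuming the service mirrors the same endpoint path.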
DashScope (Alibaba Cloud)
Models: qwen-image
Setup:
- Get an API key from the DashScope Console
- Configure the API key
- Select a model
Best for:
- Chinese language optimization
- Cost-effective for Asian markets
- Alibaba Cloud integration
Supported formats and limits:
- JPG, JPEG, PNG, BMP, TIFF, WEBP
- Max file size: 10MB for editing
Xinference
Models: Stable Diffusion variants (stable-diffusion-2-1, etc.)
Setup:
- Deploy an Xinference server
- Launch an image generation model
- Configure the base URL in Xagent
- Select a model from Xinference
Best for:
- Self-hosted deployment
- Data privacy
- Cost control
- Using open-source models
Features:
- List available models from the server
- Support for various Stable Diffusion models
- Image-to-image and inpainting capabilities
Configuration
Step 1: Add Image Provider
- Go to Models in the sidebar
- Click Add Model or Add Provider
- Select image generation provider
- Enter API credentials
Step 2: Configure Model
- Select the image model
- Configure parameters:
  - Image size (e.g., 1024x1024, 1792x1024)
  - Response format (url or b64_json)
  - Number of images (n)
- Test generation
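The parameters above can be sanity-checked before a test generation. The following is a minimal sketch, assuming sizes are written as WIDTHxHEIGHT (for example 1024x1024) and that the response format is either url or b64_json. The function names are hypothetical, and the sizes a given model accepts still depend on the provider.

```python
import re

SIZE_PATTERN = re.compile(r"^([0-9]+)x([0-9]+)$")


def parse_size(size):
    """Split a 'WIDTHxHEIGHT' string into a (width, height) tuple of ints."""
    match = SIZE_PATTERN.match(size.strip().lower())
    if match is None:
        raise ValueError(f"expected WIDTHxHEIGHT, got {size!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width == 0 or height == 0:
        raise ValueError("width and height must be positive")
    return width, height


def validate_parameters(size, response_format, n):
    """Check the three parameters and return them in normalized form."""
    if response_format not in ("url", "b64_json"):
        raise ValueError(f"unsupported response format: {response_format!r}")
    if n < 1:
        raise ValueError("n must be at least 1")
    width, height = parse_size(size)
    return {"width": width, "height": height,
            "response_format": response_format, "n": n}


params = validate_parameters("1792x1024", "url", 1)
print(params)
```

A size string with the separator missing, such as 10241024, is rejected by parse_size rather than silently misread.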
Step 3: Set Abilities
Image models can have these abilities:
- generate - Create new images from text
- edit - Modify existing images
Not all image models support both generation and editing. Check provider documentation.
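One way to picture the ability check is as a filter over configured models. The sketch below is purely illustrative: the ImageModel class, the models_with function, and both model names are hypothetical and do not reflect Xagent's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageModel:
    """Hypothetical record of a configured image model and its abilities."""
    name: str
    abilities: frozenset


def models_with(ability, models):
    """Return the names of models that declare the given ability."""
    return [model.name for model in models if ability in model.abilities]


# Made-up catalog entries, used only to exercise the filter.
catalog = [
    ImageModel("generate-only-model", frozenset({"generate"})),
    ImageModel("generate-and-edit-model", frozenset({"generate", "edit"})),
]

print(models_with("edit", catalog))
```

With this catalog, an edit request could only be routed to the second model, while either model could serve a generate request.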
Usage Examples
Generating Images
Editing Images
Multi-Image Editing
Troubleshooting
Generation Failed
Check:
- API key is valid
- Model supports the generate ability
- Prompt follows guidelines
- Prompt does not violate the provider's content policy
Poor Quality
Try:
- Improving prompt specificity
- Adding style instructions
- Using negative prompts
- Trying a different model
- Adjusting image size
Slow Generation
Optimize:
- Reduce image size
- Consider a faster model
- Check network connectivity
Edits Not Working
Verify:
- Model supports edit ability
- Edit instructions are clear
- Original image is accessible
- Image format is supported
- File size within limits (DashScope: 10MB)
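The file-related checks above can be run locally before an edit request is sent. This sketch uses the DashScope figures from this page (JPG, JPEG, PNG, BMP, TIFF, WEBP; 10MB for editing). It assumes that 10MB means 10 * 1024 * 1024 bytes and that the format can be judged from the file extension. The function name check_edit_input is hypothetical.

```python
import tempfile
from pathlib import Path

ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".webp"}
MAX_EDIT_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB


def check_edit_input(path):
    """Return a list of problems; an empty list means the file passes."""
    path = Path(path)
    if not path.is_file():
        return [f"file not found: {path}"]
    problems = []
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {path.suffix or '(none)'}")
    if path.stat().st_size > MAX_EDIT_BYTES:
        problems.append("file exceeds the 10MB editing limit")
    return problems


# Exercise the check with throwaway files; contents are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "photo.png"
    good.write_bytes(b"placeholder bytes")
    bad = Path(tmp) / "animation.gif"
    bad.write_bytes(b"placeholder bytes")
    good_problems = check_edit_input(good)
    bad_problems = check_edit_input(bad)
    missing_problems = check_edit_input(Path(tmp) / "missing.png")

print(good_problems, bad_problems, missing_problems)
```

The check looks only at the extension and size on disk; it does not confirm that the bytes are a valid image or that the selected model has the edit ability.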
File Not Found
Check:
- Image path is correct
- File exists in workspace
- Use workspace file browser to verify path
- URL is accessible
Security & Privacy
Important considerations:
- Generated images are saved to the workspace
- Check the provider’s data retention policy
- Be aware of content policies
- Copyright and usage rights
Best practices:
- Review the provider's content policy
- Ensure you have rights to generated content
- Consider compliance requirements
- Be mindful of copyright
Next Steps
- LLM Models - Configure language models
- Vision LLMs - Configure image understanding models
- Embedding Models - Configure vector embeddings
- Model Overview - Understanding all model types