Save Collection Config

curl --request POST \
  --url https://api.example.com/api/kb/collections/{collection}/config \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "deepdoc_processing_mode": "<string>",
  "deepdoc_parallel_threads": 2,
  "deepdoc_reserve_cpu": 1,
  "deepdoc_limiter_capacity": 2,
  "deepdoc_pipeline_monitor": true,
  "deepdoc_pipeline_s1_workers": 2,
  "deepdoc_gpu_sessions": 1,
  "embedding_base_url": "<string>",
  "embedding_api_key": "<string>",
  "embedding_timeout_sec": 1,
  "parse_method": "default",
  "chunk_strategy": "recursive",
  "chunk_method": "<string>",
  "chunk_size": 1000,
  "chunk_overlap": 200,
  "headers_to_split_on": [
    [
      "<string>",
      "<string>"
    ]
  ],
  "separators": [
    "<string>"
  ],
  "use_token_count": false,
  "tiktoken_encoding": "cl100k_base",
  "enable_protected_content": true,
  "protected_patterns": [
    "<string>"
  ],
  "table_context_size": 0,
  "image_context_size": 0,
  "embedding_model_id": "<string>",
  "collection_locked": false,
  "allow_mixed_parse_methods": false,
  "skip_config_validation": false,
  "embedding_batch_size": 10,
  "embedding_concurrent": 10,
  "embedding_use_async": false,
  "max_retries": 3,
  "retry_delay": 1
}
'

{
  "status": "<string>",
  "collection": "<string>",
  "message": "<string>",
  "warnings": [
    "<string>"
  ],
  "affected_documents": [
    {
      "doc_id": "<string>",
      "status": "pending",
      "message": "<string>"
    }
  ],
  "deleted_counts": {}
}

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

collection

string

required
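
As a rough sketch of how the authorization header and path parameter fit together, the following Python builds (but does not send) the request using only the standard library. The `build_request` helper and the sample collection name are illustrative assumptions, not part of the API, and the host is the placeholder from the sample request. The collection name is percent-encoded on the assumption that it may contain characters that are unsafe in a URL path.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host from the sample request


def build_request(collection: str, token: str, config: dict) -> urllib.request.Request:
    """Build (without sending) a POST request for the collection config endpoint."""
    # Percent-encode the collection name; safe="" also encodes "/".
    path = f"/api/kb/collections/{urllib.parse.quote(collection, safe='')}/config"
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(config).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "my docs" is a hypothetical collection name used only for illustration.
req = build_request("my docs", "<token>", {"chunk_size": 1000, "chunk_overlap": 200})
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out here because the host is a placeholder.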

Body

application/json

Configuration values for the document ingestion pipeline.

deepdoc_processing_mode

string | null

DeepDoc processing mode (e.g., 'pipeline', 'default').

deepdoc_parallel_threads

integer | null

DeepDoc parallel threads (DEEPDOC_PARALLEL_THREADS).

Required range: x >= 1

deepdoc_reserve_cpu

integer | null

DeepDoc reserved CPU cores (DEEPDOC_RESERVE_CPU).

Required range: x >= 0

deepdoc_limiter_capacity

integer | null

DeepDoc CapacityLimiter capacity (DEEPDOC_LIMITER_CAPACITY).

Required range: x >= 1

deepdoc_pipeline_monitor

boolean | null

Enable DeepDoc pipeline monitor (DEEPDOC_PIPELINE_MONITOR).

deepdoc_pipeline_s1_workers

integer | null

DeepDoc S1 worker count (DEEPDOC_PIPELINE_S1_WORKERS).

Required range: x >= 1

deepdoc_gpu_sessions

integer | null

DeepDoc GPU sessions count/preference (DEEPDOC_GPU_SESSIONS).

Required range: x >= 0
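
The DeepDoc integer fields above each carry a lower bound, so a client may want to check them before sending a request. The following is a minimal sketch of such a check; the `check_deepdoc_ranges` helper is hypothetical, and the server remains the authority on what it accepts.

```python
# Lower bounds copied from the "Required range" lines of the DeepDoc fields.
DEEPDOC_MINIMUMS = {
    "deepdoc_parallel_threads": 1,
    "deepdoc_reserve_cpu": 0,
    "deepdoc_limiter_capacity": 1,
    "deepdoc_pipeline_s1_workers": 1,
    "deepdoc_gpu_sessions": 0,
}


def check_deepdoc_ranges(config: dict) -> list[str]:
    """Return a list of range violations (empty when everything is in range)."""
    problems = []
    for field, minimum in DEEPDOC_MINIMUMS.items():
        value = config.get(field)
        if value is None:  # these fields are nullable, so omitted/null is fine
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < minimum:
            problems.append(f"{field} must be an integer >= {minimum}, got {value!r}")
    return problems


print(check_deepdoc_ranges({"deepdoc_parallel_threads": 0, "deepdoc_gpu_sessions": 1}))
```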

embedding_base_url

string | null

Override DashScope base URL for embedding requests.

embedding_api_key

string | null

Override DashScope API key for embedding requests.

embedding_timeout_sec

number | null

Override embedding request timeout (seconds).

Required range: x > 0

parse_method

enum<string>

default:default

Parse method used during the parse_document step.

Available options: default, pypdf, pdfplumber, unstructured, pymupdf, deepdoc

chunk_strategy

enum<string>

default:recursive

Chunk strategy passed to chunk_document

Available options: recursive, fixed_size, markdown

chunk_method

string | null

Custom chunk method identifier. If provided, takes precedence over chunk_strategy
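
The precedence between chunk_method and chunk_strategy can be sketched as below. This only illustrates the rule as documented; the `effective_chunker` helper and the `my_custom_chunker` identifier are hypothetical, and treating a null chunk_method as "not provided" is an assumption.

```python
# Values listed under "Available options" for chunk_strategy.
CHUNK_STRATEGIES = {"recursive", "fixed_size", "markdown"}


def effective_chunker(config: dict) -> str:
    """Return the chunker identifier that applies under the documented precedence."""
    strategy = config.get("chunk_strategy", "recursive")  # documented default
    if strategy not in CHUNK_STRATEGIES:
        raise ValueError(f"unknown chunk_strategy: {strategy!r}")
    method = config.get("chunk_method")
    # chunk_method, when provided, takes precedence over chunk_strategy.
    return method if method is not None else strategy


print(effective_chunker({"chunk_strategy": "markdown"}))
print(effective_chunker({"chunk_strategy": "markdown", "chunk_method": "my_custom_chunker"}))
```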

chunk_size

integer | null

default:1000

Chunk size passed to chunk_document; must be a positive integer. If None, semantic splitting is used without size limits.

Required range: x > 0

chunk_overlap

integer

default:200

Chunk overlap passed to chunk_document; must be non-negative

Required range: x >= 0
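
To make the interaction between chunk_size and chunk_overlap concrete, here is a small sliding-window sketch in which each chunk starts `chunk_size - chunk_overlap` characters after the previous one. It is not the service's chunk_document implementation, and the requirement that the overlap be smaller than the chunk size is an assumption of this sketch rather than a documented constraint.

```python
def fixed_size_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if chunk_overlap < 0:
        raise ValueError("chunk_overlap must be non-negative")
    if chunk_overlap >= chunk_size:
        # Assumption of this sketch: otherwise the window would never advance.
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):  # the last window reached the end
            break
    return chunks


print(fixed_size_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```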

headers_to_split_on

tuple[] | null

Header split rules for the markdown strategy.
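
A request body fragment for the markdown strategy might look like the sketch below. The reference only states that each entry is a pair of strings; reading the pair as a header marker followed by a label for that level is an assumption here, and the labels `h1`, `h2`, and `h3` are made up for illustration.

```python
import json

markdown_config = {
    "chunk_strategy": "markdown",
    # Assumed pair layout: [header marker, label for that header level].
    "headers_to_split_on": [["#", "h1"], ["##", "h2"], ["###", "h3"]],
}

# JSON has no tuple type, so each pair is serialized as a two-element array.
body = json.dumps(markdown_config)
print(body)
```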

separators

string[] | null

Custom separators for recursive/markdown strategies

use_token_count

boolean

default:false

If True, chunk_size and chunk_overlap are measured in tokens (counted with tiktoken); applies only to the recursive strategy.

tiktoken_encoding

string

default:cl100k_base

tiktoken encoding name when use_token_count=True (e.g. cl100k_base for GPT-4/3.5). Should align with config.DEFAULT_TIKTOKEN_ENCODING.

enable_protected_content

boolean

default:true

If True, do not split inside code blocks, formulas, or tables (P1).

protected_patterns

string[] | null

Optional regex patterns for protected regions; None uses config default.
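
The idea of protected regions can be illustrated with Python's `re` module: find the spans that match a pattern, then refuse to split inside them. The two patterns below are hypothetical examples, not the service's defaults, and the regex flavor and flags the service applies are not documented here, so treat this purely as a sketch.

```python
import re

# Hypothetical patterns; the config defaults are not listed in this reference.
PROTECTED_PATTERNS = [
    r"```.*?```",    # fenced code block
    r"\$\$.*?\$\$",  # display formula
]


def protected_spans(text: str, patterns: list[str]) -> list[tuple[int, int]]:
    """Return sorted (start, end) spans of every protected region in text."""
    spans = []
    for pattern in patterns:
        # DOTALL lets ".*?" run across newlines inside a block.
        for match in re.finditer(pattern, text, flags=re.DOTALL):
            spans.append(match.span())
    return sorted(spans)


def is_safe_split(position: int, spans: list[tuple[int, int]]) -> bool:
    """True when position does not fall strictly inside a protected span."""
    return not any(start < position < end for start, end in spans)


sample = "intro\n```\nx = 1\n```\nouter $$a+b$$ end"
spans = protected_spans(sample, PROTECTED_PATTERNS)
print([sample[start:end] for start, end in spans])
```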

table_context_size

integer

default:0

Number of characters from the previous and next chunks to attach to table chunks; 0 disables this (P2).

Required range: x >= 0

image_context_size

integer

default:0

Number of characters from the previous and next chunks to attach to image chunks; 0 disables this (P2).

Required range: x >= 0

embedding_model_id

string | null

Embedding model identifier registered in AgentOS model hub. If omitted, the pipeline attempts to auto-detect a single available embedding model.

collection_locked

boolean

default:false

Whether to lock collection configuration. When True, enforces strict config validation.

allow_mixed_parse_methods

boolean

default:false

Whether to allow mixed parse methods within the collection. When False, enforces type-based parse method consistency.

skip_config_validation

boolean

default:false

Skip collection configuration validation. Use with caution.

embedding_batch_size

integer

default:10

Batch size for embedding provider requests; must be positive
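
As an illustration of what embedding_batch_size controls, the sketch below groups a list of chunk texts into batches of at most that size, one batch per provider request. The `batches` helper is hypothetical and says nothing about how the service batches internally.

```python
from typing import Iterator, Sequence


def batches(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of items with at most batch_size elements each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = [f"chunk {i}" for i in range(25)]
print([len(batch) for batch in batches(texts, 10)])
# [10, 10, 5]
```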

embedding_concurrent

integer

default:10

Maximum concurrent requests for embedding computation when using async mode (for models that don't support batch processing, e.g., text-embedding-v4). Must be positive. Adjust based on machine configuration and API rate limits.

embedding_use_async

boolean

default:false

Whether to use async concurrent processing for embeddings. Set to True for models that don't support batch processing (e.g., text-embedding-v4). When True, embeddings are processed concurrently using asyncio instead of batch API calls.
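
The combination of embedding_use_async and embedding_concurrent describes bounded concurrency: many single-text requests in flight, but never more than a fixed number at once. A minimal asyncio sketch of that pattern follows. `fake_embed` is a stand-in that returns a dummy vector, not a real provider call, and the semaphore-based limit is one common way to do this rather than the service's confirmed implementation.

```python
import asyncio


async def fake_embed(text: str) -> list[float]:
    """Stand-in for a per-text embedding call; returns a dummy one-element vector."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [float(len(text))]


async def embed_all(texts: list[str], embedding_concurrent: int = 10) -> list[list[float]]:
    """Embed every text with at most embedding_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(embedding_concurrent)

    async def limited(text: str) -> list[float]:
        async with semaphore:  # waits while the limit is reached
            return await fake_embed(text)

    # gather returns results in the same order as its inputs.
    return await asyncio.gather(*(limited(text) for text in texts))


vectors = asyncio.run(embed_all(["a", "bb", "ccc"], embedding_concurrent=2))
print(vectors)
# [[1.0], [2.0], [3.0]]
```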

max_retries

integer

default:3

Maximum number of retries for embedding provider failures; must be non-negative

Required range: x >= 0

retry_delay

number

default:1

Delay in seconds between embedding retries; must be non-negative

Required range: x >= 0
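
The retry settings can be read as "try once, then retry up to max_retries times, pausing retry_delay seconds between attempts". The sketch below implements that reading with a fixed delay; whether the service uses a fixed delay or some form of backoff is not stated in this reference, so both the helper and that interpretation are assumptions.

```python
import time


def call_with_retries(func, max_retries: int = 3, retry_delay: float = 1.0):
    """Call func(), retrying on any exception up to max_retries additional times."""
    if max_retries < 0 or retry_delay < 0:
        raise ValueError("max_retries and retry_delay must be non-negative")
    retries_used = 0
    while True:
        try:
            return func()
        except Exception:
            if retries_used >= max_retries:
                raise  # retries exhausted: surface the last error
            retries_used += 1
            time.sleep(retry_delay)  # fixed pause between attempts


calls = []


def flaky():
    """Fail on the first two calls, then succeed."""
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient failure")
    return "ok"


print(call_with_retries(flaky, max_retries=3, retry_delay=0))
```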

Response

Successful Response

Response payload for collection-level management operations.

status

string

required

Operation status: one of success, partial_success, or error.

collection

string

required

Collection identifier affected by the operation

message

string

required

Human-readable summary of the collection operation

warnings

string[]

Non-fatal issues encountered while processing the collection

affected_documents

CollectionOperationDetail · object[]

Subset of documents impacted by the collection operation

deleted_counts

Deleted Counts · object

Aggregated deletion counts per table when applicable
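
A client will typically branch on the status field and surface any warnings. The sketch below turns a decoded response into a short text summary; the `summarize_response` helper and the sample values (collection name, document id, messages) are made up for illustration and are not real output from the service.

```python
KNOWN_STATUSES = {"success", "partial_success", "error"}


def summarize_response(payload: dict) -> str:
    """Render a decoded response payload as a short multi-line summary."""
    status = payload["status"]  # required field
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    lines = [f"[{status}] {payload['collection']}: {payload['message']}"]
    # warnings and affected_documents are optional, so default to empty.
    for warning in payload.get("warnings") or []:
        lines.append(f"  warning: {warning}")
    for doc in payload.get("affected_documents") or []:
        lines.append(f"  doc {doc.get('doc_id')}: {doc.get('status')}")
    return "\n".join(lines)


# Hypothetical payload shaped like the sample response.
sample_response = {
    "status": "partial_success",
    "collection": "manuals",
    "message": "Config saved",
    "warnings": ["chunk settings changed"],
    "affected_documents": [{"doc_id": "doc-1", "status": "pending", "message": ""}],
    "deleted_counts": {},
}
print(summarize_response(sample_response))
```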
