Ingest - Xagent

curl --request POST \
  --url https://api.example.com/api/kb/ingest \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form file='@example-file' \
  --form 'collection=<string>' \
  --form chunk_size=1 \
  --form chunk_overlap=1 \
  --form 'separators=<string>' \
  --form embedding_model_id=text-embedding-v4 \
  --form embedding_batch_size=1 \
  --form max_retries=1 \
  --form retry_delay=1

{
  "status": "<string>",
  "message": "<string>",
  "doc_id": "<string>",
  "parse_hash": "<string>",
  "chunk_count": 0,
  "embedding_count": 0,
  "vector_count": 0,
  "completed_steps": [
    {
      "name": "<string>",
      "metadata": {}
    }
  ],
  "failed_step": "<string>",
  "warnings": [
    "<string>"
  ],
  "file_id": "<string>"
}
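
The non-file form fields of the curl request above can also be assembled programmatically. The following is a minimal sketch, not part of the API: the helper name `build_ingest_fields` is hypothetical, and the checks simply mirror the option lists and "Required range" notes in the Body section. It assumes `separators` has already been encoded as a JSON string and does not handle the `file` part.

```python
from typing import Dict, Optional

# Option lists copied from the Body section of this page.
PARSE_METHODS = {"default", "pypdf", "pdfplumber", "unstructured", "pymupdf", "deepdoc"}
CHUNK_STRATEGIES = {"recursive", "fixed_size", "markdown"}


def build_ingest_fields(
    collection: Optional[str] = None,
    parse_method: Optional[str] = None,
    chunk_strategy: Optional[str] = None,
    chunk_size: Optional[int] = None,
    chunk_overlap: Optional[int] = None,
    separators: Optional[str] = None,  # assumed to be a JSON-encoded array already
    embedding_model_id: Optional[str] = None,
    embedding_batch_size: Optional[int] = None,
    max_retries: Optional[int] = None,
    retry_delay: Optional[float] = None,
) -> Dict[str, str]:
    """Hypothetical helper: return the non-file form fields, omitting unset ones."""
    if parse_method is not None and parse_method not in PARSE_METHODS:
        raise ValueError(f"unknown parse_method: {parse_method!r}")
    if chunk_strategy is not None and chunk_strategy not in CHUNK_STRATEGIES:
        raise ValueError(f"unknown chunk_strategy: {chunk_strategy!r}")

    # Fields documented with "Required range: x > 0".
    for name, value in (("chunk_size", chunk_size),
                        ("embedding_batch_size", embedding_batch_size)):
        if value is not None and value <= 0:
            raise ValueError(f"{name} must be > 0")

    # Fields documented with "Required range: x >= 0".
    for name, value in (("chunk_overlap", chunk_overlap),
                        ("max_retries", max_retries),
                        ("retry_delay", retry_delay)):
        if value is not None and value < 0:
            raise ValueError(f"{name} must be >= 0")

    raw = {
        "collection": collection,
        "parse_method": parse_method,
        "chunk_strategy": chunk_strategy,
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
        "separators": separators,
        "embedding_model_id": embedding_model_id,
        "embedding_batch_size": embedding_batch_size,
        "max_retries": max_retries,
        "retry_delay": retry_delay,
    }
    # Multipart form values are sent as text, so stringify what is set.
    return {key: str(value) for key, value in raw.items() if value is not None}
```

The returned dictionary could then be passed as the form data of any HTTP client that supports multipart uploads, alongside the file part and the Bearer header.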

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

multipart/form-data

file

required

collection

string

parse_method

enum<string> | null

Parser used during ingestion.

Available options: default, pypdf, pdfplumber, unstructured, pymupdf, deepdoc

chunk_strategy

enum<string> | null

Chunking strategy (default: recursive).

Available options: recursive, fixed_size, markdown

chunk_size

integer | null

Chunk size in characters (default: 1000)

Required range: x > 0

chunk_overlap

integer | null

Chunk overlap in characters (default: 200)

Required range: x >= 0
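
To make the relationship between chunk_size and chunk_overlap concrete, here is a small illustration of fixed-size chunking with overlap. It is only a sketch of the general technique: the function name `fixed_size_chunks` is hypothetical, the service's actual chunker is not documented on this page, and the requirement that the overlap be smaller than the chunk size is an assumption of this sketch rather than a documented rule.

```python
from typing import List


def fixed_size_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Illustrative sliding-window chunker; sizes are measured in characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be > 0")
    if chunk_overlap < 0:
        raise ValueError("chunk_overlap must be >= 0")
    if chunk_overlap >= chunk_size:
        # Assumption of this sketch: otherwise the window would never advance.
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new chunk starts after the last
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks
```

With a chunk size of 4 and an overlap of 2, each chunk repeats the last two characters of the previous one.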

separators

string | null

Custom chunk separators, given as a JSON array of strings, e.g. ["\n\n", "\n", "。"]. Only used when chunk_strategy is recursive. Omit the field or leave it empty to use the default separators.
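
Because separators is a string field that carries a JSON array, the list has to be serialized before it is placed in the form. A minimal sketch using the standard library follows; the helper name `encode_separators` is hypothetical.

```python
import json
from typing import List


def encode_separators(separators: List[str]) -> str:
    """Serialize a list of separator strings into the JSON text the field expects."""
    if not all(isinstance(sep, str) for sep in separators):
        raise TypeError("every separator must be a string")
    # ensure_ascii=False keeps characters such as "。" readable instead of \u-escaped;
    # both forms decode to the same JSON value.
    return json.dumps(separators, ensure_ascii=False)
```

Newlines in the list are written as the two-character escape \n in the resulting JSON text, which matches the example in the field description.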

embedding_model_id

string

default:text-embedding-v4

Embedding model ID (default: text-embedding-v4)

embedding_batch_size

integer | null

Batch size for embedding (default: 10)

Required range: x > 0
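
As an illustration of what a batch size means here, the sketch below groups chunks into batches of at most embedding_batch_size items. It assumes that chunks are embedded in consecutive groups; the function name `batched_chunks` is hypothetical and the service's internal batching is not documented on this page.

```python
from typing import List, Sequence


def batched_chunks(chunks: Sequence[str], batch_size: int = 10) -> List[List[str]]:
    """Split chunks into consecutive batches; the last batch may be smaller."""
    if batch_size <= 0:
        raise ValueError("batch_size must be > 0")
    return [list(chunks[i:i + batch_size]) for i in range(0, len(chunks), batch_size)]
```

Under this assumption, 25 chunks with the default batch size of 10 would be processed as three batches of 10, 10, and 5 chunks.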

max_retries

integer | null

Maximum retries for embedding failures (default: 3)

Required range: x >= 0

retry_delay

number | null

Delay between retries in seconds (default: 1.0)

Required range: x >= 0
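
The interplay of max_retries and retry_delay can be pictured with a simple retry loop. This is a sketch under stated assumptions, not the service's implementation: it assumes that max_retries counts attempts after the first one and that the delay between attempts is constant. The name `call_with_retries` and the injectable `sleep` parameter are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_retries: int = 3,
    retry_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying up to max_retries times with a fixed delay in seconds."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    if retry_delay < 0:
        raise ValueError("retry_delay must be >= 0")

    retries_used = 0
    while True:
        try:
            return fn()
        except Exception:
            if retries_used >= max_retries:
                raise  # retry budget exhausted: surface the last error
            retries_used += 1
            sleep(retry_delay)
```

With these assumptions, max_retries=0 means a single attempt with no retry, and the defaults allow up to four attempts in total.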

Response

Successful Response

Structured response for the document ingestion pipeline.

status

string

required

Pipeline status: one of success, error, or partial

message

string

required

Human-readable summary of pipeline result

doc_id

string | null

Document identifier produced by register_document

parse_hash

string | null

Parse hash produced during parse_document step

chunk_count

integer

default:0

Number of chunks created; must be non-negative

Required range: x >= 0

embedding_count

integer

default:0

Number of embeddings generated; must be non-negative

Required range: x >= 0

vector_count

integer

default:0

Number of vectors written to storage; must be non-negative

Required range: x >= 0

completed_steps

IngestionStepResult · object[]

List of successfully completed steps

failed_step

string | null

Pipeline step where failure occurred, if any

warnings

string[]

Non-fatal warnings encountered

file_id

string | null

Uploaded file ID for preview/download via /api/files (when ingest registers the file)
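
A client will typically want to inspect status, failed_step, and warnings together. The sketch below turns a decoded response into a short report. The function name `summarize_ingest_response` is hypothetical, and the count comparison assumes one embedding and one vector per chunk, which this page does not state.

```python
from typing import Any, Dict, List


def summarize_ingest_response(resp: Dict[str, Any]) -> str:
    """Build a human-readable summary from a decoded ingest response."""
    # status and message are the two required fields.
    lines: List[str] = [f"{resp['status']}: {resp['message']}"]

    completed = [str(step.get("name")) for step in resp.get("completed_steps") or []]
    if resp.get("failed_step"):
        done = ", ".join(completed) or "none"
        lines.append(f"failed at step: {resp['failed_step']} (completed: {done})")

    # The three counters default to 0 when absent.
    chunks = resp.get("chunk_count", 0)
    embeddings = resp.get("embedding_count", 0)
    vectors = resp.get("vector_count", 0)
    if not (chunks == embeddings == vectors):
        # Assumption: a fully successful run yields equal counts.
        lines.append(f"counts differ: chunks={chunks} embeddings={embeddings} vectors={vectors}")

    for warning in resp.get("warnings") or []:
        lines.append(f"warning: {warning}")
    return "\n".join(lines)
```

Nullable fields such as doc_id, parse_hash, and file_id are left out of the summary; a caller that needs them should check for null before use.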
