Ingest

POST Ingest Text Document

Ingest a text document with optional metadata. The text is chunked and indexed for semantic search.

Parameters:

  • content: Text content to ingest

  • metadata: (Optional) Dictionary of metadata

  • rules: (Optional) List of processing rules to apply. Each rule can be:

    • Metadata extraction rule with a JSON schema

    • Natural language rule with a transformation prompt
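
As a rough illustration of how the three parameters above might be assembled into a request body, here is a minimal sketch. The helper name `build_text_ingest_payload`, the use of a JSON body, and the rule keys (`type`, `schema`, `prompt`) are assumptions made for the example; only `content`, `metadata`, and `rules` come from this page.

```python
import json
from typing import Any, Optional


def build_text_ingest_payload(
    content: str,
    metadata: Optional[dict[str, Any]] = None,
    rules: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Assemble a request body from the three documented parameters."""
    return {
        "content": content,
        "metadata": metadata or {},  # optional, defaults to an empty dict
        "rules": rules or [],        # optional, defaults to no rules
    }


# Hypothetical rule encodings: the "type", "schema", and "prompt" keys are
# illustrative assumptions, not documented field names.
metadata_rule = {
    "type": "metadata_extraction",
    "schema": {"type": "object", "properties": {"title": {"type": "string"}}},
}
language_rule = {"type": "natural_language", "prompt": "Remove all email addresses."}

payload = build_text_ingest_payload(
    "Quarterly revenue grew 12%.",
    metadata={"department": "finance"},
    rules=[metadata_rule, language_rule],
)
body = json.dumps(payload)  # JSON string that could be sent as the POST body
```

Keeping the payload builder separate from the HTTP call makes the request shape easy to inspect before anything is sent.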

Returns: Document object with the following fields:

  • external_id: Unique document identifier

  • content_type: Content type (always "text/plain" for text)

  • filename: Always None for text documents

  • metadata: Combined user-provided and rule-extracted metadata

  • storage_info: Empty for text documents

  • system_metadata: System-managed metadata (created_at, updated_at, version)

  • access_control: Access control lists (readers, writers, admins)

  • chunk_ids: List of chunk identifiers
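
The returned fields can be modeled on the client side roughly as follows. This is a sketch only: the class `Document`, the method `from_response`, the field types, and every value in `sample` are assumptions for illustration; the field names are the ones listed above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Document:
    external_id: str
    content_type: str
    filename: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)
    storage_info: dict[str, Any] = field(default_factory=dict)
    system_metadata: dict[str, Any] = field(default_factory=dict)
    access_control: dict[str, Any] = field(default_factory=dict)
    chunk_ids: list[str] = field(default_factory=list)

    @classmethod
    def from_response(cls, data: dict[str, Any]) -> "Document":
        """Build a Document from a decoded response, ignoring unknown keys."""
        known = set(cls.__dataclass_fields__)
        return cls(**{k: v for k, v in data.items() if k in known})


# Made-up sample values, shaped like the field list above.
sample = {
    "external_id": "doc_123",
    "content_type": "text/plain",
    "filename": None,
    "metadata": {"department": "finance"},
    "storage_info": {},
    "system_metadata": {"created_at": "2024-01-01T00:00:00Z", "version": 1},
    "access_control": {"readers": [], "writers": [], "admins": []},
    "chunk_ids": ["chunk_0", "chunk_1"],
    "unexpected_field": True,  # dropped by from_response
}
doc = Document.from_response(sample)
```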

POST Ingest File Document

Upload and ingest a file document. Supports various file types including PDFs, Word documents, presentations, and more. The file will be processed, chunked, and indexed for semantic search.

Parameters:

  • file: File to ingest (path string, bytes, file object, or Path)

  • filename: Name of the file

  • content_type: (Optional) MIME type; guessed if not provided

  • metadata: (Optional) Dictionary of metadata

  • rules: (Optional) List of processing rules to apply to extracted text
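
The `file` parameter accepts four input forms, and `content_type` is guessed when omitted. One way a client could handle both is sketched below. The helper names are hypothetical, and guessing from the filename extension with a fallback of `application/octet-stream` is an assumption; this page does not say how the guess is actually made.

```python
import io
import mimetypes
from pathlib import Path
from typing import BinaryIO, Optional, Union

FileInput = Union[str, bytes, BinaryIO, Path]


def read_file_input(file: FileInput) -> bytes:
    """Return the raw bytes for any of the four documented input forms."""
    if isinstance(file, bytes):
        return file
    if isinstance(file, (str, Path)):
        return Path(file).read_bytes()  # path string or Path object
    return file.read()  # file object opened in binary mode


def resolve_content_type(filename: str, content_type: Optional[str] = None) -> str:
    """Use the caller's MIME type if given, otherwise guess from the filename."""
    if content_type:
        return content_type
    guessed, _encoding = mimetypes.guess_type(filename)
    # Assumed fallback for unknown extensions.
    return guessed or "application/octet-stream"


data = read_file_input(io.BytesIO(b"%PDF-1.7"))
mime = resolve_content_type("report.pdf")
```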

Returns: Document object with storage information including:

  • All fields from text documents

  • storage_info: Contains bucket and key information for file storage

  • filename: Original filename

  • content_type: MIME type of the file
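
Because `storage_info` is empty for text documents and populated for file documents, a client can use it to tell the two apart. The sketch below assumes the entries are stored under the keys `bucket` and `key`, which is inferred from the description above rather than confirmed; the function name and sample values are made up.

```python
from typing import Any, Optional


def storage_location(document: dict[str, Any]) -> Optional[tuple[str, str]]:
    """Return (bucket, key) for a file document, or None if storage_info is empty."""
    info = document.get("storage_info") or {}
    if "bucket" in info and "key" in info:  # assumed key names
        return info["bucket"], info["key"]
    return None


# Made-up sample documents.
text_doc = {"external_id": "doc_text", "filename": None, "storage_info": {}}
file_doc = {
    "external_id": "doc_file",
    "filename": "report.pdf",
    "content_type": "application/pdf",
    "storage_info": {"bucket": "example-bucket", "key": "uploads/report.pdf"},
}

text_location = storage_location(text_doc)
file_location = storage_location(file_doc)
```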
