DataBridge Docs

Endpoints


Last updated 3 months ago

DataBridge's API is organized into several logical paths:

Ingest endpoints handle document ingestion and management:

  • Text document ingestion

  • File document ingestion

  • Document metadata management

Search endpoints provide semantic search functionality:

  • Document chunk search

  • Document listing and retrieval

  • Document metadata search

Query endpoints provide AI-powered query operations:

  • AI completions with context

  • RAG-based question answering

  • Context-aware responses
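As a rough illustration of what "completions with context" typically involves in a RAG setup, the sketch below joins the highest-scoring retrieved chunks into a prompt. It is not DataBridge's implementation: `Chunk`, `build_context`, `build_prompt`, the character budget, and the prompt layout are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """Hypothetical stand-in with two of the fields a search hit carries."""
    content: str
    score: float


def build_context(chunks: List[Chunk], max_chars: int = 2000) -> str:
    """Join the highest-scoring chunks until the character budget is used up."""
    parts: List[str] = []
    used = 0
    # Visit chunks from most to least relevant.
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        if used + len(chunk.content) > max_chars:
            break
        parts.append(chunk.content)
        used += len(chunk.content)
    return "\n\n".join(parts)


def build_prompt(question: str, chunks: List[Chunk]) -> str:
    """Place the retrieved context ahead of the question (illustrative layout)."""
    return f"Context:\n{build_context(chunks)}\n\nQuestion: {question}"
```

A fixed character budget is the simplest way to cap prompt size; a token-based budget would be more precise but needs a tokenizer.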

Cache endpoints are coming soon.

Response Models are the common data models used across the API:

  • Document models

  • Search result models

  • Query response models

Response Models

Document

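# Typing imports used by the models on this page
from typing import Any, Dict, List, Literal, Optional

class Document: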
    external_id: str
    owner: Dict[str, str]
    content_type: str
    filename: Optional[str]
    metadata: Dict[str, Any]  # user-defined metadata
    storage_info: Dict[str, str]  # storage backend info
    system_metadata: Dict[str, Any]  # creation date, version, etc.
    additional_metadata: Dict[str, Any]  # e.g., frame descriptions and transcripts for videos
    access_control: Dict[str, List[str]]  # readers, writers, admins
    chunk_ids: List[str]
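The `access_control` field maps role names to lists of identifiers. A minimal sketch of how a client might check it follows. It assumes the keys are literally `readers`, `writers`, and `admins` (taken from the comment above) and that writers can also read and admins can do everything; that hierarchy, along with the names `ROLE_KEYS` and `is_allowed`, is an assumption of this example and not documented behavior.

```python
from typing import Dict, List

# Assumed mapping from an action to the access_control keys that grant it.
ROLE_KEYS = {
    "read": ("readers", "writers", "admins"),
    "write": ("writers", "admins"),
    "admin": ("admins",),
}


def is_allowed(access_control: Dict[str, List[str]], user_id: str, action: str) -> bool:
    """Return True if user_id appears in any list that grants the action."""
    return any(user_id in access_control.get(key, []) for key in ROLE_KEYS[action])


# Example access_control value shaped like the Document field
acl = {"readers": ["alice"], "writers": ["bob"], "admins": ["carol"]}
print(is_allowed(acl, "alice", "read"))
print(is_allowed(acl, "alice", "write"))
```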

ChunkResult

class ChunkResult:
    content: str
    score: float
    document_id: str  # external_id
    chunk_number: int
    metadata: Dict[str, Any]
    content_type: str
    filename: Optional[str]
    download_url: Optional[str]

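    # DocumentResult is defined below, so the annotation is quoted as a forward reference
    def augmented_content(self, doc: "DocumentResult") -> str:
        """Return augmented content for video chunks, including frame and transcript info."""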

DocumentResult

class DocumentResult:
    score: float  # Highest chunk score
    document_id: str  # external_id
    metadata: Dict[str, Any]
    content: DocumentContent  # type and value fields
    additional_metadata: Dict[str, Any]  # e.g., frame descriptions and transcripts
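The `score` comment says a document's score is its highest chunk score. The sketch below shows one way that roll-up could be computed from chunk-level hits. It works on plain `(document_id, score)` pairs instead of the real models, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def best_score_per_document(hits: List[Tuple[str, float]]) -> Dict[str, float]:
    """Map each document_id to the highest score among its chunks."""
    best: Dict[str, float] = {}
    for document_id, score in hits:
        if document_id not in best or score > best[document_id]:
            best[document_id] = score
    return best


def rank_documents(hits: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Order documents by their best chunk score, highest first."""
    best = best_score_per_document(hits)
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


hits = [("doc-a", 0.42), ("doc-b", 0.91), ("doc-a", 0.77)]
print(rank_documents(hits))  # [('doc-b', 0.91), ('doc-a', 0.77)]
```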

DocumentContent

class DocumentContent:
    type: Literal["url", "string"]  # Content type
    value: str  # URL or actual content
    filename: Optional[str]  # Required for URL type, None for string type
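The comments above imply a rule: `filename` is required when `type` is `"url"` and is `None` when `type` is `"string"`. A minimal sketch of enforcing and using that rule on the client side might look like this. The dataclass is a local mirror of the model for illustration, and `describe` is a hypothetical helper; neither is the library's own code.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class DocumentContent:
    type: Literal["url", "string"]
    value: str
    filename: Optional[str] = None

    def __post_init__(self) -> None:
        # Enforce the filename rule stated in the model's comments.
        if self.type == "url" and not self.filename:
            raise ValueError("filename is required when type is 'url'")
        if self.type == "string" and self.filename is not None:
            raise ValueError("filename must be None when type is 'string'")


def describe(content: DocumentContent) -> str:
    """Summarize how the content would be obtained."""
    if content.type == "url":
        return f"download {content.filename} from {content.value}"
    return f"inline text ({len(content.value)} chars)"


print(describe(DocumentContent("string", "hello")))  # inline text (5 chars)
```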

CompletionResponse

class CompletionResponse:
    completion: str
    usage: TokenUsage  # completion_tokens, prompt_tokens, total_tokens
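`TokenUsage` is not spelled out on this page beyond its three field names. Assuming those fields are integers, a sketch of totaling usage across several completions could look like the following. The dataclass here is a local stand-in, and `sum_usage` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TokenUsage:
    # Field names come from the comment on CompletionResponse.usage; int types are assumed.
    completion_tokens: int
    prompt_tokens: int
    total_tokens: int


def sum_usage(usages: Iterable[TokenUsage]) -> TokenUsage:
    """Add up each counter across a sequence of usage records."""
    total = TokenUsage(0, 0, 0)
    for usage in usages:
        total.completion_tokens += usage.completion_tokens
        total.prompt_tokens += usage.prompt_tokens
        total.total_tokens += usage.total_tokens
    return total


session = [TokenUsage(10, 90, 100), TokenUsage(5, 45, 50)]
print(sum_usage(session).total_tokens)  # 150
```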
