Overview

DataBridge provides a RESTful API for document operations. While most users will interact through our Python SDK, you can use the API directly from any language.

Base URL

For local development:

http://localhost:8000

Authentication

Note: If you're running DataBridge in development mode (dev_mode=true in databridge.toml), authentication is not required and you can skip this section.

In production mode, all requests require a JWT token in the Authorization header:

Authorization: Bearer eyJhbGciOiJS...

To get your authentication token:

  1. Visit the API documentation at http://localhost:8000/docs

  2. Find the /local/generate_uri endpoint

  3. Use the endpoint to generate a URI containing your token

  4. Extract the token from the URI: it is the part between the colon that follows owner_id and the @ sign

The generated URI will be in the format: databridge://owner_id:token@host:port
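
The extraction in step 4 can be sketched in Python with the standard library. The helper name auth_header_from_uri and the sample URI values are illustrative assumptions, not part of DataBridge; only the URI format comes from the description above.

```python
from urllib.parse import urlsplit


def auth_header_from_uri(uri: str) -> dict:
    """Build an Authorization header from a databridge:// URI (hypothetical helper)."""
    parts = urlsplit(uri)
    # For owner_id:token@host:port, urlsplit exposes owner_id as
    # `username` and the token as `password`.
    if parts.scheme != "databridge" or not parts.password:
        raise ValueError("expected databridge://owner_id:token@host:port")
    return {"Authorization": f"Bearer {parts.password}"}


# Placeholder values, not real credentials.
headers = auth_header_from_uri("databridge://owner_1:abc.def.ghi@localhost:8000")
print(headers)
```

The resulting dictionary can be passed as request headers by whichever HTTP client you use.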

Core Operations

Document Management

  • POST /ingest/text - Ingest text documents

  • POST /ingest/file - Ingest file documents

  • GET /documents - List documents with pagination and filtering

    • Query Parameters:

      • skip (optional): Number of documents to skip (default: 0)

      • limit (optional): Maximum number of documents to return (default: 100)

      • filters (optional): Metadata filters in JSON format

  • GET /documents/{id} - Get document details
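
As a rough sketch, the query string for GET /documents could be assembled as follows. The helper name is hypothetical, and passing filters as one JSON-encoded query parameter is an assumption based on the "JSON format" wording above; check the interactive docs at /docs for the exact encoding.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # local development base URL


def list_documents_url(skip=0, limit=100, filters=None):
    """Build a GET /documents URL (hypothetical helper)."""
    params = {"skip": skip, "limit": limit}
    if filters:
        # Assumption: filters travel as a single JSON-encoded parameter.
        params["filters"] = json.dumps(filters)
    return f"{BASE_URL}/documents?{urlencode(params)}"


# Second page of ten documents tagged with category "tech".
url = list_documents_url(skip=10, limit=10, filters={"category": "tech"})
print(url)
```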

Search & Retrieval

  • POST /query - Generate AI completions using relevant document context

    • Parameters:

      • query: The question or prompt

      • filters (optional): Metadata filters

      • k (optional): Number of chunks to use as context (default: 4)

      • min_score (optional): Minimum similarity score (default: 0.0)

      • max_tokens (optional): Maximum tokens in completion

      • temperature (optional): Sampling temperature for completion

  • POST /retrieve/chunks - Search for relevant document chunks

    • Parameters:

      • query: The search query

      • filters (optional): Metadata filters

      • k (optional): Number of chunks to return (default: 4)

      • min_score (optional): Minimum similarity score (default: 0.0)

  • POST /retrieve/docs - Search for relevant documents

    • Parameters:

      • Same as /retrieve/chunks
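
Because /retrieve/chunks and /retrieve/docs take the same parameters, one helper can build the request for either. The following is a minimal sketch using only the standard library; build_search_request is a hypothetical name, and the defaults simply mirror the ones listed above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local development base URL


def build_search_request(path, query, token=None, k=4, min_score=0.0, filters=None):
    """Prepare a POST request for /retrieve/chunks or /retrieve/docs."""
    body = {"query": query, "k": k, "min_score": min_score}
    if filters is not None:
        body["filters"] = filters
    headers = {"Content-Type": "application/json"}
    if token:  # the token can be omitted in development mode
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_search_request("/retrieve/docs", "machine learning applications", k=3)
# To send it against a running server:
#   with urllib.request.urlopen(req) as resp:
#       results = json.load(resp)
```

Building the request separately from sending it keeps the example runnable without a server.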

Usage & Telemetry

  • GET /usage/stats - Get usage statistics for the authenticated user

  • GET /usage/recent - Get recent usage records

    • Query Parameters:

      • operation_type (optional): Filter by operation type

      • since (optional): Filter by timestamp

      • status (optional): Filter by operation status

Request/Response Format

All requests and responses use JSON, except for file uploads, which use multipart/form-data.
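
For readers without a multipart-capable HTTP client, the sketch below shows how a multipart/form-data body is put together by hand. The form field names "file" and "metadata" are assumptions made for illustration; this page does not specify them, so confirm the real names in the interactive docs before relying on this.

```python
import json
import uuid


def encode_multipart(fields, file_field, filename, file_bytes, content_type):
    """Return (body, content_type_header) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        # Plain text form fields.
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    # The file part, followed by the closing boundary.
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


# Hypothetical field names; only the endpoint path comes from this page.
body, content_type_header = encode_multipart(
    {"metadata": json.dumps({"category": "tech"})},
    "file", "notes.txt", b"Machine learning is transforming industries...", "text/plain",
)
```

The returned header value goes in the Content-Type header of the POST /ingest/file request.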

Example Text Ingestion Request

POST /ingest/text
Authorization: Bearer eyJhbGciOiJS...
Content-Type: application/json

{
    "content": "Machine learning is transforming industries...",
    "metadata": {
        "title": "ML Overview",
        "category": "tech"
    }
}

Example Query Request

POST /query
Authorization: Bearer eyJhbGciOiJS...
Content-Type: application/json

{
    "query": "What are the main applications of machine learning?",
    "k": 3,
    "filters": {
        "category": "tech"
    },
    "max_tokens": 150,
    "temperature": 0.7
}

Example Query Response

{
    "completion": "Based on the retrieved context, machine learning has several key applications...",
    "usage": {
        "completion_tokens": 45,
        "prompt_tokens": 120,
        "total_tokens": 165
    }
}
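
A small sketch of reading this response in Python follows. The function name is hypothetical, and the check that total_tokens equals the sum of the other two counts is an assumption drawn from the sample values, not a documented guarantee.

```python
import json

# Sample body copied from the example response above.
raw = """
{
    "completion": "Based on the retrieved context, machine learning has several key applications...",
    "usage": {"completion_tokens": 45, "prompt_tokens": 120, "total_tokens": 165}
}
"""


def parse_query_response(text):
    """Return (completion, usage) after a sanity check on the token counts."""
    data = json.loads(text)
    usage = data["usage"]
    # Assumption: total_tokens is prompt_tokens + completion_tokens.
    if usage["prompt_tokens"] + usage["completion_tokens"] != usage["total_tokens"]:
        raise ValueError("token counts do not add up")
    return data["completion"], usage


completion, usage = parse_query_response(raw)
print(f"{usage['total_tokens']} tokens used")
```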

Example Retrieve Chunks Request

POST /retrieve/chunks
Authorization: Bearer eyJhbGciOiJS...
Content-Type: application/json

{
    "query": "machine learning applications",
    "k": 3,
    "filters": {
        "category": "tech"
    },
    "min_score": 0.5
}

Example Retrieve Chunks Response

[
    {
        "content": "Machine learning is transforming...",
        "score": 0.89,
        "document_id": "doc_abc123",
        "chunk_number": 0,
        "metadata": {
            "title": "ML Overview",
            "category": "tech"
        },
        "content_type": "text/plain",
        "filename": null,
        "download_url": null
    }
]
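
Since several chunks can come from the same document, a client may want to keep only the best-scoring chunk per document. The sketch below does that client-side; the helper name and the second and third sample chunks are made up for illustration, and only the field names come from the response above.

```python
def best_chunk_per_document(chunks, min_score=0.0):
    """Map each document_id to its highest-scoring chunk at or above min_score."""
    best = {}
    for chunk in chunks:
        if chunk["score"] < min_score:
            continue
        doc_id = chunk["document_id"]
        if doc_id not in best or chunk["score"] > best[doc_id]["score"]:
            best[doc_id] = chunk
    return best


# First entry mirrors the example response; the others are invented test data.
chunks = [
    {"content": "Machine learning is transforming...", "score": 0.89,
     "document_id": "doc_abc123", "chunk_number": 0},
    {"content": "(hypothetical second chunk)", "score": 0.42,
     "document_id": "doc_abc123", "chunk_number": 1},
    {"content": "(hypothetical chunk from another document)", "score": 0.61,
     "document_id": "doc_xyz789", "chunk_number": 0},
]

best = best_chunk_per_document(chunks, min_score=0.5)
for doc_id, chunk in best.items():
    print(doc_id, chunk["chunk_number"], chunk["score"])
```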

Example Usage Stats Response

{
    "ingest_text": 10,
    "ingest_file": 5,
    "query": 25,
    "retrieve_chunks": 30
}

Example Recent Usage Response

[
    {
        "timestamp": "2024-01-20T10:30:00Z",
        "operation_type": "query",
        "tokens_used": 150,
        "user_id": "user123",
        "duration_ms": 250,
        "status": "success",
        "metadata": {
            "query": "What are the benefits of exercise?"
        }
    }
]
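
Records in this shape are easy to aggregate on the client. As a sketch, the hypothetical helper below sums tokens_used per operation_type for successful records, optionally ignoring records older than a cutoff. It assumes timestamps are ISO 8601 strings with a trailing Z, as in the sample.

```python
from collections import Counter
from datetime import datetime, timezone


def tokens_by_operation(records, since=None):
    """Sum tokens_used per operation_type for successful records."""
    totals = Counter()
    for record in records:
        # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
        ts = datetime.fromisoformat(record["timestamp"].replace("Z", "+00:00"))
        if since is not None and ts < since:
            continue
        if record["status"] == "success":
            totals[record["operation_type"]] += record["tokens_used"]
    return dict(totals)


# Single record mirroring the example response above.
records = [
    {"timestamp": "2024-01-20T10:30:00Z", "operation_type": "query",
     "tokens_used": 150, "user_id": "user123", "duration_ms": 250,
     "status": "success", "metadata": {}},
]

cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)  # must be timezone-aware
totals = tokens_by_operation(records, since=cutoff)
print(totals)
```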

Common Response Codes

Code  Description
200   Success
400   Bad Request - Check request parameters
401   Unauthorized - Invalid/missing token
403   Forbidden - Insufficient permissions
404   Not Found
422   Validation Error
500   Internal Server Error
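
One way a client might turn these codes into errors is sketched below. DataBridgeAPIError and raise_for_status are hypothetical names invented for this example and are not part of any official SDK; the descriptions are the ones listed in this section.

```python
RESPONSE_CODES = {
    200: "Success",
    400: "Bad Request - Check request parameters",
    401: "Unauthorized - Invalid/missing token",
    403: "Forbidden - Insufficient permissions",
    404: "Not Found",
    422: "Validation Error",
    500: "Internal Server Error",
}


class DataBridgeAPIError(Exception):
    """Hypothetical error type carrying the HTTP status code."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status


def raise_for_status(status):
    """Return silently on 200; raise DataBridgeAPIError for any other code."""
    if status == 200:
        return
    message = RESPONSE_CODES.get(status, "Unexpected status code")
    raise DataBridgeAPIError(status, message)


try:
    raise_for_status(401)
except DataBridgeAPIError as err:
    print(err)
```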

Available SDKs

  • Python SDK - Official Python client with both sync and async support
