Installation
This guide covers setting up a DataBridge server. If you just want to use an existing DataBridge server, see our instead.
There are two ways to set up DataBridge:
Docker installation (recommended)
Manual installation
The Docker setup is the recommended way to get started quickly with all components preconfigured.
Prerequisites:
Docker and Docker Compose installed on your system
At least 10GB of free disk space (for models and data)
8GB+ RAM recommended
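The prerequisites above can be checked before starting. The following is a minimal sketch, not part of DataBridge: the function name `check_prerequisites` is hypothetical, it only looks for a `docker` executable on the PATH (it does not verify Docker Compose), and it does not check RAM because the Python standard library has no portable way to do so.

```python
import shutil


def check_prerequisites(path: str = ".", min_free_gb: float = 10.0) -> dict:
    """Report whether Docker is on the PATH and whether `path` has enough free disk.

    `min_free_gb` defaults to the 10GB figure from the prerequisites list.
    """
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "docker_found": shutil.which("docker") is not None,
        "free_disk_gb": round(free_gb, 1),
        "enough_disk": free_gb >= min_free_gb,
    }
```

Calling `check_prerequisites()` from the directory where you plan to clone the repository returns a small dictionary you can print or inspect before continuing.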
Clone the repository and navigate to the project directory:
Start all services:
This command will:
Build all required containers
Download necessary AI models (nomic-embed-text and llama3.2)
Initialize the PostgreSQL database with pgvector
Start all services
The initial setup may take 5-10 minutes depending on your internet speed.
For subsequent runs:
If you already have Ollama or PostgreSQL running on your machine, you can configure DataBridge to use these existing instances instead of starting new containers.
Modify databridge.toml to point to your local Ollama instance:
Remove the Ollama service from docker-compose.yml:
Delete the ollama service section
Remove ollama from the depends_on section of the DataBridge service
Remove the ollama_data volume
Add host.docker.internal support (required for Linux):
Start only the required services:
Make sure your local Ollama instance:
Is running and accessible on port 11434
Has the required models installed (nomic-embed-text and llama3.2)
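One way to confirm both conditions is to query Ollama's model listing endpoint (`GET /api/tags`) on port 11434. The sketch below is an illustration rather than part of DataBridge: the helper names `fetch_tags` and `missing_models` are hypothetical, and model names are compared with their tag stripped, so `llama3.2:latest` counts as `llama3.2`.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # port from the checklist above
REQUIRED_MODELS = ("nomic-embed-text", "llama3.2")


def fetch_tags(base_url: str = OLLAMA_URL, timeout_s: float = 5.0) -> dict:
    """Fetch the list of locally installed models from a running Ollama instance."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout_s) as resp:
        return json.load(resp)


def missing_models(tags_payload: dict, required=REQUIRED_MODELS) -> list[str]:
    """Return the required model names that are absent from an /api/tags response."""
    # Entries look like {"name": "llama3.2:latest", ...}; drop the ":tag" suffix.
    installed = {m["name"].split(":", 1)[0] for m in tags_payload.get("models", [])}
    return [name for name in required if name not in installed]
```

`missing_models(fetch_tags())` returns an empty list when both models are present. If `fetch_tags` raises a connection error, Ollama is not reachable on that port.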
Modify the POSTGRES_URI in your environment or docker-compose.yml:
Remove the PostgreSQL service from docker-compose.yml:
Delete the postgres service section
Remove the postgres_data volume
Update the depends_on section of the DataBridge service
Make sure your PostgreSQL instance:
Has the pgvector extension installed
Is accessible from Docker containers
Has the necessary database and permissions set up
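A common reason a host database is not accessible from Docker containers is a POSTGRES_URI that points at localhost, which inside a container refers to the container itself. The sketch below shows one way to rewrite such a URI to use host.docker.internal. The function name `container_reachable_uri` and the example credentials are hypothetical, and on Linux the rewritten host only resolves if host.docker.internal support has been added as described earlier.

```python
from urllib.parse import urlsplit, urlunsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def container_reachable_uri(uri: str, docker_host: str = "host.docker.internal") -> str:
    """Replace a loopback host in a database URI with a name containers can resolve."""
    parts = urlsplit(uri)
    if parts.hostname not in LOCAL_HOSTS:
        return uri  # already points at a non-loopback host; leave untouched
    # Keep "user:password@" (if present) and the port; swap only the host.
    userinfo, at, _hostport = parts.netloc.rpartition("@")
    port = f":{parts.port}" if parts.port is not None else ""
    return urlunsplit(parts._replace(netloc=f"{userinfo}{at}{docker_host}{port}"))
```

For example, `container_reachable_uri("postgresql://user:secret@localhost:5432/databridge")` yields the same URI with `host.docker.internal` in place of `localhost`.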
Start only the DataBridge service:
This section covers setting up DataBridge manually if you prefer more control over the installation.
Python 3.12 is the supported version; other versions may work:
Copy the example environment file and create your own .env:
Then edit the .env file with your settings:
For local development, you can enable development mode in databridge.toml:
Note: Development mode should only be used for local development and testing. Always configure proper authentication in production.
If you are running PostgreSQL locally:
This script will automatically:
Configure your database
Set up your storage
Create the required vector index
Once your server is running (either through Docker or manual installation), you can access it in several ways:
API: http://localhost:8000
API Documentation: http://localhost:8000/docs
Health Check: http://localhost:8000/health
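Because the first Docker start can take several minutes, a script may want to wait for the health check to respond before using the API. The following is a minimal polling sketch, assuming the health endpoint answers with HTTP 200 once the server is ready (the response format is not specified here); `is_healthy` and `wait_for_health` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8000/health"  # health check URL listed above


def is_healthy(url: str = HEALTH_URL, timeout_s: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_for_health(
    probe: Callable[[], bool] = is_healthy,
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
) -> bool:
    """Call `probe` repeatedly until it succeeds or `timeout_s` has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

`wait_for_health()` blocks for up to ten minutes by default, which matches the upper end of the stated initial setup time, and returns False if the server never becomes reachable.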
Visit the API documentation at http://localhost:8000/docs
Find and use the /local/generate_uri endpoint to generate your admin URI
Save this URI - you'll need it to connect to your server
With your URI, you can interact with DataBridge in several ways:
The UI provides a visual interface for prototyping and testing. To set it up:
Navigate to the UI directory:
Install dependencies and start:
The UI will be available at http://localhost:3000. Use your generated URI to connect.
You need a MongoDB Atlas cluster with Vector Search enabled
Create a database whose name matches your DATABRIDGE_DB setting
The server will automatically create required collections and indexes
Create an S3 bucket for document storage
Create an IAM user with permissions for this bucket
Use the access keys in your .env file
OpenAI API key: Required if using OpenAI for embeddings
Unstructured API key: Required for document parsing
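Since several of these settings live in the .env file, it can help to check that none were left blank before starting the server. The sketch below is a minimal .env reader, not the loader DataBridge itself uses: `parse_env` and `missing_settings` are hypothetical helpers, it handles only simple KEY=VALUE lines, and apart from POSTGRES_URI and DATABRIDGE_DB any variable names you pass in should be taken from your own example environment file.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional single or double quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]
```

A typical use is `missing_settings(parse_env(open(".env").read()), ["POSTGRES_URI", "DATABRIDGE_DB"])`, which returns an empty list when every listed setting has a value.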
For more details on Docker setup, configuration, and troubleshooting, see our .
See the to begin using your server