# Installation
This guide covers setting up a DataBridge server. If you just want to use an existing DataBridge server, see our Quick Start Guide instead.
There are two ways to set up DataBridge:

- Docker Installation (recommended)
- Manual Installation

## Docker Installation
The Docker setup is the recommended way to get started quickly with all components preconfigured.
### Prerequisites

- Docker and Docker Compose installed on your system
- At least 10GB of free disk space (for models and data)
- 8GB+ RAM recommended
### Quick Start
Clone the repository and navigate to the project directory:
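For example (the repository URL here is an assumption — substitute the actual DataBridge repository):

```shell
git clone https://github.com/databridge-org/databridge-core.git  # URL assumed
cd databridge-core
```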
Start all services:
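With Docker Compose v2, this is typically:

```shell
docker compose up --build
```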
This command will:

- Build all required containers
- Download the necessary AI models (`nomic-embed-text` and `llama3.2`)
- Initialize the PostgreSQL database with pgvector
- Start all services
The initial setup may take 5-10 minutes depending on your internet speed.
For subsequent runs:
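Once the images have been built, you can start the stack without rebuilding:

```shell
docker compose up    # add -d to run in the background
```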
For more details on Docker setup, configuration, and troubleshooting, see our Docker Guide.
## Using Existing Services
If you already have Ollama or PostgreSQL running on your machine, you can configure DataBridge to use these existing instances instead of starting new containers.
### Using Existing Ollama

1. Modify `databridge.toml` to point to your local Ollama instance:
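A sketch of the change — the section and key names here are assumptions, so match them to your actual `databridge.toml`:

```toml
# Illustrative only: section/key names depend on your databridge.toml schema
[completion]
provider = "ollama"
base_url = "http://localhost:11434"  # your local Ollama instance
```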
2. Remove the Ollama service from `docker-compose.yml`:
   - Delete the `ollama` service section
   - Remove `ollama` from the `depends_on` section of the DataBridge service
   - Remove the `ollama_data` volume
3. Add `host.docker.internal` support (required for Linux):
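On recent Docker versions this can be done with `extra_hosts` on the DataBridge service (the service name `databridge` is an assumption):

```yaml
services:
  databridge:
    extra_hosts:
      - "host.docker.internal:host-gateway"
```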
4. Start only the required services:
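For example, assuming the Compose services are named `databridge` and `postgres`:

```shell
docker compose up --build databridge postgres
```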
Make sure your local Ollama instance:

- Is running and accessible on port 11434
- Has the required models installed (`nomic-embed-text` and `llama3.2`)
### Using Existing PostgreSQL

1. Modify the `POSTGRES_URI` in your environment or `docker-compose.yml`:
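An illustrative value — the driver prefix, credentials, and database name are all assumptions:

```
POSTGRES_URI=postgresql+asyncpg://user:password@host.docker.internal:5432/databridge
```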
2. Remove the PostgreSQL service from `docker-compose.yml`:
   - Delete the `postgres` service section
   - Remove the `postgres_data` volume
   - Update the `depends_on` section of the DataBridge service
Make sure your PostgreSQL instance:

- Has the pgvector extension installed
- Is accessible from Docker containers
- Has the necessary database and permissions set up
3. Start only the DataBridge service:
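Assuming the service is named `databridge` in `docker-compose.yml`:

```shell
docker compose up --build databridge
```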
## Manual Installation
This section covers setting up DataBridge manually if you prefer more control over the installation.
### 1. Clone the Repository
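For example (repository URL assumed — use the actual DataBridge repository):

```shell
git clone https://github.com/databridge-org/databridge-core.git
cd databridge-core
```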
### 2. Set Up the Python Environment

Python 3.12 is supported; other versions may also work:
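A typical setup with the built-in `venv` module:

```shell
python3.12 -m venv .venv   # fall back to python3 if 3.12 is unavailable
source .venv/bin/activate
```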
### 3. Install Dependencies
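Assuming the project ships a `requirements.txt` at the repository root (filename is an assumption):

```shell
pip install -r requirements.txt
```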
### 4. Configure Environment

Copy the example environment file to create your own `.env`:
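For example, if the repository provides a `.env.example` (filename assumed):

```shell
cp .env.example .env
```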
Then edit the `.env` file with your settings:
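An illustrative fragment — variable names other than `POSTGRES_URI` are assumptions, so set only the variables your configuration actually uses:

```
POSTGRES_URI=postgresql+asyncpg://databridge:databridge@localhost:5432/databridge
# Only if using OpenAI embeddings:
OPENAI_API_KEY=sk-...
# For document parsing via Unstructured:
UNSTRUCTURED_API_KEY=...
```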
For local development, you can enable development mode in `databridge.toml`:
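A sketch — the exact section and key names are assumptions, so check `databridge.toml` for the actual flag:

```toml
# Illustrative only: key name may differ in your databridge.toml
[auth]
dev_mode = true
```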
Note: Development mode should only be used for local development and testing. Always configure proper authentication in production.
### 5. Set Up PostgreSQL (Default Database)

If you are running PostgreSQL locally:
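A minimal local setup might look like this — the database name is illustrative, and pgvector must already be installed on the host:

```shell
createdb databridge
psql -d databridge -c "CREATE EXTENSION IF NOT EXISTS vector;"
```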
### 6. Run Quick Setup

This script will automatically:

- Configure your database
- Set up your storage
- Create the required vector index
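The script name below is an assumption — check the repository root for the actual entry point:

```shell
python quick_setup.py
```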
### 7. Start the Server
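Assuming a start script at the repository root (the name is an assumption):

```shell
python start_server.py
```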
## Accessing Your DataBridge Server
Once your server is running (either through Docker or manual installation), you can access it in several ways:
### 1. Server Access Points

- API: http://localhost:8000
- API documentation: http://localhost:8000/docs
- Health check: http://localhost:8000/health
### 2. Getting Your Access URI

1. Visit the API documentation at http://localhost:8000/docs
2. Find and use the `/local/generate_uri` endpoint to generate your admin URI
3. Save this URI - you'll need it to connect to your server
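The endpoint can be called from the interactive docs or with `curl`; the request body shown here is an assumption, so check the schema at `/docs`:

```shell
curl -X POST "http://localhost:8000/local/generate_uri" \
  -H "Content-Type: application/json" \
  -d '{"name": "admin"}'
```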
### 3. Ways to Use DataBridge

With your URI, you can interact with DataBridge in several ways:
#### Using the Shell
#### Using the Python SDK
#### Using the UI Component
The UI provides a visual interface for prototyping and testing. To set it up:
Navigate to the UI directory:
Install dependencies and start:
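The two steps above might look like this (the `ui` directory name is an assumption):

```shell
cd ui
npm install
npm run dev   # UI served at http://localhost:3000
```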
The UI will be available at http://localhost:3000. Use your generated URI to connect.
## Additional Configuration
### MongoDB Setup

- You need a MongoDB Atlas cluster with Vector Search enabled
- Create a database named as per your `DATABRIDGE_DB` setting
- The server will automatically create the required collections and indexes
### AWS S3 Setup

- Create an S3 bucket for document storage
- Create an IAM user with permissions for that bucket
- Use the access keys in your `.env` file
### API Keys

- OpenAI API key: required if using OpenAI for embeddings
- Unstructured API key: required for document parsing
## Next Steps

See the Quick Start Guide to begin using your server.