# Installation

This guide covers setting up a DataBridge server. If you just want to use an existing DataBridge server, see our [Quick Start Guide](/databridge-docs/getting-started/quickstart.md) instead.

There are two ways to set up DataBridge:

1. [Docker Installation](#docker-installation) (Recommended)
2. [Manual Installation](#manual-installation)

## Docker Installation

The Docker setup is the recommended way to get started quickly with all components preconfigured.

### Prerequisites

* Docker and Docker Compose installed on your system
* At least 10GB of free disk space (for models and data)
* 8GB+ RAM recommended

### Quick Start

1. Clone the repository and navigate to the project directory:

```bash
git clone https://github.com/databridge-org/databridge-core.git
cd databridge-core
```

2. Start all services:

```bash
docker compose up --build
```

This command will:

* Build all required containers
* Download necessary AI models (`nomic-embed-text` and `llama3.2`)
* Initialize the PostgreSQL database with pgvector
* Start all services

The initial setup may take 5-10 minutes depending on your internet speed.

3. For subsequent runs:

```bash
docker compose up    # Start all services
docker compose down  # Stop all services
```
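Once the containers are up, you can confirm the server is responding by hitting its health endpoint (the port and path match the access points listed later in this guide). A successful response should arrive once startup has finished:

```shell
# Check that the DataBridge API is up; an HTTP 200 indicates a healthy server
curl -i http://localhost:8000/health
```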

For more details on Docker setup, configuration, and troubleshooting, see our [Docker Guide](https://github.com/databridge-org/databridge-core/blob/main/DOCKER.md).

### Using Existing Services

If you already have Ollama or PostgreSQL running on your machine, you can configure DataBridge to use these existing instances instead of starting new containers.

#### Using Existing Ollama

1. Modify `databridge.toml` to point to your local Ollama instance:

```toml
[completion]
provider = "ollama"
model_name = "llama3.2"
base_url = "http://host.docker.internal:11434"  # Points to host machine's Ollama

[embedding]
provider = "ollama"
model_name = "nomic-embed-text"
base_url = "http://host.docker.internal:11434"  # Points to host machine's Ollama
```

2. Remove the Ollama service from `docker-compose.yml`:
   * Delete the `ollama` service section
   * Remove `ollama` from the `depends_on` section of the DataBridge service
   * Remove the `ollama_data` volume
3. Add `host.docker.internal` support (required on Linux, where Docker does not provide it by default):

```yaml
services:
  databridge:
    extra_hosts:
      - "host.docker.internal:host-gateway"
```

4. Start only the required services:

```bash
docker compose up postgres databridge
```

Make sure your local Ollama instance:

* Is running and accessible on port 11434
* Has the required models installed (`nomic-embed-text` and `llama3.2`)
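You can check both points from the host machine using the standard Ollama REST API and CLI:

```shell
# List the models your local Ollama instance currently has installed
curl -s http://localhost:11434/api/tags

# Pull the required models if either is missing
ollama pull nomic-embed-text
ollama pull llama3.2
```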

#### Using Existing PostgreSQL

1. Modify the `POSTGRES_URI` in your environment or `docker-compose.yml`:

```yaml
services:
  databridge:
    environment:
      - POSTGRES_URI=postgresql+asyncpg://your_user:your_password@host.docker.internal:5432/your_db
```

2. Remove the PostgreSQL service from `docker-compose.yml`:
   * Delete the `postgres` service section
   * Remove the `postgres_data` volume
   * Remove `postgres` from the `depends_on` section of the DataBridge service
3. Make sure your PostgreSQL instance:
   * Has pgvector extension installed
   * Is accessible from Docker containers
   * Has the necessary database and permissions set up
4. Start only the DataBridge service:

```bash
docker compose up databridge
```
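Before starting DataBridge, it is worth confirming the pgvector requirement directly against your instance. The commands below are a sketch using the same placeholder credentials as the `POSTGRES_URI` example above; substitute your own:

```shell
# Install the pgvector extension in the target database (requires sufficient privileges)
psql "postgresql://your_user:your_password@localhost:5432/your_db" \
  -c "CREATE EXTENSION IF NOT EXISTS vector;"

# Verify the extension is present
psql "postgresql://your_user:your_password@localhost:5432/your_db" \
  -c "SELECT extversion FROM pg_extension WHERE extname = 'vector';"
```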

## Manual Installation

This section covers setting up DataBridge manually if you prefer more control over the installation.

### 1. Clone the Repository

```bash
git clone https://github.com/databridge-org/databridge-core.git
```

### 2. Setup Python Environment

Python 3.12 is the supported version; other versions may work but are untested:

```bash
cd databridge-core
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```

### 3. Install Dependencies

```bash
pip install -r requirements.txt
```

### 4. Configure Environment

Copy the example environment file and create your own `.env`:

```bash
cp .env.example .env
```

Then edit the `.env` file with your settings:

```env
JWT_SECRET_KEY="..."  # Required in production, optional in dev mode
POSTGRES_URI="postgresql+asyncpg://postgres:postgres@localhost:5432/databridge" # Required for PostgreSQL database
MONGODB_URI="..." # Optional: Only needed if using MongoDB

UNSTRUCTURED_API_KEY="..." # Optional: Needed for parsing via unstructured API
OPENAI_API_KEY="..." # Optional: Needed for OpenAI embeddings and completions
ASSEMBLYAI_API_KEY="..." # Optional: Needed for combined parser
ANTHROPIC_API_KEY="..." # Optional: Needed for contextual parser
AWS_ACCESS_KEY="..." # Optional: Needed for AWS S3 storage
AWS_SECRET_ACCESS_KEY="..." # Optional: Needed for AWS S3 storage
```

For local development, you can enable development mode in `databridge.toml`:

```toml
[auth]
dev_mode = true  # Set to true to disable authentication for local development
```

> **Note**: Development mode should only be used for local development and testing. Always configure proper authentication in production.

### 5. Setup PostgreSQL (Default Database)

If you are running PostgreSQL locally (the commands below assume macOS with Homebrew):

```bash
brew install postgresql@14
brew install pgvector
brew services start postgresql@14
createdb databridge
createuser -s postgres
```
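After installation, it is worth confirming that the database accepts connections and that pgvector can be enabled before running the setup script:

```shell
# Connect as the postgres superuser created above and enable pgvector
psql -U postgres -d databridge -c "CREATE EXTENSION IF NOT EXISTS vector;"

# Confirm the database accepts connections
psql -U postgres -d databridge -c "SELECT version();"
```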

### 6. Run Quick Setup

```bash
python quick_setup.py
```

This script will automatically:

* Configure your database
* Set up your storage
* Create the required vector index

### 7. Start the Server

```bash
python start_server.py
```

## Accessing Your DataBridge Server

Once your server is running (either through Docker or manual installation), you can access it in several ways:

### 1. Server Access Points

* API: <http://localhost:8000>
* API Documentation: <http://localhost:8000/docs>
* Health Check: <http://localhost:8000/health>

### 2. Getting Your Access URI

1. Visit the API documentation at <http://localhost:8000/docs>
2. Find and use the `/local/generate_uri` endpoint to generate your admin URI
3. Save this URI - you'll need it to connect to your server
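If you prefer the command line over the interactive API docs, you can likely call the endpoint directly. The HTTP method and parameters are not documented on this page, so treat the following as a hypothetical sketch and check the `/docs` page for the actual schema:

```shell
# Hypothetical direct call; verify the method and any parameters at /docs first
curl -X POST http://localhost:8000/local/generate_uri
```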

### 3. Ways to Use DataBridge

With your URI, you can interact with DataBridge in several ways:

#### Using the Shell

```bash
python shell.py <your_local_uri>
```

#### Using the Python SDK

```python
from databridge import DataBridge
db = DataBridge("your-databridge-uri", is_local=True)
```

#### Using the UI Component

The UI provides a visual interface for prototyping and testing. To set it up:

1. Navigate to the UI directory:

```bash
cd databridge-core/ui-component
```

2. Install dependencies and start:

```bash
npm install
npm run dev
```

The UI will be available at <http://localhost:3000>. Use your generated URI to connect.

## Additional Configuration

### MongoDB Setup

1. You need a MongoDB Atlas cluster with Vector Search enabled
2. Create a database whose name matches your `DATABRIDGE_DB` setting
3. The server will automatically create required collections and indexes

### AWS S3 Setup

1. Create an S3 bucket for document storage
2. Create an IAM user with permissions for this bucket
3. Use the access keys in your `.env` file

### API Keys

* OpenAI API key: Required if using OpenAI for embeddings or completions
* Unstructured API key: Required for document parsing via the Unstructured API

## Next Steps

* See the [Quick Start Guide](/databridge-docs/getting-started/quickstart.md) to begin using your server


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://databridge.gitbook.io/databridge-docs/getting-started/installation.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
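For example, a question about this page can be issued as a plain GET request, with the question URL-encoded:

```shell
# Ask the documentation a specific question; the response contains a direct
# answer plus relevant excerpts and sources
curl -s "https://databridge.gitbook.io/databridge-docs/getting-started/installation.md?ask=How%20do%20I%20enable%20dev%20mode%3F"
```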
