Completions

This guide explains how to generate AI completions in DataBridge using your documents as context. The completions feature lets you ask questions about your documents and receive accurate, contextual responses.

Basic Setup

First, ensure you have the DataBridge client initialized:

from databridge import DataBridge

# Initialize client with your DataBridge URI
db = DataBridge("databridge://owner_id:token@api.databridge.ai")

Generating Completions

The query method combines semantic search with language model completion:

# Generate a completion with context
response = db.query(
    query="What are the key findings about customer satisfaction?",
    filters={"department": "research"},
    k=4,  # Number of chunks to use as context
    min_score=0.7,  # Minimum similarity threshold
    max_tokens=500,  # Maximum length of completion
    temperature=0.7  # Controls randomness (0.0-1.0)
)

print(response.completion)  # The generated response
print(response.usage)  # Token usage statistics

How It Works

  1. Your query is used to search for relevant chunks in your documents

  2. The most relevant chunks are selected based on semantic similarity

  3. These chunks are used as context for the language model

  4. The model generates a completion that answers your query using the provided context (a toy sketch of this flow follows)
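
To make the flow concrete, here is a toy, self-contained illustration of steps 1-3. It is not DataBridge's implementation; a crude word-overlap score stands in for semantic similarity, but it shows how k and min_score shape the context that reaches the model:

# Toy illustration of the retrieval steps behind db.query (not the actual implementation)
toy_chunks = [
    "Customer satisfaction rose 12% after the support revamp.",
    "Q4 revenue grew 8% year over year.",
    "Survey respondents praised faster response times."
]

def words(text):
    # Normalize to a set of lowercase words, dropping surrounding punctuation
    return {w.strip(".,?!").lower() for w in text.split()}

def overlap_score(query, chunk):
    # Crude stand-in for semantic similarity: fraction of query words found in the chunk
    q, c = words(query), words(chunk)
    return len(q & c) / len(q)

query = "What are the key findings about customer satisfaction?"
k, min_score = 2, 0.2

# Steps 1-2: score every chunk, keep the top k above the similarity threshold
scored = sorted(((overlap_score(query, c), c) for c in toy_chunks), reverse=True)
context = [chunk for score, chunk in scored[:k] if score >= min_score]

# Step 3: assemble the retrieved chunks and the query into a prompt
prompt = "Answer using only this context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"
print(prompt)  # db.query sends an equivalent prompt to the language model (step 4)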

Advanced Usage

1. Controlling Context

Adjust how much context is used:

# Use more context chunks
response = db.query(
    query="Summarize our Q4 financial performance",
    filters={"department": "finance", "year": 2024},
    k=6,  # Increase number of context chunks
    min_score=0.8  # Higher threshold for better relevance
)
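
There is no single right value for k. One quick way to pick one is to run the same query with a few values and compare the answers; this sketch uses only the parameters documented above:

# Compare how the completion changes as more chunks are included
query = "Summarize our Q4 financial performance"
for k in (2, 4, 8):
    response = db.query(
        query=query,
        filters={"department": "finance", "year": 2024},
        k=k,
        min_score=0.7
    )
    print(f"--- k={k} ---")
    print(response.completion)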

2. Temperature Control

Adjust response creativity:

# More focused, deterministic response
focused_response = db.query(
    query="What is our current refund policy?",
    temperature=0.2  # Lower temperature for more focused responses
)

# More creative, varied response
creative_response = db.query(
    query="Suggest improvements for our customer service",
    temperature=0.8  # Higher temperature for more creative responses
)

3. Token Management

Control response length and manage token usage:

# Shorter, concise response
short_response = db.query(
    query="Summarize the main points",
    max_tokens=100  # Limit response length
)

# Longer, detailed response
detailed_response = db.query(
    query="Provide a detailed analysis",
    max_tokens=1000  # Allow for longer response
)

# Track token usage
usage = detailed_response.usage
print(f"Prompt tokens: {usage['prompt_tokens']}")      # Tokens used in context and query
print(f"Completion tokens: {usage['completion_tokens']}")  # Tokens in the response
print(f"Total tokens: {usage['total_tokens']}")        # Total tokens used

# Calculate costs (if using usage-based pricing)
COST_PER_1K_TOKENS = 0.002  # Example rate
cost = (usage['total_tokens'] / 1000) * COST_PER_1K_TOKENS
print(f"Estimated cost: ${cost:.4f}")

The usage dictionary in CompletionResponse gives a detailed breakdown of token consumption:

  • prompt_tokens: Number of tokens in the context (retrieved chunks) and query

  • completion_tokens: Number of tokens in the generated response

  • total_tokens: Total tokens used in the operation

Monitor token usage (a simple tracking helper is sketched after this list) to:

  • Optimize costs

  • Stay within rate limits

  • Track usage patterns

  • Budget appropriately
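
One way to do this consistently is a thin wrapper around db.query that accumulates usage across calls. The UsageTracker class below is an illustrative helper, not part of the DataBridge client:

# Illustrative usage tracker built on top of db.query (not a DataBridge API)
class UsageTracker:
    def __init__(self, db, cost_per_1k_tokens=0.002):  # example rate
        self.db = db
        self.cost_per_1k_tokens = cost_per_1k_tokens
        self.total_tokens = 0

    def query(self, **kwargs):
        response = self.db.query(**kwargs)
        self.total_tokens += response.usage["total_tokens"]
        return response

    @property
    def estimated_cost(self):
        return (self.total_tokens / 1000) * self.cost_per_1k_tokens

tracker = UsageTracker(db)
response = tracker.query(query="Summarize the main points", max_tokens=100)
print(f"Running total: {tracker.total_tokens} tokens (~${tracker.estimated_cost:.4f})")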

Common Use Cases

1. Question Answering

# Answer specific questions
response = db.query(
    query="What was the total revenue in Q4 2023?",
    filters={"department": "finance", "year": 2023},
    temperature=0.3  # Lower temperature for factual answers
)
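
For repeated factual lookups, it can help to wrap this pattern in a small function with conservative defaults and basic error handling. The ask helper below is illustrative, not part of the DataBridge client:

# Illustrative Q&A helper (not a DataBridge API)
def ask(db, question, filters=None):
    try:
        response = db.query(
            query=question,
            filters=filters,
            temperature=0.3,  # lower temperature for factual answers
            max_tokens=300
        )
        return response.completion
    except Exception as exc:
        # Hook into your application's logging and error handling here
        print(f"Completion failed: {exc}")
        return None

answer = ask(db, "What was the total revenue in Q4 2023?", {"department": "finance", "year": 2023})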

2. Summarization

# Generate summaries
response = db.query(
    query="Summarize the key points from the latest board meeting",
    filters={"type": "meeting_notes", "meeting": "board"},
    k=8,  # Use more context for comprehensive summary
    max_tokens=300
)

3. Analysis and Insights

# Generate analytical insights
response = db.query(
    query="What are the emerging trends in customer behavior?",
    filters={"department": "marketing", "type": "analysis"},
    temperature=0.7,  # Balance between creativity and focus
    k=6  # More context for comprehensive analysis
)

Next Steps

After mastering completions:

  • Implement caching for frequent queries (see the sketch after this list)

  • Set up monitoring for token usage

  • Create templates for common query types

  • Integrate with your application's error handling and logging systems
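
As a starting point for the caching item above, a simple in-memory cache keyed on the query arguments avoids paying twice for identical calls. This is an illustrative sketch, not a built-in DataBridge feature; a production version would add expiry and size limits:

import json

# Illustrative in-memory cache around db.query (not a built-in DataBridge feature)
_cache = {}

def cached_query(db, **kwargs):
    key = json.dumps(kwargs, sort_keys=True, default=str)  # stable key for the call arguments
    if key not in _cache:
        _cache[key] = db.query(**kwargs)
    return _cache[key]

# The second identical call is served from the cache instead of the API
first = cached_query(db, query="What is our current refund policy?", temperature=0.2)
second = cached_query(db, query="What is our current refund policy?", temperature=0.2)
assert first is second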
