This guide explains how to generate AI completions in DataBridge using your documents as context. The completions feature lets you ask questions about your documents and get answers grounded in the content retrieved from them.
Basic Setup
First, ensure you have the DataBridge client initialized:
from databridge import DataBridge
# Initialize client with your DataBridge URI
db = DataBridge("databridge://owner_id:token@api.databridge.ai")
Generating Completions
The query method combines semantic search with language model completion:
# Generate a completion with context
response = db.query(
    query="What are the key findings about customer satisfaction?",
    filters={"department": "research"},
    k=4,              # Number of chunks to use as context
    min_score=0.7,    # Minimum similarity threshold
    max_tokens=500,   # Maximum length of completion
    temperature=0.7   # Controls randomness (0.0-1.0)
)
print(response.completion)  # The generated response
print(response.usage)       # Token usage statistics
How It Works
1. Your query is used to search for relevant chunks in your documents
2. The most relevant chunks are selected based on semantic similarity
3. These chunks are used as context for the language model
4. The model generates a completion that answers your query using the provided context
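The four steps above can be sketched in plain Python. This is a toy illustration, not DataBridge's implementation: the bag-of-words `embed` function, the `retrieve` and `build_prompt` helpers, and the sample chunks are all hypothetical stand-ins. A real deployment would use a learned embedding model and send the final prompt to a language model.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (hypothetical stand-in)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=4, min_score=0.0):
    """Steps 1-2: score every chunk, drop those below min_score, keep the top k."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def build_prompt(query, scored_chunks):
    """Step 3: place the selected chunks ahead of the question as context."""
    context = "\n".join(f"- {c}" for _, c in scored_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"

# Made-up sample chunks for illustration
chunks = [
    "customer satisfaction rose after the support redesign",
    "the office moved to a new building in march",
    "satisfaction scores for enterprise customer accounts stayed flat",
]
top = retrieve("customer satisfaction findings", chunks, k=2, min_score=0.1)
prompt = build_prompt("customer satisfaction findings", top)
# Step 4 would send `prompt` to a language model; that call is omitted here.
print(prompt)
```

Raising `min_score` shrinks the candidate pool before `k` is applied, so a strict threshold can return fewer than `k` chunks.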
Advanced Usage
1. Controlling Context
Adjust how much context is used:
# Use more context chunks
response = db.query(
    query="Summarize our Q4 financial performance",
    filters={"department": "finance", "year": 2024},
    k=6,           # Increase number of context chunks
    min_score=0.8  # Higher threshold for better relevance
)
2. Temperature Control
Adjust response creativity:
# More focused, deterministic response
focused_response = db.query(
    query="What is our current refund policy?",
    temperature=0.2  # Lower temperature for more focused responses
)
# More creative, varied response
creative_response = db.query(
    query="Suggest improvements for our customer service",
    temperature=0.8  # Higher temperature for more creative responses
)
3. Token Management
Control response length and manage token usage:
# Shorter, concise response
short_response = db.query(
    query="Summarize the main points",
    max_tokens=100  # Limit response length
)
# Longer, detailed response
detailed_response = db.query(
    query="Provide a detailed analysis",
    max_tokens=1000  # Allow for longer response
)
# Track token usage
usage = detailed_response.usage
print(f"Prompt tokens: {usage['prompt_tokens']}") # Tokens used in context and query
print(f"Completion tokens: {usage['completion_tokens']}") # Tokens in the response
print(f"Total tokens: {usage['total_tokens']}") # Total tokens used
# Calculate costs (if using usage-based pricing)
COST_PER_1K_TOKENS = 0.002 # Example rate
cost = (usage['total_tokens'] / 1000) * COST_PER_1K_TOKENS
print(f"Estimated cost: ${cost:.4f}")
The usage dictionary in CompletionResponse breaks token consumption into three fields:
prompt_tokens: Number of tokens in the context (retrieved chunks) and query
completion_tokens: Number of tokens in the generated response
total_tokens: Total tokens used in the operation
Monitor token usage to:
Optimize costs
Stay within rate limits
Track usage patterns
Budget appropriately
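To act on these goals, usage can be accumulated across calls and checked against a budget. The `UsageTracker` class and its `token_budget` parameter below are hypothetical helpers, not part of the DataBridge client. They assume only that each `usage` value is a dictionary with the three keys listed above.

```python
class UsageTracker:
    """Accumulates usage dictionaries and compares them with a token budget."""

    KEYS = ("prompt_tokens", "completion_tokens", "total_tokens")

    def __init__(self, token_budget):
        self.token_budget = token_budget
        self.totals = {key: 0 for key in self.KEYS}
        self.calls = 0

    def record(self, usage):
        """Add one response's usage dictionary to the running totals."""
        for key in self.KEYS:
            self.totals[key] += usage.get(key, 0)
        self.calls += 1

    def remaining(self):
        """Tokens left in the budget (negative once the budget is exceeded)."""
        return self.token_budget - self.totals["total_tokens"]

    def over_budget(self):
        return self.remaining() < 0

tracker = UsageTracker(token_budget=2000)
# Sample dictionaries standing in for response.usage (made-up numbers)
tracker.record({"prompt_tokens": 850, "completion_tokens": 120, "total_tokens": 970})
tracker.record({"prompt_tokens": 900, "completion_tokens": 300, "total_tokens": 1200})
print(tracker.totals, tracker.remaining(), tracker.over_budget())
```

In an application, `tracker.record(response.usage)` would follow each `db.query` call.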
Common Use Cases
1. Question Answering
# Answer specific questions
response = db.query(
    query="What was the total revenue in Q4 2023?",
    filters={"department": "finance", "year": 2023},
    temperature=0.3  # Lower temperature for factual answers
)
2. Summarization
# Generate summaries
response = db.query(
    query="Summarize the key points from the latest board meeting",
    filters={"type": "meeting_notes", "meeting": "board"},
    k=8,  # Use more context for comprehensive summary
    max_tokens=300
)
3. Analysis and Insights
# Generate analytical insights
response = db.query(
    query="What are the emerging trends in customer behavior?",
    filters={"department": "marketing", "type": "analysis"},
    temperature=0.7,  # Balance between creativity and focus
    k=6               # More context for comprehensive analysis
)
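The three use cases differ mainly in their parameter choices, so they can be captured as reusable presets. The `PRESETS` table and `query_kwargs` helper below are hypothetical. The values are copied from the examples above, and the resulting dictionary is meant to be unpacked into `db.query(**kwargs)`.

```python
# Parameter presets taken from the three use cases above (hypothetical table)
PRESETS = {
    "question_answering": {"temperature": 0.3},
    "summarization": {"k": 8, "max_tokens": 300},
    "analysis": {"temperature": 0.7, "k": 6},
}

def query_kwargs(use_case, query, filters=None, **overrides):
    """Build keyword arguments for a query from a preset plus optional overrides."""
    if use_case not in PRESETS:
        raise ValueError(f"unknown use case: {use_case!r}")
    # Later entries win, so explicit overrides replace preset values
    kwargs = {"query": query, **PRESETS[use_case], **overrides}
    if filters is not None:
        kwargs["filters"] = filters
    return kwargs

kwargs = query_kwargs(
    "summarization",
    "Summarize the key points from the latest board meeting",
    filters={"type": "meeting_notes", "meeting": "board"},
    max_tokens=200,  # override the preset's 300
)
print(kwargs)
# response = db.query(**kwargs)  # requires an initialized client
```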
Next Steps
After mastering completions:
Implement caching for frequent queries
Set up monitoring for token usage
Create templates for common query types
Integrate with your application's error handling and logging systems
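As a starting point for the caching item, a small in-memory cache can wrap any query function. This sketch is hypothetical: `CachedQuery`, its `ttl_seconds` parameter, and the `fake_query` stand-in are illustrative names, and the real `db.query` would be passed in place of `fake_query`.

```python
import json
import time

class CachedQuery:
    """Caches results of a query function, keyed by its keyword arguments."""

    def __init__(self, query_fn, ttl_seconds=300):
        self.query_fn = query_fn
        self.ttl_seconds = ttl_seconds
        self._cache = {}  # key -> (timestamp, result)

    def __call__(self, **kwargs):
        # Serialize the arguments in sorted order so equal calls share a key
        key = json.dumps(kwargs, sort_keys=True, default=str)
        now = time.monotonic()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]
        result = self.query_fn(**kwargs)
        self._cache[key] = (now, result)
        return result

calls = []

def fake_query(**kwargs):
    """Stand-in for db.query that records each real invocation."""
    calls.append(kwargs)
    return f"answer to: {kwargs['query']}"

cached = CachedQuery(fake_query, ttl_seconds=60)
first = cached(query="What is our current refund policy?", temperature=0.2)
second = cached(query="What is our current refund policy?", temperature=0.2)
print(len(calls))  # number of times the underlying function actually ran
```

Caching suits low-temperature, factual queries best. With a high temperature, a cached answer hides the variation that setting was chosen to produce.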