Processing Rules
Overview
Types of Rules
Metadata Extraction Rules
from databridge import DataBridge, MetadataExtractionRule
from pydantic import BaseModel
# Define your metadata schema
class ArticleMetadata(BaseModel):
    title: str
    author: str
    publication_date: str
    topics: list[str]
# Create the rule
metadata_rule = MetadataExtractionRule(schema=ArticleMetadata)
# Use during ingestion
db = DataBridge("your-uri")
doc = db.ingest_text(
    content="Your article content...",
    rules=[metadata_rule]
)
# The extracted metadata will be merged with any provided metadata
print(f"Extracted metadata: {doc.metadata}")
Natural Language Rules
Rule Configuration
Common Use Cases
PII Detection and Removal
Content Classification
Text Summarization
Format Standardization
Defining Custom Rules
Core Implementation
SDK Implementation
Example: Custom Metadata Extraction
Best Practices for Custom Rules
Best Practices
Next Steps