Knowledge Base Commands
Commands for managing the AI knowledge base, vector search, and pgvector integration.
Setup Commands
setup_knowbase
Set up the Knowledge Base with the pgvector extension and migrations.
python manage.py setup_knowbase [OPTIONS]
Options:
- --skip-extensions - Skip creating PostgreSQL extensions
Complete Setup
First-Time Setup
python manage.py setup_knowbase
What it does:
- ✅ Creates pgvector extension for vector similarity search
- ✅ Creates pg_trgm extension for text search
- ✅ Runs required database migrations
- ✅ Sets up vector indexes
Output:
🚀 Setting up Knowledge Base...
📦 Creating PostgreSQL extensions...
✓ pgvector extension created
✓ pg_trgm extension created
✅ Knowledge Base setup completed!
Skip Extensions (Already Created)
python manage.py setup_knowbase --skip-extensions
Use when:
- Extensions already created manually
- Running as non-superuser
- Extensions exist from previous setup
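The choice between the two invocations can be scripted. The sketch below is a minimal illustration, assuming the installed extension names have already been read from PostgreSQL (for example with SELECT extname FROM pg_extension); build_setup_command and REQUIRED_EXTENSIONS are hypothetical names, not part of django-cfg.

```python
# Hypothetical helper: choose the setup_knowbase invocation from the set of
# extensions that already exist in the target database.
REQUIRED_EXTENSIONS = {"vector", "pg_trgm"}


def build_setup_command(installed_extensions):
    """Return the manage.py argument list for setup_knowbase."""
    command = ["python", "manage.py", "setup_knowbase"]
    # Skip extension creation only when every required extension is present.
    if REQUIRED_EXTENSIONS <= set(installed_extensions):
        command.append("--skip-extensions")
    return command


print(build_setup_command({"plpgsql", "vector", "pg_trgm"}))
print(build_setup_command({"plpgsql"}))
```

If an extension is missing and the database user cannot create it, the flag alone does not help: the extension still has to be created by a superuser first.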
Statistics Commands
knowbase_stats
Display knowledge base statistics and metrics.
python manage.py knowbase_stats [OPTIONS]
Options:
- --format [json|yaml|table] - Output format (default: table)
- --detailed - Show detailed statistics
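The json format is the easiest one to consume from a script. The sketch below is a minimal illustration, assuming the output has the shape shown under JSON Export on this page; document_type_shares is a hypothetical helper, not part of django-cfg.

```python
import json

# Sample payload copied from the JSON Export example on this page.
SAMPLE_OUTPUT = """
{
  "documents": 1234,
  "embeddings": 1234,
  "average_embedding_size": 1536,
  "storage_used_mb": 45.2,
  "last_updated": "2025-10-01T14:30:00Z",
  "document_types": {"text": 856, "pdf": 234, "markdown": 144}
}
"""


def document_type_shares(stats):
    """Return each document type's share of the total, in percent."""
    counts = stats["document_types"]
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100 * count / total, 1) for name, count in counts.items()}


stats = json.loads(SAMPLE_OUTPUT)
print(document_type_shares(stats))
```

In practice the JSON text would come from the command's standard output rather than from a constant.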
Viewing Statistics
Basic Statistics
python manage.py knowbase_stats
Output:
📚 Knowledge Base Statistics
==================================================
Documents: 1,234
Embeddings: 1,234
Average Embedding Size: 1536
Storage Used: 45.2 MB
Last Updated: 2025-10-01 14:30:00
Detailed Statistics
python manage.py knowbase_stats --detailed
Additional output:
Document Types:
  Text: 856 (69.4%)
  PDF: 234 (19.0%)
  Markdown: 144 (11.7%)
Index Performance:
  Average Query Time: 12ms
  Cache Hit Rate: 87.3%
  Total Queries (24h): 4,567
Storage Breakdown:
  Documents: 12.3 MB
  Embeddings: 32.9 MB
  Indexes: 0.8 MB
JSON Export
python manage.py knowbase_stats --format json
Output:
{
  "documents": 1234,
  "embeddings": 1234,
  "average_embedding_size": 1536,
  "storage_used_mb": 45.2,
  "last_updated": "2025-10-01T14:30:00Z",
  "document_types": {
    "text": 856,
    "pdf": 234,
    "markdown": 144
  }
}
YAML Export
python manage.py knowbase_stats --format yaml
Output:
documents: 1234
embeddings: 1234
average_embedding_size: 1536
storage_used_mb: 45.2
last_updated: "2025-10-01T14:30:00Z"
document_types:
  text: 856
  pdf: 234
  markdown: 144
PostgreSQL Extensions
pgvector Extension
Purpose: Enables vector similarity search
Features:
- Store embeddings as vector data types
- Similarity search (cosine, L2, inner product)
- Indexing for fast retrieval
- Works with OpenAI, Cohere, Hugging Face embeddings
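The three distance measures can be written out in plain Python to show what each one computes. In pgvector they correspond to the operators <-> (L2 distance), <#> (negative inner product) and <=> (cosine distance). This is an illustrative sketch only; the function names are made up for the example, and pgvector performs these computations inside PostgreSQL.

```python
import math


def _check_lengths(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")


def l2_distance(a, b):
    """Euclidean distance, as computed by the pgvector operator <->."""
    _check_lengths(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Negated dot product, as computed by <#> (smaller means more similar)."""
    _check_lengths(a, b)
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity, as computed by <=>; assumes non-zero vectors."""
    _check_lengths(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norms


query = [1.0, 0.0]
print(l2_distance(query, [0.0, 1.0]))
print(negative_inner_product(query, [0.5, 0.5]))
print(cosine_distance(query, [0.0, 1.0]))
```

All three are distances, so ORDER BY ... ASC returns the closest rows first, which is why the inner product is negated.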
Example query:
SELECT * FROM knowbase_document
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;
pg_trgm Extension
Purpose: Enables fuzzy text search
Features:
- Trigram-based text search
- Fuzzy matching
- Fast text similarity queries
- Works with LIKE, ILIKE, similarity()
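Trigram similarity can be approximated in a few lines of Python to show what similarity() measures: each word is lowercased, padded with two leading spaces and one trailing space, split into three-character windows, and the score is the number of shared trigrams divided by the number of distinct trigrams in both strings. This is a simplified sketch that only handles ASCII letters and digits; pg_trgm itself is locale-aware, and the function names here are made up for the example.

```python
import re


def trigrams(text):
    """Return the set of padded trigrams for every word in text."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    grams_a, grams_b = trigrams(a), trigrams(b)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


print(sorted(trigrams("cat")))
print(round(similarity("word", "two words"), 4))  # 0.3636
```

The % operator returns true when this score exceeds pg_trgm.similarity_threshold, which defaults to 0.3.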
Example query:
SELECT * FROM knowbase_document
WHERE content % 'search query'
ORDER BY similarity(content, 'search query') DESC;
Common Tasks
Initial Setup Workflow
# 1. Setup Knowledge Base
python manage.py setup_knowbase
# 2. Verify setup
python manage.py knowbase_stats
# 3. Check database extensions
psql -d your_database -c "\dx"
Manual Extension Creation
If setup_knowbase fails due to permissions:
# Connect as PostgreSQL superuser
psql -U postgres -d your_database
-- Create extensions manually
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- Verify extensions
\dx
Then run setup without extensions:
python manage.py setup_knowbase --skip-extensions
Monitor Knowledge Base Growth
#!/bin/bash
# Track knowledge base growth by printing the document count every hour
while true; do
    echo "$(date)"
    python manage.py knowbase_stats --format json | jq '.documents'
    sleep 3600  # Every hour
done
Export Statistics for Monitoring
# Export to monitoring system
python manage.py knowbase_stats --format json > /var/log/knowbase_stats.json
# Schedule with cron
0 * * * * cd /path/to/project && python manage.py knowbase_stats --format json > /var/log/knowbase_stats_$(date +\%Y\%m\%d_\%H).json
Configuration
Database Requirements
Minimum PostgreSQL Version: 11+
Recommended Version: 14+
pgvector Configuration
Add to postgresql.conf:
# Increase shared_buffers for better vector performance
shared_buffers = 256MB
# Optimize for vector operations
max_parallel_workers_per_gather = 4
Django Configuration
# config.py
from django_cfg import DjangoConfig, DatabaseConfig
class MyConfig(DjangoConfig):
    # Enable Knowledge Base
    enable_knowbase: bool = True
    # Configure database
    databases: dict[str, DatabaseConfig] = {
        "default": DatabaseConfig(
            engine="postgresql",
            name="mydb",
            host="localhost",
            port=5432,
        )
    }
Troubleshooting
Extension Creation Failed
Error:
❌ Failed to create extensions: permission denied
Solution:
# Option 1: Run as superuser
psql -U postgres -d your_database -c "CREATE EXTENSION vector;"
psql -U postgres -d your_database -c "CREATE EXTENSION pg_trgm;"
# Option 2: Grant permissions (SQL, run in psql as a superuser)
GRANT CREATE ON DATABASE your_database TO your_user;
Missing pgvector Extension
Error:
could not open extension control file: No such file or directory
Solution:
# Install pgvector
# Ubuntu/Debian
sudo apt-get install postgresql-14-pgvector
# macOS (Homebrew)
brew install pgvector
# From source
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
Performance Issues
Slow queries?
-- Create indexes for better performance
CREATE INDEX ON knowbase_document USING ivfflat (embedding vector_cosine_ops);
CREATE INDEX ON knowbase_document USING gin (content gin_trgm_ops);
Best Practices
1. Setup Early in Development
# Run setup immediately after project creation
django-cfg create-project "My Project" && \
cd my_project && \
python manage.py migrate_all && \
python manage.py setup_knowbase
2. Monitor Storage Usage
# Check storage regularly
python manage.py knowbase_stats --detailed
# Set up alerts for storage limits
3. Regular Statistics Export
# Daily statistics export
0 0 * * * cd /path/to/project && python manage.py knowbase_stats --format json > /backups/kb_stats_$(date +\%Y\%m\%d).json
4. Backup Extensions Configuration
# Backup PostgreSQL extensions
pg_dump --schema-only --create your_database > extensions_backup.sql
5. Test Vector Search Performance
# In Django shell
import time
from django_cfg.apps.knowbase.models import Document
start = time.perf_counter()
# list() forces evaluation in case vector_search returns a lazy queryset
results = list(Document.objects.vector_search("test query", limit=10))
duration = time.perf_counter() - start
print(f"Query took {duration * 1000:.2f}ms")
Related Documentation
- Quick Reference - Fast command lookup
- AI Agents Commands - Agent management
- Knowledge Base Guide - Complete KB documentation
Knowledge Base powers semantic search! 📚