@rvguha rvguha commented Dec 19, 2025

Summary

  • Add PostgreSQL backend for conversation storage with compact (whitespace-free) JSON serialization
  • Implement pluggable conversation storage backends (PostgreSQL, Azure Table Storage)
  • Add conversation management utilities and authentication framework
  • Update protocol models to properly serialize @type fields using Pydantic aliases
  • Rename query.decontextualized_text to query.decontextualized_query for consistency
  • Add TEST_USER environment variable support for testing
  • Consolidate LLM and embedding providers into provider-specific packages
  • Remove duplicate provider implementations from bundles package
  • Add request context tracking and rate limiting infrastructure
  • Clean up debug logging throughout codebase
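
As a rough illustration of what "pluggable conversation storage backends" can mean, here is a minimal sketch of a shared interface with a stand-in backend. The names `ConversationStorage`, `InMemoryStorage`, `save_turn`, and `get_turns` are hypothetical and are not taken from this PR; a PostgreSQL or Azure Table Storage backend would presumably implement the same methods against its own client.

```python
import asyncio
from abc import ABC, abstractmethod


class ConversationStorage(ABC):
    """Hypothetical interface; the PR's actual class and method names may differ."""

    @abstractmethod
    async def save_turn(self, conversation_id: str, turn: dict) -> None:
        """Persist one turn (request + response) for a conversation."""

    @abstractmethod
    async def get_turns(self, conversation_id: str) -> list[dict]:
        """Return all stored turns for a conversation, oldest first."""


class InMemoryStorage(ConversationStorage):
    """Stand-in backend for illustration only; it keeps turns in a dict."""

    def __init__(self) -> None:
        self._turns: dict[str, list[dict]] = {}

    async def save_turn(self, conversation_id: str, turn: dict) -> None:
        self._turns.setdefault(conversation_id, []).append(turn)

    async def get_turns(self, conversation_id: str) -> list[dict]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._turns.get(conversation_id, []))


async def demo(storage: ConversationStorage) -> list[dict]:
    # Calling code depends only on the interface, not on the backend.
    await storage.save_turn("c1", {"request": "hi", "response": "hello"})
    return await storage.get_turns("c1")
```

Because `demo` accepts any `ConversationStorage`, swapping backends would only change which object is constructed at startup.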

Key Features

PostgreSQL Conversation Storage

  • Full conversation turn storage (request + response)
  • JSONB columns with compact JSON (serialized with separators=(',', ':'), so no extra whitespace)
  • Proper @type serialization in stored data
  • Async PostgreSQL client using asyncpg
  • Schema initialization on server startup
  • dump_conversations.py utility for viewing stored conversations
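
The effect of `separators=(',', ':')` can be seen with the standard library alone. This is a minimal sketch; the sample `turn` payload is made up for illustration and is not the PR's actual schema.

```python
import json

# Illustrative payload only; field names other than "@type" are assumptions.
turn = {
    "@type": "ConversationTurn",
    "request": {"query": "hello"},
    "response": {"results": []},
}

# Default separators are (", ", ": "), which add a space after each
# comma and colon.
default_text = json.dumps(turn)

# Compact form: no whitespace between tokens.
compact_text = json.dumps(turn, separators=(",", ":"))
```

The compact string round-trips to the same object via `json.loads`. Note that PostgreSQL's `jsonb` type stores a parsed binary representation and does not preserve input whitespace, so the compact form mainly affects the text sent over the wire or stored in text columns.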

Protocol Updates

  • Fixed Pydantic serialization to use @type instead of schema_type
  • Added ser_json_by_alias=True to model configs
  • Consistent field naming across request/response
  • Support for meta.user and meta.remember fields
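
The reason an alias is needed at all is that `@type` is not a valid Python identifier, so the attribute must carry another name (here `schema_type`) and be renamed on output. The sketch below shows that idea using only standard-library dataclasses; it is not the PR's Pydantic code, and `Item` and `to_json_by_alias` are hypothetical names.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class Item:
    # The Python attribute is schema_type; the wire name lives in metadata.
    schema_type: str = field(metadata={"alias": "@type"})
    name: str = ""


def to_json_by_alias(obj) -> str:
    """Serialize a dataclass, emitting each field under its alias if it has one."""
    payload = {
        f.metadata.get("alias", f.name): getattr(obj, f.name)
        for f in fields(obj)
    }
    return json.dumps(payload, separators=(",", ":"))


serialized = to_json_by_alias(Item(schema_type="Item", name="example"))
```

In Pydantic the same effect comes from declaring the alias on the field and serializing by alias, which is what the bullets above describe.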

Provider Consolidation

  • Moved all LLM providers to packages/providers/*/models
  • Removed duplicate implementations from packages/bundles/models
  • Cleaner separation of concerns

Test plan

  • Server starts successfully with PostgreSQL conversation storage
  • Conversations are saved to PostgreSQL with proper @type serialization
  • dump_conversations.py displays saved conversations correctly
  • JSON is compact (no whitespace) in database
  • TEST_USER environment variable is loaded and sent with queries
  • decontextualized_query field works correctly

🤖 Generated with Claude Code

rvguha and others added 2 commits December 19, 2025 10:24
Major changes:
- Add PostgreSQL backend for conversation storage with compressed JSON
- Implement conversation storage backends (PostgreSQL, Azure Table Storage)
- Add conversation management utilities and authentication
- Update protocol models to use @type serialization with Pydantic aliases
- Rename query.decontextualized_text to query.decontextualized_query
- Add TEST_USER environment variable support
- Consolidate LLM and embedding providers into provider-specific packages
- Remove duplicate provider implementations from bundles package
- Add request context tracking and rate limiting infrastructure
- Clean up debug logging throughout codebase

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Resolved conflicts:
- config.py: Added object_storage configuration from main
- llm.py: Kept debug logging and model config extraction
- server.py: Added /config endpoint for TEST_USER
- pi_labs.py: Removed (consolidated into providers)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Sanitize user-provided values in log statements to prevent log injection attacks:
- baseNLWeb.py: Sanitize query text before logging
- conversation/auth.py: Sanitize conversation_id and user_id values
- rate_limiter.py: Sanitize client_id before logging

All newline and carriage return characters are escaped to prevent log forgery.
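
A minimal sketch of the escaping described above, assuming a helper named `sanitize_for_log` (a hypothetical name; the functions actually used in baseNLWeb.py, conversation/auth.py, and rate_limiter.py are not shown in this PR description):

```python
def sanitize_for_log(value: object) -> str:
    """Escape CR and LF so a user-supplied value cannot start a new log line."""
    return str(value).replace("\r", "\\r").replace("\n", "\\n")


# A query crafted to forge a second log entry.
malicious_query = "hello\n2025-12-19 10:24:00 INFO admin logged in"

# After sanitizing, the whole value stays on one log line.
safe_query = sanitize_for_log(malicious_query)
```

The sanitized value would then be passed to the logger in place of the raw input, for example `logger.info("query=%s", safe_query)`.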

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@rvguha rvguha merged commit ea1eec2 into main Dec 19, 2025
2 checks passed