feat: Docker runtime improvements and srsly dependency removal #58
Merged
## Changes Made

### 🐛 Dependency Issues Fixed
- Remove spacy dependency to eliminate srsly C compilation issues
- Move SPARQLWrapper from test to main dependencies
- Update poetry.lock to reflect dependency changes

### 🐳 Docker Runtime Improvements
- Add comprehensive environment variable support for flexible path configuration
- Support KBP_WORK_DIR, KBP_HOME, KBP_CONFIG_PATH, and other runtime variables
- Create multi-stage Dockerfile with proper C extension handling
- Add host networking support for local service connectivity

### 🛠️ New Docker Tooling
- Add docker-run.sh wrapper script for easy Docker usage
- Create docker-compose.app.yml for persistent services
- Add comprehensive Docker usage documentation in README-docker.md
- Support volume mounting with proper path resolution

### 📁 Path Configuration Enhancements
- Make .kbp directory location configurable via environment variables
- Support custom config file paths through KBP_CONFIG_PATH
- Improve config file detection with multiple fallback locations
- Enable flexible working directory configuration

### 🚀 Deployment Ready
- GitHub Actions workflow for multi-platform artifact building
- Support for Docker images, Python wheels, and standalone executables
- Cloud native buildpack configuration (project.toml)
- Comprehensive troubleshooting and usage examples

## Breaking Changes
- spacy dependency removed (EntityRecognizer disabled by default anyway)
- Config path resolution now prioritizes environment variables

## Benefits
- ✅ Eliminates srsly compilation issues in cloud builds
- ✅ Docker containers work with any directory structure
- ✅ Easy deployment across different environments
- ✅ Configurable paths for various runtime scenarios
- ✅ Network connectivity to host services (SPARQL endpoints)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: David Stenglein <dave@missingmass.com>
Summary
This PR significantly improves Docker runtime configuration and removes the problematic `srsly` dependency that was causing compilation issues with cloud native buildpacks.

Problem Solved
The original issue was that `srsly` (a spaCy dependency) contains C extensions that require compilation, causing failures in cloud native buildpack environments. Additionally, the CLI was hardcoded to specific paths, making Docker usage difficult.

Key Changes
🔧 Dependency Management
🐳 Docker Runtime Configuration
- `KBP_WORK_DIR`: Working directory
- `KBP_HOME`: Configuration directory (replaces hardcoded `~/.kbp`)
- `KBP_CONFIG_PATH`: Custom config file location
- `KBP_KNOWLEDGE_BASE_PATH`: Documents directory
- `KBP_METADATA_STORE_PATH`: Metadata storage location

🛠️ New Docker Tooling
- `scripts/docker-run.sh`: User-friendly wrapper script with host networking
- `docker-compose.app.yml`: Production-ready compose configuration
- `README-docker.md`: Comprehensive Docker usage guide

📁 Path Flexibility
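To illustrate how environment variables can take priority during path resolution, here is a minimal sketch. It is not the PR's actual implementation: the config file name `config.yaml`, the helper names, and the exact fallback order are assumptions made for the example. Only the variable names `KBP_HOME`, `KBP_WORK_DIR`, and `KBP_CONFIG_PATH` and the `~/.kbp` default come from the description above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical config file name; the PR description does not state the real one.
CONFIG_FILENAME = "config.yaml"


def resolve_home(env: Mapping[str, str]) -> Path:
    """Return the configuration directory: KBP_HOME if set, else ~/.kbp."""
    override = env.get("KBP_HOME")
    return Path(override) if override else Path.home() / ".kbp"


def resolve_work_dir(env: Mapping[str, str]) -> Path:
    """Return the working directory: KBP_WORK_DIR if set, else the current directory."""
    override = env.get("KBP_WORK_DIR")
    return Path(override) if override else Path.cwd()


def find_config(env: Mapping[str, str]) -> Optional[Path]:
    """Return the first existing config file, or None if none is found.

    Assumed lookup order (illustrative only):
    1. the file named by KBP_CONFIG_PATH
    2. <work dir>/.kbp/<CONFIG_FILENAME>
    3. <home dir>/<CONFIG_FILENAME>
    """
    candidates = []
    explicit = env.get("KBP_CONFIG_PATH")
    if explicit:
        candidates.append(Path(explicit))
    candidates.append(resolve_work_dir(env) / ".kbp" / CONFIG_FILENAME)
    candidates.append(resolve_home(env) / CONFIG_FILENAME)
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_config(os.environ)` would apply this lookup to the current process environment. Passing the environment as a mapping, rather than reading `os.environ` inside each function, keeps the lookup easy to test.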
Testing
```bash
# Build and test Docker image
docker build -t knowledgebase-processor:latest .

# Test with environment variables
docker run --rm -v "$(pwd):/workspace" \
  -e KBP_WORK_DIR=/workspace \
  -e KBP_HOME=/workspace/.kbp \
  -w /workspace \
  knowledgebase-processor:latest kb --help

# Test wrapper script
./scripts/docker-run.sh init
./scripts/docker-run.sh scan
```
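The `docker run` invocation above can also be assembled programmatically, which shows roughly what a wrapper has to do. The sketch below is an assumption-laden illustration, not the contents of `scripts/docker-run.sh`: the function name `build_docker_command` and its parameters are hypothetical, while the image tag, mount point, and `KBP_*` variables are taken from the commands above. `--network host` is the standard Docker flag for host networking.

```python
import shlex
from pathlib import Path
from typing import List, Sequence

# Image tag taken from the build command in the Testing section.
IMAGE = "knowledgebase-processor:latest"
# Mount point inside the container, as used in the Testing section.
MOUNT = "/workspace"


def build_docker_command(
    work_dir: Path, kb_args: Sequence[str], host_network: bool = True
) -> List[str]:
    """Build the argv list for running `kb` in a container.

    `work_dir` should be an absolute host path, since Docker bind
    mounts require one. This helper is hypothetical.
    """
    cmd = ["docker", "run", "--rm"]
    if host_network:
        # Lets the container reach services on the host, e.g. a local SPARQL endpoint.
        cmd += ["--network", "host"]
    cmd += [
        "-v", f"{work_dir}:{MOUNT}",
        "-e", f"KBP_WORK_DIR={MOUNT}",
        "-e", f"KBP_HOME={MOUNT}/.kbp",
        "-w", MOUNT,
        IMAGE, "kb", *kb_args,
    ]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_docker_command(Path.cwd(), ["--help"])))
```

Returning a list of arguments rather than a single string avoids shell quoting problems if the directory name contains spaces; the list can be passed directly to `subprocess.run`.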
Usage Examples
Quick Start
```bash
# Initialize and scan documents
./scripts/docker-run.sh init
./scripts/docker-run.sh scan

# With custom directory
./scripts/docker-run.sh -w ~/Documents init

# Continuous monitoring with host network access
./scripts/docker-run.sh publish --watch
```
Docker Compose
```bash
# Interactive mode
docker-compose -f docker-compose.app.yml up kbp

# Watch mode with Fuseki
docker-compose -f docker-compose.app.yml up fuseki kbp-watch
```
Benefits
✅ No more compilation issues - Cloud native buildpacks now work
✅ Flexible Docker deployment - Works with any directory structure
✅ Host network connectivity - Can access local SPARQL endpoints
✅ Configurable paths - Adapts to different runtime environments
✅ Production ready - Complete tooling and documentation
Breaking Changes
Migration Guide
For users currently using spacy/entity recognition:
For Docker users:
🤖 Generated with Claude Code