Split Wonderware pipeline into reusable connector + pipeline by 514Ben · Pull Request #194 · 514-labs/registry

514Ben · 2026-02-06T21:53:49Z

Summary

This PR splits the Wonderware pipeline into two components:

Reusable Connector (connector-registry/wonderware/) - Handles data access from Wonderware Historian
Focused Pipeline (pipeline-registry/wonderware_to_clickhouse/) - Handles ClickHouse storage

This follows the established SAP HANA CDC pattern and enables the Wonderware connector to be reused by other pipelines.

🎯 Changes

✨ New: Wonderware Connector

Created a complete connector in connector-registry/wonderware/ with a 4-level hierarchy:

Root → Author (514-labs) → Language (python) → Implementation (default)

Core modules:

config.py - Connection configuration (host, port, database, credentials)
connection_manager.py - SQLAlchemy connection pool with circuit breaker pattern
reader.py - Data extraction (discover_tags, fetch_history_data, get_tag_count, test_connection)
connector.py - High-level facade providing simple API
models.py - Domain models (TagInfo, HistoryRow, ConnectorStatus)
Complete test suite with SQLite mock fixtures
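
For reviewers unfamiliar with the circuit breaker pattern mentioned for connection_manager.py, here is a minimal sketch of the idea. It is not the connector's actual code: the class name `CircuitBreaker`, its parameters, and `CircuitOpenError` are hypothetical names chosen for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so tests can control time
        self._failures = 0
        self._opened_at = None

    @property
    def is_open(self):
        # Open only while the reset timeout has not yet elapsed.
        if self._opened_at is None:
            return False
        return self._clock() - self._opened_at < self.reset_timeout

    def call(self, func, *args, **kwargs):
        if self.is_open:
            raise CircuitOpenError("circuit open; skipping call")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                # Open (or re-open after a failed trial call).
                self._opened_at = self._clock()
            raise
        # Any success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Once the timeout elapses, the next call is let through as a trial; a success closes the breaker and a failure re-opens it.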

Metadata:

Category: historian
Capabilities: Extract ✅, Transform ❌, Load ❌
Tags: historian, scada, aveva, wonderware, sql-server, industrial

🔄 Updated: Pipeline

Configuration changes:

wonderware_config.py: Renamed WonderwareConfig → PipelineConfig
Removed all connection fields (now in connector)
Changed env prefix: WONDERWARE_ → WONDERWARE_PIPELINE_
Kept only pipeline-specific fields: tag_chunk_size, backfill_chunk_days, sync_schedule, etc.
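
To illustrate what tag_chunk_size and backfill_chunk_days plausibly control, the sketch below splits a tag list and a date range into work units. The function names and the exact chunking semantics are assumptions made for illustration, not taken from the pipeline source.

```python
from datetime import datetime, timedelta


def chunk_tags(tags, tag_chunk_size=10):
    """Yield consecutive slices of at most tag_chunk_size tags."""
    for i in range(0, len(tags), tag_chunk_size):
        yield tags[i:i + tag_chunk_size]


def chunk_date_range(start, end, backfill_chunk_days=1):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    step = timedelta(days=backfill_chunk_days)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)  # clamp the final window
        yield cursor, window_end
        cursor = window_end


# Example: 25 tags in chunks of 10, and 2.5 days in 1-day windows.
tag_batches = list(chunk_tags([f"tag_{n}" for n in range(25)]))
windows = list(chunk_date_range(datetime(2025, 1, 1), datetime(2025, 1, 3, 12)))
```

Under these assumptions a backfill would issue one query per (tag batch, window) pair, so smaller values mean more, lighter queries.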

Workflow updates:

wonderware_sync.py: Uses WonderwareConnector with clean imports
wonderware_backfill.py: Uses WonderwareConnector with clean imports
Added _get_cached_tags() helper for Redis caching in sync workflow
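
The caching helper could follow a read-through pattern along these lines. This is a sketch under assumptions: the real `_get_cached_tags()` signature is not shown in this PR description, so `get_cached_tags`, the cache key, and the `InMemoryCache` stand-in are hypothetical. The stand-in mimics only the `get` and `setex` calls that a Redis client provides, which keeps the example runnable without a Redis server.

```python
import json


class InMemoryCache:
    """Stand-in exposing the two Redis-style calls used below."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def setex(self, key, ttl_seconds, value):
        # The TTL is ignored here; a real Redis client would expire the key.
        self._store[key] = value


def get_cached_tags(cache, discover_tags, ttl_seconds=3600, key="wonderware:tags"):
    """Return tags from the cache, falling back to discover_tags()."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    tags = list(discover_tags())
    cache.setex(key, ttl_seconds, json.dumps(tags))
    return tags
```

The default TTL of 3600 seconds mirrors the WONDERWARE_PIPELINE_TAG_CACHE_TTL default listed under Environment Variables.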

Infrastructure:

Added symlink: app/wonderware → connector source (enables clean imports without path manipulation)
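
The sketch below reproduces the symlink idea in a temporary directory so it can be tried in isolation. The directory layout and the stub package contents are invented for the demo; only the import name `wonderware` and the class name `WonderwareConnector` come from this PR. It assumes a platform where creating symlinks is permitted.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()

# Stub "connector source" package standing in for the real connector.
connector_src = os.path.join(root, "connector_src")
os.makedirs(connector_src)
with open(os.path.join(connector_src, "__init__.py"), "w") as f:
    f.write("class WonderwareConnector:\n    pass\n")

# app/wonderware -> connector source, as described above.
app_dir = os.path.join(root, "app")
os.makedirs(app_dir)
link_path = os.path.join(app_dir, "wonderware")
os.symlink(connector_src, link_path, target_is_directory=True)

# With app/ on the import path, the package resolves through the link.
sys.path.insert(0, app_dir)
wonderware = importlib.import_module("wonderware")
```

In the pipeline itself the app directory is presumably already on the import path, which would be why no path manipulation is needed there.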

Tests:

Updated test_wonderware_config.py to test PipelineConfig
Updated conftest.py with new env var prefix
Added test to verify connection fields are NOT in pipeline config

🗑️ Deleted

app/workflows/lib/wonderware_client.py - Logic moved to connector

📋 Environment Variables

Connector (unchanged from original)

WONDERWARE_HOST           # Required
WONDERWARE_PORT           # Default: 1433
WONDERWARE_DATABASE       # Default: Runtime
WONDERWARE_USERNAME       
WONDERWARE_PASSWORD       
WONDERWARE_DRIVER         # Default: mssql+pytds
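
As a rough illustration of how these variables and defaults map onto a config object, here is a sketch using a plain dataclass. `ConnectorSettings` and `from_env` are hypothetical names; the connector's actual config.py may be implemented differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ConnectorSettings:
    """Hypothetical mirror of the connector's connection fields."""

    host: str
    port: int = 1433
    database: str = "Runtime"
    username: Optional[str] = None
    password: Optional[str] = None
    driver: str = "mssql+pytds"

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ, prefix: str = "WONDERWARE_"):
        host = env.get(prefix + "HOST")
        if not host:
            # HOST is the only variable marked as required.
            raise ValueError(f"{prefix}HOST is required")
        return cls(
            host=host,
            port=int(env.get(prefix + "PORT", "1433")),
            database=env.get(prefix + "DATABASE", "Runtime"),
            username=env.get(prefix + "USERNAME"),
            password=env.get(prefix + "PASSWORD"),
            driver=env.get(prefix + "DRIVER", "mssql+pytds"),
        )
```

Passing a plain dict as `env` makes the defaults easy to check in tests without touching the real environment.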

Pipeline (new prefix)

WONDERWARE_PIPELINE_TAG_CHUNK_SIZE           # Default: 10
WONDERWARE_PIPELINE_BACKFILL_CHUNK_DAYS      # Default: 1
WONDERWARE_PIPELINE_SYNC_SCHEDULE            # Default: */1 * * * *
WONDERWARE_PIPELINE_BACKFILL_OLDEST_TIME     # Default: 2025-01-01 00:00:00
WONDERWARE_PIPELINE_TAG_CACHE_TTL            # Default: 3600
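
The pipeline side can be sketched the same way. `PipelineSettings` is a hypothetical stand-in for PipelineConfig, written to show the two points this PR makes: only WONDERWARE_PIPELINE_ variables are read, and no connection fields exist on the object.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PipelineSettings:
    """Hypothetical mirror of the pipeline-only fields and defaults."""

    tag_chunk_size: int = 10
    backfill_chunk_days: int = 1
    sync_schedule: str = "*/1 * * * *"
    backfill_oldest_time: str = "2025-01-01 00:00:00"
    tag_cache_ttl: int = 3600

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ,
                 prefix: str = "WONDERWARE_PIPELINE_"):
        defaults = cls()
        return cls(
            tag_chunk_size=int(env.get(prefix + "TAG_CHUNK_SIZE", defaults.tag_chunk_size)),
            backfill_chunk_days=int(
                env.get(prefix + "BACKFILL_CHUNK_DAYS", defaults.backfill_chunk_days)
            ),
            sync_schedule=env.get(prefix + "SYNC_SCHEDULE", defaults.sync_schedule),
            backfill_oldest_time=env.get(
                prefix + "BACKFILL_OLDEST_TIME", defaults.backfill_oldest_time
            ),
            tag_cache_ttl=int(env.get(prefix + "TAG_CACHE_TTL", defaults.tag_cache_ttl)),
        )
```

Because the prefix differs, a connector variable such as WONDERWARE_HOST set in the same environment is simply ignored here.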

✅ Benefits

Reusability - Wonderware connector can be used by other pipelines
Separation of Concerns - Connector handles data access, pipeline handles storage
Maintainability - Updates to connection logic only need to happen in the connector
Consistency - Follows established SAP HANA CDC pattern
Testing - Each component has independent test suite

🧪 Verification

✅ All connector Python files compile without syntax errors
✅ All pipeline Python files compile without syntax errors
✅ Symlink resolves correctly to connector source
✅ Import chain works: from wonderware import WonderwareConnector
✅ No path manipulation required (clean imports)

📊 Files Changed

28 files changed, 2,183 insertions(+)
18 new files in connector-registry
10 files in pipeline (new/modified)

🔍 Review Notes

Please verify:

Connector metadata is correct
Symlink works in your environment
Environment variable naming is acceptable
Tests are comprehensive
Documentation is clear

## Changes

### New: Wonderware Connector (connector-registry/wonderware/)

- Created 4-level hierarchy following SAP HANA CDC pattern
- **config.py**: Connection configuration (host, port, database, credentials)
- **connection_manager.py**: SQLAlchemy connection pool with circuit breaker
- **reader.py**: Data extraction (discover_tags, fetch_history_data)
- **connector.py**: High-level facade providing simple API
- **models.py**: Domain models (TagInfo, HistoryRow, ConnectorStatus)
- Complete test suite with mock fixtures

### Updated: Pipeline (pipeline-registry/wonderware_to_clickhouse/)

- **wonderware_config.py**: Renamed to PipelineConfig, removed connection fields
  - Changed env prefix to WONDERWARE_PIPELINE_
  - Kept only: tag_chunk_size, backfill_chunk_days, sync_schedule, etc.
- **wonderware_sync.py**: Updated to use WonderwareConnector
- **wonderware_backfill.py**: Updated to use WonderwareConnector
- **app/wonderware**: Added symlink to connector (clean imports, no path manipulation)
- **tests**: Updated to test PipelineConfig without connection fields

### Deleted

- **wonderware_client.py**: Logic moved to connector

## Benefits

- Connector can be reused by other pipelines
- Clear separation: connector handles data access, pipeline handles ClickHouse
- Follows established patterns (SAP HANA CDC)
- Each component has independent tests

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>


## Documentation Added

### Main Documentation

- **README.md**: Overview, features, installation, quick start, API summary, examples, troubleshooting

### Detailed Guides (docs/)

- **configuration.md**: Complete configuration reference
  - Environment variables
  - Connection settings
  - Security best practices
  - Advanced configuration (circuit breaker, retry logic)
  - Troubleshooting connection issues
- **getting-started.md**: Step-by-step tutorial
  - Installation and setup
  - Connection testing
  - Tag discovery
  - Historical data fetching
  - Batch processing patterns
  - Incremental sync patterns
  - Error handling examples
  - Common usage patterns
- **api-reference.md**: Complete API documentation
  - WonderwareConnector class
  - WonderwareConfig class
  - WonderwareReader class
  - ConnectionPool class
  - Data models (TagInfo, HistoryRow, ConnectorStatus)
  - Exceptions
  - Type hints and advanced usage

## Coverage

- ✅ Installation instructions (standalone + bundled)
- ✅ Configuration guide with all options
- ✅ Quick start examples
- ✅ Complete API reference
- ✅ Usage patterns and best practices
- ✅ Error handling examples
- ✅ Security best practices
- ✅ Troubleshooting guide

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

## Changes

Updated pipeline README.md to reflect the new architecture where the pipeline uses the reusable Wonderware connector:

### Architecture Updates

- Added "What's New" section explaining connector split
- Updated component diagram showing connector as external dependency
- Updated data flow diagram with connector layer
- Clarified separation: connector handles data access, pipeline handles storage

### Configuration Updates

- Split configuration section into:
  - Connector config (WONDERWARE_* prefix)
  - Pipeline config (WONDERWARE_PIPELINE_* prefix)
- Added links to connector configuration documentation
- Clarified which settings belong where

### Code References Updates

- Replaced references to `wonderware_client.py` with connector API
- Updated workflow descriptions to show connector usage
- Added import examples: `from wonderware import WonderwareConnector`
- Removed outdated `WonderwareClient` references

### Troubleshooting Updates

- Added section for connector-specific issues
- Added links to connector troubleshooting guide
- Updated connection testing examples to use connector

### Documentation Links

- Added "Related Documentation" section with links to:
  - Connector README
  - Connector configuration guide
  - Connector API reference

## Impact

- Users now understand the two-component architecture
- Clear separation between connector and pipeline configuration
- Updated examples use the new connector API
- All internal references are now accurate

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

## Documentation Added

### getting-started.md (Updated)

Complete step-by-step tutorial covering:

- Prerequisites and installation
- Configuration (split connector vs pipeline)
- Starting the pipeline and testing connection
- Running historical backfill with Temporal UI
- Monitoring with Temporal UI and APIs
- Querying data via REST API and ClickHouse
- Next steps and troubleshooting

### configuration.md (New)

Detailed configuration reference with:

- Configuration overview (two-namespace model)
- Connector configuration (WONDERWARE_*)
- Pipeline configuration (WONDERWARE_PIPELINE_*)
- ClickHouse and Redis configuration
- Performance tuning guidelines
- Security configuration best practices
- Environment-specific configurations
- Configuration validation scripts

### workflows.md (New)

Complete workflow documentation:

- Backfill workflow (4-task DAG)
  - Task-by-task breakdown with code
  - Performance optimization tips
  - Best practices
- Sync workflow (single task)
  - Watermark logic explanation
  - Caching strategy
  - Sync frequency tuning
- Workflow management (pause/cancel/retry)
- Error handling and debugging
- Monitoring and alerting

### apis.md (New)

Complete API reference:

- All REST endpoints documented
- Request/response formats
- Query parameters
- Example curl, Python, JavaScript
- Error handling
- Rate limiting guidance
- Real-world usage examples (dashboard, export, monitoring)
- Grafana integration guide

## Coverage

- ✅ Installation and setup
- ✅ Configuration (connector + pipeline)
- ✅ Workflows (backfill + sync)
- ✅ APIs (all endpoints)
- ✅ Monitoring and debugging
- ✅ Performance tuning
- ✅ Security best practices
- ✅ Production deployment guidance
- ✅ Troubleshooting guides
- ✅ Code examples in multiple languages

## Total Documentation

- **4 comprehensive guides** (~600+ lines each)
- **~2,400 lines** of detailed documentation
- **Numerous code examples** (Python, Bash, SQL, JavaScript)
- **Diagrams and architecture explanations**
- **Links to connector documentation**

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Replace incorrect npm installation command with the correct bash script installation method for Moose CLI.

Changes:

- docs/getting-started.md: Updated Moose installation from npm to bash script

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Changes:

- Updated all .python-version files to 3.13
- Updated all README.md and getting-started.md files
- Updated setup.py python_requires and classifiers
- Affects: wonderware_to_clickhouse, qvd_to_clickhouse, sap_hana_cdc_to_clickhouse, and sap_hana_cdc connector

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Removed "Option B: Install Dependencies Only" section from the connector getting-started guide. The connector should be installed from the registry, not as standalone dependencies.

Changes:

- docs/getting-started.md: Removed Option B section
- Simplified to single installation method

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Removed pip-related instructions since we're not targeting new Python users. Users are expected to already have pip installed with Python.

Changes:

- wonderware_to_clickhouse/docs/getting-started.md: Removed pip prerequisite section
- qvd_to_clickhouse/docs/getting-started.md: Removed pip/uv package manager line

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>


514Ben and others added 2 commits February 6, 2026 17:42

514Ben wants to merge 8 commits into main from wonderware-pipeline
