Releases: mingjerli/clgraph

Release v0.0.3

31 Dec 13:46
8a08848

[0.0.3] - 2025-12-29

Added

AI/LLM Agent Integration

  • Agent module (clgraph.agent): Build lineage-aware AI agents with LangChain
  • Tools module (clgraph.tools): LangChain-compatible tools for lineage queries
    • Lineage tools: trace columns backward/forward, find paths
    • Governance tools: PII detection, impact analysis
    • SQL tools: query generation, validation
    • Schema tools: table/column lookup
    • Context tools: pipeline summary, metadata
  • MCP Server (clgraph.mcp): Model Context Protocol server for Claude Desktop integration

Core Features

  • to_simplified() method for an input/output-only lineage graph (filters out internal CTEs and subqueries)
  • build_subpipeline() convenience method for extracting sub-pipelines
  • JSON round-trip serialization: Pipeline.from_json() and Pipeline.from_json_file()
  • Template variable support in Pipeline class with template_context parameter
  • Validation framework with structured issue reporting (ValidationIssue, add_issue())
  • Logging for validation issues at library level (logger: clgraph.validation)
  • Enhanced validation for unqualified columns in JOIN conditions
  • COUNT(*) resolution to individual columns when schema is known
  • Star (*) expansion for cross-query column lineage with EXCEPT/REPLACE support
  • API validation mode with auto-generated API dictionary
  • __repr__ methods for QueryUnit, QueryUnitGraph, and Pipeline for better debugging
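
To illustrate the star (*) expansion item above, here is a minimal sketch of how SELECT * with EXCEPT/REPLACE modifiers can be resolved against a known column list. It is not clgraph's implementation: expand_star and its parameters are hypothetical names chosen for this example, and the schema is made up.

```python
def expand_star(columns, except_cols=(), replace_exprs=None):
    """Expand SELECT * into (output_name, source_expression) pairs.

    Hypothetical helper for illustration only; not part of clgraph.
    """
    replace_exprs = replace_exprs or {}
    excluded = set(except_cols)
    expanded = []
    for col in columns:
        if col in excluded:
            # EXCEPT drops the column from the output entirely.
            continue
        # REPLACE keeps the column name but swaps in a new expression.
        expanded.append((col, replace_exprs.get(col, col)))
    return expanded


# Made-up upstream schema for the example.
schema = ["id", "email", "amount"]
result = expand_star(
    schema,
    except_cols=["email"],
    replace_exprs={"amount": "amount * 100"},
)
print(result)
```

Each output column keeps a link to its source expression, which is what makes column-level lineage traceable through a star select once the upstream schema is known.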

Visualization

  • Consolidated visualization functions into library (clgraph.visualizations)
  • visualize_pipeline_lineage(), visualize_table_dependencies(), visualize_column_lineage()

Examples

  • ClickHouse example with enterprise data pipeline (raw → staging → analytics → marts)
  • Enterprise demo with Ollama for local LLM integration

Changed

  • Breaking: Renamed package from clpipe to clgraph
  • Breaking: Removed GraphVizExporter class (use visualize_* functions from clgraph.visualizations instead)
  • Renamed query_lineages to query_graphs for clarity
  • Unified column naming for cross-pipeline lineage
  • Removed redundant save_metadata, load_metadata, apply_metadata methods (use to_json/from_json instead)
  • Filtered redundant input star nodes from lineage graph
  • Updated minimum sqlglot version to >=28.0.0

Fixed

  • Pin Airflow to 2.x for API stability (3.x has breaking changes)
  • Handle sqlglot 28.x breaking change in EXCEPT/REPLACE key names
  • Exclude star nodes from simplified lineage view
  • Treat SELECT queries without a destination as virtual result tables ({query_id}_result)
  • Sanitize Graphviz node IDs to avoid colon port syntax issues
  • Handle Schema objects in multi_query with version fallback
  • Fix metadata propagation with two-pass approach
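
The Graphviz fix above addresses a general pitfall: Graphviz reads a colon in a node reference as the start of a port, so an ID such as table:column can be split unexpectedly. The sketch below shows one common way to avoid this, assuming IDs are restricted to letters, digits, and underscores. The function name sanitize_node_id is hypothetical, and the actual approach used in clgraph may differ.

```python
import re


def sanitize_node_id(raw: str) -> str:
    """Replace every character outside [A-Za-z0-9_] with an underscore.

    Hypothetical helper for illustration only; not part of clgraph.
    """
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)


# Made-up fully qualified column name for the example.
original = "analytics.orders:amount"
node_id = sanitize_node_id(original)
print(node_id)  # analytics_orders_amount
```

The original string can still be shown as the node's label, so only the internal ID changes. Distinct names can collapse to the same ID under this scheme (for example a.b and a:b), so a real implementation would need a way to keep IDs unique.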

Documentation

  • Revamped README with user-focused messaging and updated examples
  • Added architecture diagram
  • Added illustration and expanded introduction with use cases
  • Comprehensive docstrings and output examples in README