gorakhnathy7/datahub-signoz-integration

DataHub + SigNoz Metadata Integration

This repository contains a working integration that turns live service health (from SigNoz) into actionable metadata in DataHub: operational incidents, entity tagging, and lineage-based downstream sub-entity tagging. It is designed to be easy to run, easy to explain, and easy to adapt.

What this showcases

  • Operational context in DataHub: creates an operational incident when a service is unhealthy.
  • Automated blast radius: applies a failure tag to downstream datasets via lineage.
  • Live telemetry: health is read from the SigNoz Query Range API.

Architecture at a glance

  1. OpenTelemetry Demo emits traces/metrics.
  2. SigNoz stores them and exposes metrics via API.
  3. The integration polls SigNoz, computes health, and emits DataHub assertion runs.
  4. DataHub incidents + lineage tags visualize impact.
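Step 3 above boils down to turning span status counts into a health verdict on each polling cycle. A minimal sketch of that computation, assuming a simple error-rate threshold (`compute_health` and the 5% cutoff are illustrative, not the actual `bridge.py` API):

```python
ERROR_RATE_THRESHOLD = 0.05  # illustrative: >5% errored spans marks the service unhealthy

def compute_health(status_counts):
    """Derive a health verdict from span status counts returned by SigNoz.

    status_counts maps an OTel status code (e.g. "STATUS_CODE_ERROR")
    to the number of spans with that status in the queried window.
    """
    total = sum(status_counts.values())
    if total == 0:
        return "unknown"  # no telemetry in the window; don't raise an incident
    errors = status_counts.get("STATUS_CODE_ERROR", 0)
    return "unhealthy" if errors / total > ERROR_RATE_THRESHOLD else "healthy"

# The integration would run this once per polling cycle (~60s, per the
# Recovery section) and emit a DataHub assertion run with the result.
```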

Repo layout

This repo contains only the integration. It expects the following layout, with the other stacks cloned as siblings in the same parent directory:

  • ../signoz/ -> SigNoz stack (Docker Compose)
  • ../otel-demo/ -> OpenTelemetry demo app (traffic generator)
  • ../datahub/ -> DataHub quickstart (Docker Compose)
  • integration/ -> Integration implementation + docs (this README)

Prerequisites

  • Docker with Docker Compose
  • Python 3 (for the integration virtualenv)
  • The sibling repos listed above cloned into the same parent directory

Quickstart (end-to-end)

You can use separate terminals for each stack.

Terminal A: SigNoz

cd ../signoz/deploy/docker
docker compose up -d

Terminal B: DataHub quickstart

DATAHUB_VERSION=head UI_INGESTION_DEFAULT_CLI_VERSION=head DATAHUB_MAPPED_GMS_PORT=8082 \
docker compose -p datahub -f ../datahub/docker/quickstart/docker-compose.quickstart.yml up -d

Verify DataHub is healthy:

curl -i http://localhost:8082/health

Terminal C: OpenTelemetry Demo (traffic generator)

cd ../otel-demo
FRONTEND_PORT=8083 ENVOY_PORT=8083 \
OTEL_COLLECTOR_HOST=host.docker.internal \
OTEL_COLLECTOR_PORT_GRPC=4317 OTEL_COLLECTOR_PORT_HTTP=4318 \
PUBLIC_OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:8083/otlp-http/v1/traces \
docker compose up -d

Confirm telemetry: open the SigNoz UI and check that the demo services appear in the Services view.

Create a SigNoz API key:

  • SigNoz UI -> Settings -> API Keys -> Create

Terminal D: Integration (venv)

cd integration
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Start the integration (reads live SigNoz metrics only):

export DATAHUB_SERVER=http://localhost:8082
export SIGNOZ_API_URL=http://localhost:8080
export SIGNOZ_API_KEY=YOUR_API_KEY
export SIGNOZ_SERVICE_NAME=payment
export SIGNOZ_SERVICE_FILTER_KEY=service.name
export SIGNOZ_STATUS_CODE_KEY=status.code
# If SigNoz returns 400 for filter expressions, disable filters and let the
# integration filter locally by service name.
export SIGNOZ_DISABLE_FILTERS=true
python bridge.py
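The exports above drive the integration's configuration. A minimal sketch of how such env-driven settings might be loaded, with safe defaults matching this quickstart (`BridgeConfig` and `load_config` are illustrative names, not the actual `bridge.py` internals):

```python
import os
from dataclasses import dataclass

@dataclass
class BridgeConfig:
    datahub_server: str
    signoz_api_url: str
    signoz_api_key: str
    service_name: str
    disable_filters: bool

def load_config(env=os.environ):
    """Read the integration settings from the environment (illustrative)."""
    return BridgeConfig(
        datahub_server=env.get("DATAHUB_SERVER", "http://localhost:8082"),
        signoz_api_url=env.get("SIGNOZ_API_URL", "http://localhost:8080"),
        signoz_api_key=env["SIGNOZ_API_KEY"],  # required: no sensible default
        service_name=env.get("SIGNOZ_SERVICE_NAME", "payment"),
        # SIGNOZ_DISABLE_FILTERS=true falls back to local filtering by service name
        disable_filters=env.get("SIGNOZ_DISABLE_FILTERS", "false").lower() == "true",
    )
```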

Notes:

  • The integration always reads from the SigNoz API (no file-based fallback).
  • If payment is not listed in SigNoz, set SIGNOZ_SERVICE_NAME to a service shown in the SigNoz Services view.

Common setup fixes

OTEL demo shows no services in SigNoz

Make sure the OTEL demo is exporting to the SigNoz collector on your host:

  • OTEL_COLLECTOR_HOST=host.docker.internal
  • OTEL_COLLECTOR_PORT_GRPC=4317
  • OTEL_COLLECTOR_PORT_HTTP=4318

OTEL collector crash: "bind: cannot assign requested address"

If otel-collector is crash-looping on port 4317, disable it (the demo can export directly to SigNoz):

cd ../otel-demo
OTEL_COLLECTOR_HOST=host.docker.internal \
OTEL_COLLECTOR_PORT_GRPC=4317 \
OTEL_COLLECTOR_PORT_HTTP=4318 \
docker compose up -d --scale otel-collector=0

OTEL demo container name conflicts

If you see “container name already in use”, clean up and restart:

cd ../otel-demo
docker compose down -v --remove-orphans
docker rm -f \
  otel-collector llm image-provider fraud-detection cart product-catalog payment shipping ad email flagd-ui accounting quote currency \
  jaeger grafana valkey-cart opensearch postgresql kafka prometheus flagd \
  product-reviews recommendation checkout frontend load-generator frontend-proxy
docker network rm opentelemetry-demo

DataHub upgrade OOM-killed

If datahub-upgrade exits with OOM, increase Docker Desktop memory to 10–12 GB and re-run:

docker compose -p datahub -f ../datahub/docker/quickstart/docker-compose.quickstart.yml down
DATAHUB_VERSION=head UI_INGESTION_DEFAULT_CLI_VERSION=head DATAHUB_MAPPED_GMS_PORT=8082 \
docker compose -p datahub -f ../datahub/docker/quickstart/docker-compose.quickstart.yml up -d

Verify the demo in DataHub

Incident (service health)

  • Open the Kafka dataset: payment-ingestion-worker
  • Incidents tab shows Service Health Failure (Operational, Active)

Lineage + blast radius

  • Open the Hive dataset: payment_table
  • Lineage -> Downstreams should include:
    • payment_agg_table -> payment_mart -> payment_reporting_table
  • Each downstream asset should have the Upstream_Failure tag
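The blast radius above amounts to a downstream walk over the lineage graph. A minimal sketch, assuming lineage is available as an upstream-to-downstreams mapping (the real integration reads lineage from DataHub; `LINEAGE` and `collect_downstreams` here are illustrative):

```python
from collections import deque

# Illustrative lineage: each dataset maps to its direct downstreams,
# mirroring the payment_table chain shown above.
LINEAGE = {
    "payment_table": ["payment_agg_table"],
    "payment_agg_table": ["payment_mart"],
    "payment_mart": ["payment_reporting_table"],
}

def collect_downstreams(root, lineage):
    """Breadth-first walk collecting every dataset downstream of `root`."""
    seen, queue = set(), deque(lineage.get(root, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(lineage.get(node, []))
    return seen

# Every dataset returned here would receive the Upstream_Failure tag
# while the incident is active, and have it removed on recovery.
```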

Recovery (optional)

  1. Turn the failure flag off in Flagd UI: http://localhost:55518
  2. Wait ~60 seconds for the next polling cycle
  3. Confirm:
    • Incident becomes Resolved
    • Upstream_Failure tag is removed from downstream assets

Cleanup demo artifacts in DataHub

This removes the mock datasets, custom assertion, and the demo incident.

python -m bridge_app.cleanup

Notes:

  • This does not delete your real datasets or tags.
  • It only removes entities created by the demo.
  • It also deletes the local incident_state.json file.

Using this with a real service

  1. Instrument your service with OpenTelemetry

    • Set service.name to the logical name you want to monitor.
    • Export OTLP to your SigNoz collector (gRPC 4317 or HTTP 4318).
  2. Verify telemetry in SigNoz

    • SigNoz -> Services -> confirm your service.name is listed.
  3. Configure the integration

export SIGNOZ_SERVICE_NAME=your-service-name
export SIGNOZ_API_URL=https://your-signoz-host
export SIGNOZ_API_KEY=your-signoz-api-key
export SIGNOZ_SERVICE_FILTER_KEY=service.name
export SIGNOZ_STATUS_CODE_KEY=status.code
export SERVICE_URN="urn:li:dataset:(urn:li:dataPlatform:your_platform,your_service_dataset,PROD)"
export PRIMARY_TABLE_URN="urn:li:dataset:(urn:li:dataPlatform:your_platform,your_primary_table,PROD)"
  4. Run the integration

    • Incidents and lineage tags will reflect your live production health signal.
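The URN exports above follow DataHub's dataset URN shape. A small helper for building them, so platform, name, and environment stay consistent (`make_dataset_urn` is an illustrative sketch; the DataHub Python SDK ships equivalent builders):

```python
def make_dataset_urn(platform, name, env="PROD"):
    """Build a DataHub dataset URN in the form used by SERVICE_URN above.

    platform: the dataPlatform id (e.g. "kafka", "hive")
    name:     the dataset name as registered in DataHub
    env:      the fabric/environment, PROD by default
    """
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"
```

For example, `make_dataset_urn("hive", "payment_table")` yields the URN of the primary table used in the demo.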

Troubleshooting

  • SigNoz returns empty data
    • Confirm your service name in SigNoz Services view.
    • Ensure your API key is valid and has access.
    • Verify the SigNoz URL points to the UI/API host.
  • SigNoz query returns 400 (key not found or syntax errors)
    • Use context-aware keys:
      • SIGNOZ_SERVICE_FILTER_KEY=service.name
      • SIGNOZ_STATUS_CODE_KEY=status.code
    • If it still fails, set:
      • SIGNOZ_DISABLE_FILTERS=true
  • DataHub incident does not appear
    • Check DATAHUB_SERVER and ensure GMS is healthy.
    • Look for integration logs about assertion upsert or incident creation.
  • Lineage tags do not appear
    • Ensure the mock lineage setup ran (integration logs).
    • Verify dataset URNs in config match DataHub entities.

Thanks for trying DataHub. If you have questions, feel free to drop them in our Slack channel.
