This repository contains a working integration that turns live service health (from SigNoz) into actionable metadata in DataHub: operational incidents, entity tagging, and lineage-based downstream sub-entity tagging. It is designed to be easy to run, easy to explain, and easy to adapt.
- Operational context in DataHub: creates an operational incident when a service is unhealthy.
- Automated blast radius: applies a failure tag to downstream datasets via lineage.
- Live telemetry: health is read from the SigNoz Query Range API.
- OpenTelemetry Demo emits traces/metrics.
- SigNoz stores them and exposes metrics via API.
- The integration polls SigNoz, computes health, and emits DataHub assertion runs.
- DataHub incidents + lineage tags visualize impact.
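The flow above can be sketched as a minimal polling cycle. This is an illustrative sketch only: the helper names, the stubbed error rate, and the 5% threshold are assumptions, not the bridge's actual internals.

```python
# Illustrative sketch of one polling cycle: SigNoz -> health -> DataHub.
# fetch_error_rate / compute_health / poll_once are hypothetical names.

def fetch_error_rate(service: str) -> float:
    """Stand-in for a SigNoz query; returns the fraction of failed spans."""
    return 0.12  # stubbed value for illustration

def compute_health(error_rate: float, threshold: float = 0.05) -> str:
    # A service is considered unhealthy when its error rate exceeds
    # the threshold (the 5% default is an assumption for this sketch).
    return "UNHEALTHY" if error_rate > threshold else "HEALTHY"

def poll_once(service: str) -> str:
    status = compute_health(fetch_error_rate(service))
    # In the real integration, this is where a DataHub assertion run is
    # emitted and, on failure, an incident plus downstream tags.
    return status
```

In the real bridge this cycle repeats roughly every 60 seconds (see the verification steps later in this README).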
This repo contains only the integration. It expects the following sibling repos in the same parent directory:
- `../signoz/` -> SigNoz stack (Docker Compose)
- `../otel-demo/` -> OpenTelemetry demo app (traffic generator)
- `../datahub/` -> DataHub quickstart (Docker Compose)
- `integration/` -> Integration implementation + docs (this README)
- Docker + Docker Compose
- Python 3.11+ (create a virtualenv if you don’t already have one)
- Docker Desktop memory: 10–12 GB recommended (DataHub + SigNoz + OTEL is heavy)
- Free local ports:
- SigNoz UI: http://localhost:8080
- DataHub UI: http://localhost:9002
- DataHub GMS: http://localhost:8082
- OpenTelemetry Demo UI: http://localhost:8083
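Before starting the stacks, you can quickly probe whether those ports are already taken. This is a small standalone sanity-check script, not part of the integration:

```python
import socket

# The four local ports this README expects to be free.
PORTS = {
    "SigNoz UI": 8080,
    "DataHub UI": 9002,
    "DataHub GMS": 8082,
    "OpenTelemetry Demo UI": 8083,
}

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0

def busy_ports(ports=PORTS):
    """Names of services whose port is already occupied."""
    return [name for name, port in ports.items() if port_in_use(port)]
```

If `busy_ports()` returns a non-empty list before you start anything, free those ports (or remap them) first.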
You can use separate terminals for each stack.
cd ../signoz/deploy/docker
docker compose up -d
DATAHUB_VERSION=head UI_INGESTION_DEFAULT_CLI_VERSION=head DATAHUB_MAPPED_GMS_PORT=8082 \
docker compose -p datahub -f ../datahub/docker/quickstart/docker-compose.quickstart.yml up -d
Verify DataHub is healthy:
curl -i http://localhost:8082/health
cd ../otel-demo
FRONTEND_PORT=8083 ENVOY_PORT=8083 \
OTEL_COLLECTOR_HOST=host.docker.internal \
OTEL_COLLECTOR_PORT_GRPC=4317 OTEL_COLLECTOR_PORT_HTTP=4318 \
PUBLIC_OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:8083/otlp-http/v1/traces \
docker compose up -d
Confirm telemetry:
- Demo UI: http://localhost:8083
- SigNoz UI: http://localhost:8080 (Services/Traces should list demo services)
- Pick a service name to monitor (examples: `payment`, `frontend-proxy`, `ad`)
Create a SigNoz API key:
- SigNoz UI -> Settings -> API Keys -> Create
cd integration
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Start the integration (reads live SigNoz metrics only):
export DATAHUB_SERVER=http://localhost:8082
export SIGNOZ_API_URL=http://localhost:8080
export SIGNOZ_API_KEY=YOUR_API_KEY
export SIGNOZ_SERVICE_NAME=payment
export SIGNOZ_SERVICE_FILTER_KEY=service.name
export SIGNOZ_STATUS_CODE_KEY=status.code
# If SigNoz returns 400 for filter expressions, disable filters and let the
# integration filter locally by service name.
export SIGNOZ_DISABLE_FILTERS=true
python bridge.py
Notes:
- The integration always reads from the SigNoz API (no file-based fallback).
- If `payment` is not listed in SigNoz, set `SIGNOZ_SERVICE_NAME` to a service shown in the SigNoz Services view.
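The local-filtering fallback enabled by `SIGNOZ_DISABLE_FILTERS=true` can be sketched as follows. The row shape and helper name are assumptions for illustration, not the bridge's real data model:

```python
# Sketch: when server-side filter expressions return 400, request
# unfiltered series and keep only rows whose service label matches
# SIGNOZ_SERVICE_NAME. The dict row shape here is a simplification.

def filter_rows(rows, service, label_key="service.name"):
    """Keep only rows belonging to the monitored service."""
    return [r for r in rows if r.get(label_key) == service]

sample_rows = [
    {"service.name": "payment", "errors": 3},
    {"service.name": "ad", "errors": 0},
]
payment_rows = filter_rows(sample_rows, "payment")
```

The trade-off of this fallback is that more data crosses the wire, but it sidesteps SigNoz filter-syntax differences across versions.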
Make sure the OTEL demo is exporting to the SigNoz collector on your host:
OTEL_COLLECTOR_HOST=host.docker.internal OTEL_COLLECTOR_PORT_GRPC=4317 OTEL_COLLECTOR_PORT_HTTP=4318
If otel-collector is crash-looping on port 4317, disable it (the demo can export directly to SigNoz):
cd ../otel-demo
OTEL_COLLECTOR_HOST=host.docker.internal \
OTEL_COLLECTOR_PORT_GRPC=4317 \
OTEL_COLLECTOR_PORT_HTTP=4318 \
docker compose up -d --scale otel-collector=0
If you see “container name already in use”, clean up and restart:
cd ../otel-demo
docker compose down -v --remove-orphans
docker rm -f \
otel-collector llm image-provider fraud-detection cart product-catalog payment shipping ad email flagd-ui accounting quote currency \
jaeger grafana valkey-cart opensearch postgresql kafka prometheus flagd \
product-reviews recommendation checkout frontend load-generator frontend-proxy
docker network rm opentelemetry-demo
If datahub-upgrade exits with OOM, increase Docker Desktop memory to 10–12 GB and re-run:
docker compose -p datahub -f ../datahub/docker/quickstart/docker-compose.quickstart.yml down
DATAHUB_VERSION=head UI_INGESTION_DEFAULT_CLI_VERSION=head DATAHUB_MAPPED_GMS_PORT=8082 \
docker compose -p datahub -f ../datahub/docker/quickstart/docker-compose.quickstart.yml up -d
- Open the Kafka dataset `payment-ingestion-worker`: the Incidents tab shows Service Health Failure (Operational, Active)
- Open the Hive dataset `payment_table`: Lineage -> Downstreams should include `payment_agg_table` -> `payment_mart` -> `payment_reporting_table`
- Each downstream asset should have the `Upstream_Failure` tag
- Turn the failure flag off in Flagd UI: http://localhost:55518
- Wait ~60 seconds for the next polling cycle
- Confirm:
- Incident becomes Resolved
- `Upstream_Failure` tag is removed from downstream assets
This removes the mock datasets, custom assertion, and the demo incident.
python -m bridge_app.cleanup
Notes:
- This does not delete your real datasets or tags.
- It only removes entities created by the demo.
- It also deletes the local `incident_state.json` file.
- Instrument your service with OpenTelemetry
  - Set `service.name` to the logical name you want to monitor.
  - Export OTLP to your SigNoz collector (gRPC 4317 or HTTP 4318).
- Verify telemetry in SigNoz
  - SigNoz -> Services -> confirm your `service.name` is listed.
- Configure the integration
export SIGNOZ_SERVICE_NAME=your-service-name
export SIGNOZ_API_URL=https://your-signoz-host
export SIGNOZ_API_KEY=your-signoz-api-key
export SIGNOZ_SERVICE_FILTER_KEY=service.name
export SIGNOZ_STATUS_CODE_KEY=status.code
export SERVICE_URN="urn:li:dataset:(urn:li:dataPlatform:your_platform,your_service_dataset,PROD)"
export PRIMARY_TABLE_URN="urn:li:dataset:(urn:li:dataPlatform:your_platform,your_primary_table,PROD)"
- Run the integration
- Incidents and lineage tags will reflect your live production health signal.
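The `SERVICE_URN` and `PRIMARY_TABLE_URN` values above follow DataHub's dataset URN format, so a small helper can generate them for your own platform and table names (an illustrative sketch; DataHub's Python client also ships URN builder helpers you may prefer):

```python
def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a DataHub dataset URN, e.g. for SERVICE_URN/PRIMARY_TABLE_URN."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"

# Example: the Kafka and Hive datasets used in this README's demo.
service_urn = dataset_urn("kafka", "payment-ingestion-worker")
table_urn = dataset_urn("hive", "payment_table")
```

Keeping URN construction in one place avoids typos, since a URN that doesn't exactly match the DataHub entity will silently produce no incidents or tags.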
- SigNoz returns empty data
- Confirm your service name in SigNoz Services view.
- Ensure your API key is valid and has access.
- Verify the SigNoz URL points to the UI/API host.
- SigNoz query returns 400 (key not found or syntax errors)
  - Use context-aware keys: `SIGNOZ_SERVICE_FILTER_KEY=service.name` and `SIGNOZ_STATUS_CODE_KEY=status.code`
  - If it still fails, set `SIGNOZ_DISABLE_FILTERS=true`
- DataHub incident does not appear
  - Check `DATAHUB_SERVER` and ensure GMS is healthy.
  - Look for integration logs about assertion upsert or incident creation.
- Lineage tags do not appear
- Ensure the mock lineage setup ran (integration logs).
- Verify dataset URNs in config match DataHub entities.
Thanks for trying DataHub. If you have questions, feel free to drop them in our Slack channel.