Tired of not understanding why your NiFi sucks?
New! A TypeScript application that extracts processor information and performance metrics from Apache NiFi process groups and stores them in a SQLite database for analysis.
- 🔍 Recursively extracts processor information from all nested process groups
- 📊 Comprehensive performance metrics collection and analysis
- 💾 Stores data in SQLite database with structured schema
- LLM compatible: llm.md can help language models create SQL queries for you.
- 🔍 Detect bottlenecks by querying the collected metrics with SQL
- 📊 Create your own visualizations (not just "View Status History") using DataGrip, Apache Superset, or your own UI.
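The recursive extraction mentioned above can be pictured as a depth-first walk over nested process groups. The following is a minimal sketch, not the analyzer's actual code: the `ProcessGroupNode` and `ProcessorRecord` shapes and the `collectProcessors` name are hypothetical, simplified stand-ins for what the NiFi REST API returns.

```typescript
// Hypothetical, simplified shapes; real NiFi REST responses carry many more fields.
interface ProcessorSummary {
  id: string;
  name: string;
  type: string;
}

interface ProcessGroupNode {
  id: string;
  name: string;
  processors: ProcessorSummary[];
  childGroups: ProcessGroupNode[];
}

interface ProcessorRecord extends ProcessorSummary {
  groupPath: string; // slash-separated path of enclosing group names
}

// Depth-first walk: record this group's processors, then recurse into each child group.
function collectProcessors(group: ProcessGroupNode, parentPath: string = ""): ProcessorRecord[] {
  const groupPath = parentPath ? `${parentPath}/${group.name}` : group.name;
  const records: ProcessorRecord[] = [];
  for (const p of group.processors) {
    records.push({ id: p.id, name: p.name, type: p.type, groupPath });
  }
  for (const child of group.childGroups) {
    for (const record of collectProcessors(child, groupPath)) {
      records.push(record);
    }
  }
  return records;
}

// Illustrative data only: a root group with two levels of nesting.
const sample: ProcessGroupNode = {
  id: "g0",
  name: "root",
  processors: [{ id: "p1", name: "Generate", type: "GenerateFlowFile" }],
  childGroups: [
    {
      id: "g1",
      name: "ingest",
      processors: [{ id: "p2", name: "Call API", type: "InvokeHTTP" }],
      childGroups: [
        {
          id: "g2",
          name: "http",
          processors: [{ id: "p3", name: "Write", type: "PutFile" }],
          childGroups: [],
        },
      ],
    },
  ],
};

const allProcessors = collectProcessors(sample);
```

Carrying the group path along with each record keeps processors from different nested groups distinguishable once they are flattened into a single table.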
The application stores information in multiple tables to enable detailed analysis:
- processors_info: Basic processor configuration and settings
- processors_properties: Processor-specific property configurations
- processors_status_history: Historical performance metrics
- connections_info: Information about connections between processors
- connections_targets: Details about connection endpoints
- provenance_events: Records of FlowFile lifecycle events
- provenance_events_attributes: FlowFile attributes at event time
- provenance_events_flowfile_relationships: Parent-child relationships between FlowFiles
- nodes_info: Information about NiFi cluster nodes
For detailed schema information, see llm.md.
Note: If you don't have a NiFi instance running, it's recommended to start a local one first. See Local NiFi Setup for instructions.
- Node.js >= 20.0.0
- pnpm >= 8.0.0
There are two ways to run this application:
# Install dependencies
pnpm install
# Start the server
pnpm start:server
The script located in the src/script directory accepts the following parameters:
- --nifi-url (-u): URL of the NiFi instance
- --auth (-a): Credentials in the format username:password
- --pg-id: Process group ID to analyze
- --provenance (-p): Number of provenance events to collect per processor (default: 100000, 0 to disable)
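To make the parameter formats concrete, here is a small sketch of how the `--auth` and `--provenance` values might be interpreted. The helper names `parseAuth` and `parseProvenanceLimit` are hypothetical, and splitting on the first colon only is an assumption; the actual script may validate its input differently.

```typescript
interface Credentials {
  username: string;
  password: string;
}

// Assumption: split on the FIRST colon only, so passwords containing ":" survive intact.
function parseAuth(value: string): Credentials {
  const idx = value.indexOf(":");
  if (idx <= 0 || idx === value.length - 1) {
    throw new Error('Expected credentials in the form "username:password"');
  }
  return { username: value.slice(0, idx), password: value.slice(idx + 1) };
}

// Interpret --provenance: a non-negative integer, where 0 disables collection.
// The fallback of 100000 mirrors the documented default.
function parseProvenanceLimit(value: string | undefined, fallback: number = 100000): number {
  if (value === undefined) return fallback;
  const n = Number(value);
  if (value.trim() === "" || !isFinite(n) || n < 0 || Math.floor(n) !== n) {
    throw new Error("--provenance must be a non-negative integer");
  }
  return n;
}

const creds = parseAuth("admin:password");
const limit = parseProvenanceLimit("50000");
```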
Examples:
# Analyze root process group
pnpm start:script --nifi-url https://localhost:8080 --auth admin:password
# Analyze specific process group with custom provenance events limit
pnpm start:script --nifi-url https://localhost:8080 --auth admin:password --pg-id abc-123-def-456 --provenance 50000
# Analyze without collecting provenance events
pnpm start:script --nifi-url https://localhost:8080 --auth admin:password -p 0
The application can also be configured using environment variables:
| Variable | Description | Required |
|---|---|---|
| NIFI_URL | URL of the NiFi instance | Yes |
| NIFI_USERNAME | NiFi username for authentication | Yes |
| NIFI_PASSWORD | NiFi password for authentication | Yes |
| PG_ID | Process group ID to analyze | No (defaults to root) |
| DB_PATH | Custom path for the SQLite database | No (defaults to ./data/output.db) |
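As an illustration of the table above, the sketch below resolves a configuration object from environment variables, failing fast on missing required values and applying the documented DB_PATH default. The `AnalyzerConfig` shape and the `loadConfig` function are hypothetical names, not the application's real API.

```typescript
interface AnalyzerConfig {
  nifiUrl: string;
  username: string;
  password: string;
  pgId: string | undefined; // undefined means no PG_ID was supplied
  dbPath: string;
}

type Env = { [key: string]: string | undefined };

// Build the config from an environment-like object (pass process.env in a real run).
function loadConfig(env: Env): AnalyzerConfig {
  const required = ["NIFI_URL", "NIFI_USERNAME", "NIFI_PASSWORD"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    nifiUrl: env["NIFI_URL"] as string,
    username: env["NIFI_USERNAME"] as string,
    password: env["NIFI_PASSWORD"] as string,
    pgId: env["PG_ID"] || undefined,
    dbPath: env["DB_PATH"] || "./data/output.db",
  };
}

const config = loadConfig({
  NIFI_URL: "https://localhost:8080",
  NIFI_USERNAME: "admin",
  NIFI_PASSWORD: "password",
});
```

Taking the environment as a parameter rather than reading a global directly keeps the function easy to exercise with different inputs.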
Start and stop your preferred NiFi version using the provided scripts:
For Linux/macOS:
# Start NiFi v1.28.0 (default)
./scripts/start-nifi.sh
# Start NiFi v2.2.0
./scripts/start-nifi.sh v2
# Stop all NiFi instances
./scripts/stop-nifi.sh
# Stop specific version
./scripts/stop-nifi.sh v1
./scripts/stop-nifi.sh v2
For Windows:
# Start NiFi v1.28.0 (default)
scripts\start-nifi.bat
# Start NiFi v2.2.0
scripts\start-nifi.bat v2
# Stop all NiFi instances
scripts\stop-nifi.bat
# Stop specific version
scripts\stop-nifi.bat v1
scripts\stop-nifi.bat v2
- NIFI_URL: NiFi instance URL (default: https://localhost:8080)
- NIFI_USERNAME: NiFi username
- NIFI_PASSWORD: NiFi password
- PG_ID: Process Group ID to analyze (default: prompts for selection)
- DB_PATH: SQLite database path (default: ./data/output.db)
If PG_ID is not provided, the application will:
- Fetch all root-level process groups
- Present an interactive menu for selection
- Allow choosing "Root" to analyze all process groups
graph LR
A[Apache NiFi] --> B[TypeScript Analyzer]
B --> C[SQLite Database]
C --> D[Analysis Tools]
LLM Prompt Example:
"Find the slowest processors in my NiFi flow. Show me the top 10 processors with the longest execution times, including their names and types."
Generated SQL Query:
SELECT
pi.name AS processor_name,
pi.type AS processor_type,
MAX(psh.task_millis) / 1000.0 AS max_duration_seconds
FROM processors_status_history psh
JOIN processors_info pi ON psh.processor_id = pi.id
WHERE psh.task_millis > 0
GROUP BY psh.processor_id
ORDER BY max_duration_seconds DESC
LIMIT 10;
This query will show you the top 10 processors with the longest execution times, helping you quickly identify performance bottlenecks.
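For readers less familiar with SQL, the same ranking can be expressed over in-memory rows. This is only an illustrative sketch of what the query computes: the row interfaces and the `slowestProcessors` function are hypothetical, the column names are taken from the query above, and seconds are returned as fractional values.

```typescript
interface StatusRow {
  processor_id: string;
  task_millis: number;
}

interface ProcessorInfo {
  id: string;
  name: string;
  type: string;
}

interface SlowProcessor {
  processor_name: string;
  processor_type: string;
  max_duration_seconds: number;
}

function slowestProcessors(
  history: StatusRow[],
  info: ProcessorInfo[],
  limit: number = 10
): SlowProcessor[] {
  // GROUP BY processor_id with MAX(task_millis), skipping rows where task_millis <= 0.
  const maxById: { [id: string]: number } = {};
  for (const row of history) {
    if (row.task_millis <= 0) continue;
    const current = maxById[row.processor_id];
    if (current === undefined || row.task_millis > current) {
      maxById[row.processor_id] = row.task_millis;
    }
  }
  // Inner JOIN against processors_info: processors without qualifying rows are dropped.
  const result: SlowProcessor[] = [];
  for (const p of info) {
    const max = maxById[p.id];
    if (max !== undefined) {
      result.push({ processor_name: p.name, processor_type: p.type, max_duration_seconds: max / 1000 });
    }
  }
  // ORDER BY max_duration_seconds DESC LIMIT n
  result.sort((a, b) => b.max_duration_seconds - a.max_duration_seconds);
  return result.slice(0, limit);
}

// Illustrative data only.
const ranked = slowestProcessors(
  [
    { processor_id: "p1", task_millis: 1500 },
    { processor_id: "p1", task_millis: 4000 },
    { processor_id: "p2", task_millis: 250 },
    { processor_id: "p3", task_millis: 0 },
  ],
  [
    { id: "p1", name: "Call API", type: "InvokeHTTP" },
    { id: "p2", name: "Write", type: "PutFile" },
    { id: "p3", name: "Idle", type: "LogAttribute" },
  ]
);
```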
More LLM Prompt Examples:
- "Show me processors that are taking longer than 5 seconds to execute"
- "Find processors with high average lineage duration"
- "Which processor types are most common in my flow?"
- "Show me all InvokeHTTP processors and their performance metrics"
Find processors with longest execution times:
SELECT MAX(psh.task_millis) / 1000.0 AS duration,
pi.name AS name,
pi.type AS type
FROM processors_status_history psh
JOIN processors_info pi ON psh.processor_id = pi.id
WHERE psh.task_millis > 0
GROUP BY psh.processor_id
ORDER BY MAX(psh.task_millis) DESC;
Identify potential bottlenecks by lineage duration:
SELECT psh.processor_id,
MAX(psh.average_lineage_duration) as average_lineage,
pi.type,
pi.name
FROM processors_status_history psh
JOIN processors_info pi ON psh.processor_id = pi.id
WHERE psh.average_lineage_duration > 0
GROUP BY processor_id
ORDER BY average_lineage DESC;
Find load-balanced connections:
SELECT
c.name AS connection_name,
src.name AS source_name,
dst.name AS destination_name,
c.load_balance_strategy,
c.load_balance_partition_attribute
FROM connections_info c
JOIN connections_targets src ON c.source_id = src.id
JOIN connections_targets dst ON c.destination_id = dst.id
WHERE c.is_load_balanced = TRUE;
Get processors with high run duration:
SELECT name, type, run_duration, concurrent_tasks
FROM processors_info
WHERE run_duration >= 1000
ORDER BY run_duration DESC;
Get processor type distribution:
SELECT type, COUNT(*) as count
FROM processors_info
GROUP BY type
ORDER BY count DESC;
- Default credentials are used for development only
- Change default passwords in production environments
- Ensure NiFi instance is properly secured
- Use environment variables for sensitive configuration
- The analyzer uses NODE_TLS_REJECT_UNAUTHORIZED=0 to bypass certificate validation
NiFi Connection Issues:
- Verify NiFi is running and accessible
- Check credentials
- Wait for NiFi to fully start (2-3 minutes)
- Self-signed certificate warnings are bypassed automatically
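Because NiFi can take a few minutes to come up, it can help to poll for readiness before starting an analysis. The sketch below assumes a caller-supplied `probe` function (hypothetical; for example an HTTP request against the NiFi URL) and retries until it succeeds or a timeout elapses. The `waitUntilReady` name and the default timings are illustrative and not part of the analyzer.

```typescript
interface WaitOptions {
  timeoutMs?: number;  // total time to keep trying (assumed default: 3 minutes)
  intervalMs?: number; // pause between attempts (assumed default: 5 seconds)
}

// Resolves to true as soon as probe() reports readiness, false if the timeout is reached.
// A probe that throws is treated the same as "not ready yet".
async function waitUntilReady(
  probe: () => Promise<boolean>,
  options: WaitOptions = {}
): Promise<boolean> {
  const timeoutMs = options.timeoutMs ?? 180000;
  const intervalMs = options.intervalMs ?? 5000;
  const deadline = Date.now() + timeoutMs;

  while (true) {
    let ready = false;
    try {
      ready = await probe();
    } catch {
      ready = false;
    }
    if (ready) return true;
    // Give up when the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Injecting the probe keeps the retry loop independent of any particular HTTP client or authentication scheme.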
SQLite Errors:
- Ensure write permissions to data directory
- Verify sufficient disk space
- Check that native SQLite libraries are available
Prerequisites:
- Node.js 20+
- pnpm
- TypeScript
- SQLite
Scripts:
- pnpm start: Run the analyzer
- pnpm build: Compile TypeScript
ISC License
The analyzer uses a multi-stage Dockerfile with volume mounting:
- Dependencies are pre-installed in the builder stage
- Edit TypeScript files in your local src/ directory
- Changes are immediately available in the container
- Restart the analyzer to pick up changes:
docker compose restart nifi-analyzer
The application provides detailed logging:
- ✅ Success operations
- ⚠️ Warnings and fallbacks
- ❌ Errors and failures
- 📊 Progress indicators
- Node.js 20+
- pnpm
- TypeScript
- Docker & Docker Compose
- pnpm start: Run the analyzer
- pnpm build: Compile TypeScript
- pnpm test: Run tests (when implemented)
- Follow TypeScript best practices
- Use functional programming patterns where possible
- Maintain comprehensive error handling
- Add appropriate logging
- Follow DRY and KISS principles