Conversation
Pull request overview
This PR implements high-availability (HA) clustering capabilities for NiFi Registry with database-backed coordination, along with a maintenance mode feature for operational management.
Changes:
- Adds maintenance mode functionality via Spring Boot Actuator endpoints to temporarily block write operations for safe backups
- Implements active-active clustering with database-backed leader election, cache synchronization, and durable event delivery
- Refactors EventService to support both standalone and cluster-aware modes with exactly-once event delivery guarantees
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| MaintenanceModeManager.java | Simple state manager for maintenance mode using AtomicBoolean |
| MaintenanceModeFilter.java | Servlet filter that blocks write HTTP methods during maintenance mode |
| MaintenanceModeHealthIndicator.java | Exposes maintenance mode state via /actuator/health endpoint |
| MaintenanceModeEndpoint.java | Actuator endpoint for toggling maintenance mode via POST requests |
| NiFiRegistrySecurityConfig.java | Integrates maintenance mode filter into security filter chain |
| LeaderElectionManager.java | Interface for cluster leader election |
| DatabaseLeaderElectionManager.java | TTL-based leader election using CLUSTER_LEADER table |
| CacheRefreshPoller.java | Polls CACHE_VERSION table to synchronize authorization caches across nodes |
| DatabaseUserGroupProvider.java | Enhanced to bump cache version on mutations and register with poller |
| DatabaseAccessPolicyProvider.java | Enhanced to bump cache version on mutations and register with poller |
| EventService.java | Refactored from concrete class to interface |
| StandardEventService.java | Standalone in-memory event delivery implementation (renamed from EventService) |
| ClusterAwareEventService.java | Cluster-aware event delivery with database persistence and leader-only delivery |
| EventServiceConfiguration.java | Factory that selects EventService implementation based on cluster mode |
| ExtensionManager.java | Added optional LeaderElectionManager injection for logging |
| V9__AddClusterTables.sql | Creates CACHE_VERSION and CLUSTER_NODE tables |
| V10__AddLeaderAndEventTables.sql | Creates CLUSTER_LEADER and REGISTRY_EVENT tables |
| NiFiRegistryProperties.java | Added cluster configuration property getters |
| nifi-registry.properties | Added cluster configuration property template |
| administration-guide.adoc | Comprehensive documentation for maintenance mode, clustering, and leader election |
| CLAUDE.md | Development guidance for AI coding assistants |
| TestEventService.java | Updated to use StandardEventService |
```java
import org.springframework.boot.health.contributor.Health;
import org.springframework.boot.health.contributor.HealthIndicator;
```
The Spring Boot Actuator import paths appear to be incorrect. The correct package for Spring Boot 2.x and 3.x is `org.springframework.boot.actuate.health`, not `org.springframework.boot.health.contributor`. The imports should be:

```java
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
```

As written, this will cause a compilation failure on those Spring Boot versions.
Suggested change:

```diff
-import org.springframework.boot.health.contributor.Health;
-import org.springframework.boot.health.contributor.HealthIndicator;
+import org.springframework.boot.actuate.health.Health;
+import org.springframework.boot.actuate.health.HealthIndicator;
```
```java
@Override
public void destroy() {
    LOGGER.info("Shutting down CacheRefreshPoller.");
    scheduler.shutdownNow();
}
```
The destroy method always calls `scheduler.shutdownNow()`, even when cluster mode is disabled and the scheduler was never started. While this is likely harmless (the scheduler is initialized in the constructor), it would be cleaner to track whether the scheduler was actually started and only shut it down if it's running. Consider adding a flag like `private volatile boolean started = false;` and checking it in the destroy method.
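A minimal sketch of the suggested guard (the class name here is a stand-in, and `scheduler`/`started` mirror the snippet above rather than the PR's actual fields):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: only shut the scheduler down if it was actually started.
class CacheRefreshPollerSketch {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private volatile boolean started = false;

    void start(Runnable refreshTask, long intervalMs) {
        scheduler.scheduleAtFixedRate(refreshTask, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
        started = true;
    }

    void destroy() {
        if (started) {               // skip shutdown when cluster mode never started the poller
            scheduler.shutdownNow();
        }
    }

    boolean isStarted() {
        return started;
    }
}
```

That said, `shutdownNow()` on an idle executor is a no-op in practice, so the flag mostly buys clearer logs and intent rather than different behavior.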
```java
private final Map<String, ExtensionClassLoader> classLoaderMap = new HashMap<>();
private final AtomicBoolean loaded = new AtomicBoolean(false);

// Optional: injected only when cluster mode is enabled; used for logging.
```
The comment states "Optional: injected only when cluster mode is enabled", but DatabaseLeaderElectionManager is always created as a Spring bean (it is marked `@Component`). While `@Autowired(required = false)` makes the injection optional at the Spring level, the leaderElectionManager will always be injected in practice since the bean always exists. Update the comment to reflect reality: "Optional injection; used for logging cluster leadership status when available."
Suggested change:

```diff
-// Optional: injected only when cluster mode is enabled; used for logging.
+// Optional injection; used for logging cluster leadership status when available.
```
```asciidoc
.Re-enabling writes after backup
----
curl -X POST https://localhost:18443/nifi-registry/actuator/maintenance \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'
----
```
The documentation describes how to disable maintenance mode via a POST request (line 1247), but this will fail due to the deadlock issue identified earlier: the MaintenanceModeFilter blocks all POST requests when maintenance mode is enabled, including the POST to /actuator/maintenance itself. The documentation should include a workaround, such as: "Note: If maintenance mode was accidentally enabled and must be disabled, restart the NiFi Registry process, as maintenance mode is not persisted across restarts."
```sql
-- Cluster node registration table: tracks live nodes for operational visibility.
-- Updated by a heartbeat thread on each node; no quorum or eviction is performed in this phase.

CREATE TABLE CLUSTER_NODE (
    NODE_ID VARCHAR(100) NOT NULL,
    HOSTNAME VARCHAR(255) NOT NULL,
    LAST_HEARTBEAT TIMESTAMP NOT NULL,
    CONSTRAINT PK__CLUSTER_NODE PRIMARY KEY (NODE_ID)
);
```
The CLUSTER_NODE table is created in this migration but there's no code in the PR that actually writes to it. According to the comment, it should be "Updated by a heartbeat thread on each node" but no such heartbeat logic exists. This appears to be dead schema that was planned but not implemented, or code is missing from this PR. Consider either removing this table definition or implementing the heartbeat logic that populates it.
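If the heartbeat path is implemented rather than removed, its shape could look roughly like this (a sketch only; `HeartbeatSketch` and the `upsert` callback are stand-ins for the real code, which would run an upsert against CLUSTER_NODE):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

// Sketch: periodically report this node's liveness to a shared store.
class HeartbeatSketch {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "cluster-node-heartbeat");
                t.setDaemon(true);
                return t;
            });

    // upsert receives (nodeId, hostname); in real code it would execute an
    // INSERT ... ON CONFLICT UPDATE on CLUSTER_NODE with CURRENT_TIMESTAMP.
    void start(String nodeId, String hostname, BiConsumer<String, String> upsert, long intervalMs) {
        scheduler.scheduleAtFixedRate(() -> upsert.accept(nodeId, hostname),
                0, intervalMs, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```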
Suggested change (removal):

```diff
--- Cluster node registration table: tracks live nodes for operational visibility.
--- Updated by a heartbeat thread on each node; no quorum or eviction is performed in this phase.
-CREATE TABLE CLUSTER_NODE (
-    NODE_ID VARCHAR(100) NOT NULL,
-    HOSTNAME VARCHAR(255) NOT NULL,
-    LAST_HEARTBEAT TIMESTAMP NOT NULL,
-    CONSTRAINT PK__CLUSTER_NODE PRIMARY KEY (NODE_ID)
-);
```
```java
if (maintenanceModeManager.isEnabled() && isWriteMethod(httpRequest.getMethod())) {
    sendMaintenanceModeResponse(httpResponse);
    return;
}
```
The MaintenanceModeFilter blocks ALL write operations (POST, PUT, PATCH, DELETE) when maintenance mode is enabled, including the POST request to the /actuator/maintenance endpoint that would disable maintenance mode. This creates a deadlock: once enabled, maintenance mode cannot be disabled via the API. Consider one of the following:
- Exempting the /actuator/maintenance endpoint from the filter
- Allowing GET with a query parameter to toggle state
- Adding an exclude path pattern to the filter for critical administrative endpoints
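The first option could be sketched as a pure decision helper ahead of the block above (`isWriteMethod` semantics and the exempt path are assumptions mirroring the snippet, not the PR's actual code):

```java
import java.util.Set;

// Sketch: decide whether a request should be blocked during maintenance mode,
// exempting the actuator endpoint that toggles maintenance mode itself.
class MaintenanceBlockDecision {
    private static final Set<String> WRITE_METHODS = Set.of("POST", "PUT", "PATCH", "DELETE");
    private static final String MAINTENANCE_ENDPOINT = "/actuator/maintenance";

    static boolean shouldBlock(boolean maintenanceEnabled, String method, String requestPath) {
        if (!maintenanceEnabled || !WRITE_METHODS.contains(method)) {
            return false;
        }
        // Let the toggle endpoint through so maintenance mode can be disabled via the API.
        return !requestPath.endsWith(MAINTENANCE_ENDPOINT);
    }
}
```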
```java
// Provider instances are not Spring beans; they self-register here on initialisation.
private static volatile DatabaseAccessPolicyProvider accessPolicyProvider;
private static volatile DatabaseUserGroupProvider userGroupProvider;
```
The static volatile fields for provider instances (accessPolicyProvider, userGroupProvider) could cause issues in testing scenarios or if multiple Spring contexts are created. While volatile ensures visibility, static state is shared across all instances of the class. Consider making these instance fields or using a different mechanism to allow the providers to register themselves with the poller, such as a callback passed during provider initialization.
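The callback alternative could look roughly like this (a sketch with a hypothetical `RefreshListener` type; the real provider and poller classes differ):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch: an instance-scoped listener registry instead of static volatile fields.
class CacheRefreshRegistry {
    interface RefreshListener {
        void refresh(long newVersion);
    }

    private final List<RefreshListener> listeners = new CopyOnWriteArrayList<>();

    // Providers call this during their own initialization instead of
    // assigning themselves to a static field on the poller.
    void register(RefreshListener listener) {
        listeners.add(listener);
    }

    // The poller calls this when it observes a bumped CACHE_VERSION.
    void notifyVersion(long newVersion) {
        for (RefreshListener l : listeners) {
            l.refresh(newVersion);
        }
    }
}
```

Because the registry is an instance, each Spring context (or test) gets its own listener set, avoiding cross-context leakage.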
```java
@Bean
@Autowired
public EventService eventService(
        final List<EventHookProvider> eventHookProviders,
        final NiFiRegistryProperties properties,
        final DataSource dataSource,
        final LeaderElectionManager leaderElectionManager) {

    if (properties.isClusterEnabled()) {
        LOGGER.info("Cluster mode is enabled; creating ClusterAwareEventService.");
        return new ClusterAwareEventService(eventHookProviders, dataSource, leaderElectionManager);
```
The EventServiceConfiguration.eventService() method requires a LeaderElectionManager parameter, but in standalone mode (cluster.enabled=false) the DatabaseLeaderElectionManager bean still exists and will be injected. While this works (the bean self-disables when cluster mode is off), it creates an unnecessary dependency. Consider making the LeaderElectionManager parameter optional with @Autowired(required = false) or using @ConditionalOnProperty to only create the ClusterAwareEventService when clustering is enabled.
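A plain-Java sketch of the optional-dependency shape (all names here are stand-ins, not the PR's classes): when the leader election manager is nullable, standalone mode never touches it.

```java
// Sketch: select an implementation from a cluster flag, tolerating a
// missing (null) leader election manager in standalone mode.
class EventServiceSelectionSketch {
    interface EventService { String mode(); }

    static EventService select(boolean clusterEnabled, Object leaderElectionManager) {
        if (clusterEnabled) {
            if (leaderElectionManager == null) {
                throw new IllegalStateException("Cluster mode requires a LeaderElectionManager");
            }
            return () -> "cluster";
        }
        // Standalone mode ignores leaderElectionManager entirely.
        return () -> "standalone";
    }
}
```

With Spring Boot, the same selection is often expressed declaratively instead, with `@ConditionalOnProperty` on separate `@Bean` methods so the cluster-only bean graph is never constructed in standalone mode.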
Force-pushed from 1b628c0 to d651688.
Implements leader election and distributed event hooks for NiFi Registry cluster mode (nifi.registry.cluster.enabled=true).

- V10__AddLeaderAndEventTables.sql: adds CLUSTER_LEADER (TTL-based distributed lock) and REGISTRY_EVENT (durable event log) tables.
- LeaderElectionManager interface + DatabaseLeaderElectionManager: TTL lease (30 s) with 10 s heartbeat. Algorithm: renew own lease → claim expired lease → INSERT first-ever row → remain follower.
- EventService converted to interface; existing in-memory logic moved to StandardEventService (standalone mode). ClusterAwareEventService (cluster mode) persists events to REGISTRY_EVENT and delivers them exactly once from the leader node via a 5 s polling loop with 7-day retention cleanup.
- EventServiceConfiguration (`@Configuration`) selects the right EventService implementation at startup based on cluster mode.
- ExtensionManager: injects optional LeaderElectionManager and logs which node performed extension discovery.
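The lease algorithm described in the commit message could translate to SQL along these lines (a sketch only; the PR's actual column names and statements may differ, and `LEADER_ID`, `NODE_ID`, and `LEASE_EXPIRES` are assumed names):

```sql
-- Step 1: try to renew our own lease (succeeds only if we already hold it).
UPDATE CLUSTER_LEADER
   SET LEASE_EXPIRES = CURRENT_TIMESTAMP + INTERVAL '30' SECOND
 WHERE LEADER_ID = 1 AND NODE_ID = 'node-1';

-- Step 2: if step 1 updated 0 rows, try to claim an expired lease.
UPDATE CLUSTER_LEADER
   SET NODE_ID = 'node-1',
       LEASE_EXPIRES = CURRENT_TIMESTAMP + INTERVAL '30' SECOND
 WHERE LEADER_ID = 1 AND LEASE_EXPIRES < CURRENT_TIMESTAMP;

-- Step 3: if the row has never existed, race to INSERT it; exactly one
-- node wins thanks to the primary key on LEADER_ID.
INSERT INTO CLUSTER_LEADER (LEADER_ID, NODE_ID, LEASE_EXPIRES)
VALUES (1, 'node-1', CURRENT_TIMESTAMP + INTERVAL '30' SECOND);

-- Step 4: otherwise remain a follower and retry on the next 10 s heartbeat.
```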
…ain/java/org/apache/nifi/registry/event/ClusterAwareEventService.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Force-pushed from d651688 to 0782e1a.
Force-pushed from 3523e71 to f7c3872.
HA Configuration — Three NiFi Registry Nodes
There are four files to change on each node (the cluster node identifier is the only per-node difference; the database connection settings are identical everywhere).
Overview
```
         [Load Balancer]
         /      |      \
    Node-1   Node-2   Node-3
         \      |      /
         [PostgreSQL]
```
All three nodes point at the same DB. The DB holds the flow snapshots, users/groups/policies, the CLUSTER_LEADER lease, the REGISTRY_EVENT log, and the CACHE_VERSION counters.
File 1 — conf/nifi-registry.properties
This is the main config file. The DB section and the cluster section both live here. Everything below is the same on all three nodes except nifi.registry.cluster.node.identifier.
```properties
# ── Database (same on all nodes) ──────────────────────────────────────────────
nifi.registry.db.url=jdbc:postgresql://pg-host:5432/nifi_registry
nifi.registry.db.driver.class=org.postgresql.Driver
# Directory containing the JDBC jar:
nifi.registry.db.driver.directory=./lib/postgresql
nifi.registry.db.username=nifireg
nifi.registry.db.password=secret
nifi.registry.db.maxConnections=5

# ── Cluster (change node.identifier per node) ─────────────────────────────────
nifi.registry.cluster.enabled=true
# Set to node-2 / node-3 on the other nodes:
nifi.registry.cluster.node.identifier=node-1
nifi.registry.cluster.cache.refresh.interval.ms=15000
```
Note that comments are placed on their own lines: in a Java .properties file, a trailing `# …` after a value becomes part of the value.
The DB URL is the only place you configure the database. The providers (authorizers.xml, providers.xml) simply say "use the database" — they don't repeat the connection
details, because the framework injects the same DataSource bean into them automatically.
File 2 — conf/providers.xml
Switch flow storage from the default filesystem provider to the database provider. Identical on all three nodes.
The provider class to use is:

    org.apache.nifi.registry.provider.flow.DatabaseFlowPersistenceProvider

Note on extension bundles: If you use the filesystem provider for bundles, mount the same NFS/EFS directory on every node, or switch to S3 (S3BundlePersistenceProvider) for a fully shared, filesystem-free setup.
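The flowPersistenceProvider entry in providers.xml would look roughly like this (a sketch; verify the element names against the default providers.xml shipped with your NiFi Registry version):

```xml
<providers>
    <flowPersistenceProvider>
        <class>org.apache.nifi.registry.provider.flow.DatabaseFlowPersistenceProvider</class>
    </flowPersistenceProvider>
</providers>
```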
File 3 — conf/authorizers.xml
Switch from the default file-backed providers to the DB-backed ones. Identical on all three nodes. Fill in your actual admin identity.
Also confirm nifi-registry.properties points to this authorizer:
nifi.registry.security.authorizer=managed-authorizer
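A rough shape for the DB-backed authorizers (a sketch only; the identifiers, property names, and provider classes should be checked against the authorizers.xml shipped with your build, and the admin identity below is a placeholder):

```xml
<authorizers>
    <userGroupProvider>
        <identifier>database-user-group-provider</identifier>
        <class>org.apache.nifi.registry.security.authorization.database.DatabaseUserGroupProvider</class>
        <property name="Initial User Identity 1">CN=admin, OU=NiFi</property>
    </userGroupProvider>
    <accessPolicyProvider>
        <identifier>database-access-policy-provider</identifier>
        <class>org.apache.nifi.registry.security.authorization.database.DatabaseAccessPolicyProvider</class>
        <property name="User Group Provider">database-user-group-provider</property>
        <property name="Initial Admin Identity">CN=admin, OU=NiFi</property>
    </accessPolicyProvider>
    <authorizer>
        <identifier>managed-authorizer</identifier>
        <class>org.apache.nifi.registry.security.authorization.StandardManagedAuthorizer</class>
        <property name="Access Policy Provider">database-access-policy-provider</property>
    </authorizer>
</authorizers>
```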
File 4 — lib/postgresql/ (JDBC driver)
Drop the PostgreSQL JDBC jar into a directory on every node and point nifi.registry.db.driver.directory at it (done in File 1 above). No XML needed — just the jar file.
What happens on first start
1. Flyway applies the V9/V10 migrations against the shared database, creating the cluster tables.
2. The first node to start wins the INSERT race into CLUSTER_LEADER and becomes leader.
3. The other nodes find the row already exists (and Node-1 holds the lease), so they become followers.
4. The leader renews its 30 s lease every 10 s; if it dies, a follower claims the expired lease.
Summary of what goes where