Systematic database index and query performance analysis

## Problem

A single functional index on `LOWER(StringValue)` reduced `FixupCrossBatchReferenceIdsAsync` from **300+ seconds (timeout)** to **~6 seconds** at MediumLarge scale (294K reference attribute values). This suggests there may be other queries with similar performance gaps that would benefit from index optimisation.

EF Core generates queries that work at small scale but degrade non-linearly. JIM's data model has wide tables (`ConnectedSystemObjectAttributeValues`, `ConnectedSystemObjectChangeAttributeValues`) with millions of rows at production scale. Missing indexes on common JOIN/WHERE patterns cause full table scans.

## Scope

This issue covers query and data-access optimisation across several areas:

### A. Systematic index analysis (original scope)

1. Enable `pg_stat_statements` and use slow query logs during integration tests
2. Run `EXPLAIN ANALYZE` on top offenders
3. Add functional/composite indexes where needed (require raw SQL migrations)
4. Use integration test performance baselines for regression detection

**Status:** ✅ Complete — `pg_stat_statements` enabled in dev PostgreSQL configurations, reusable `Analyse-QueryPerformance.ps1` script created. S8 OpenLDAP Medium baseline captured. No missing indexes found — 100% cache hit rate across all queries. Main bottlenecks are structural (cross-batch fixup, FK key share checks), not index-related.

### B. EF backlink optimisations (absorbed from #69)

Use scalar FK properties (e.g. `dependent.AttributeId`) instead of navigation properties (e.g. `dependent.Attribute.Name`) wherever possible, to eliminate unnecessary `.Include()` chains and avoid `AsSplitQuery` materialisation overhead.

**Status:** Partially complete — CSOAV model already has both navigation and scalar FK properties, and Phase 4 sync page-load eliminated the worst offenders. Remaining work is a codebase-wide sweep.

### C. CSO cross-batch reference fixup elimination (absorbed from #397)

During Full Import, CSO attribute values with reference FKs pointing to CSOs on other parallel write connections were nulled, requiring a post-hoc `FixupCrossBatchReferenceIdsAsync` SQL UPDATE with expensive LOWER() joins.

**Status:** ✅ Complete — Three changes eliminate the CSO attribute value fixup:
1. Case-insensitive secondary external ID lookup (`OrdinalIgnoreCase`) matching RFC 4514 DN semantics
2. Two-phase parallel write: CSO rows committed first, then attribute values with full FK visibility
3. Sort non-referencing CSOs first + committed ID tracking across batches

Result: CSO fixup query eliminated from top 30, 16,441 → 0 cross-batch FKs fixed, total DB time -8%.

### D. ApiKeys FK key share check volume (new finding from Part A)

pg_stat_statements revealed that EF Core's FK validation (`FOR KEY SHARE`) on the `ApiKeys` table fires **450,194 times** during a single S8 Medium test run (4.1s total, 0.01ms mean). A similar pattern on `ConnectorDefinitions` adds another 209,914 calls (822ms). These are EF Core's internal FK consistency checks triggered on every `SaveChangesAsync` for entities referencing these tables.

Investigation needed: why the call count is so high and whether it can be reduced (e.g. by restructuring FK relationships, using `AsNoTracking` where appropriate, or batching saves).

### E. Change record cross-batch reference fixup

`FixupCrossBatchChangeRecordReferenceIdsAsync` still runs the same LOWER() join pattern on `ConnectedSystemObjectChangeAttributeValues`. At S8 Medium scale this is **4.1s** (11 calls, 370ms mean) — the #1 query by total execution time after Part C eliminated the CSO fixup.

Change records store reference DN strings in `StringValue` and have `ReferenceValueId` nulled during COPY binary persistence to avoid FK violations. Unlike CSO attribute values (which have a dedicated `UnresolvedReferenceValue` column and in-memory resolution), change records lack an equivalent resolution path.

Approach options:
- Extend the two-phase write pattern to change record persistence (`SyncRepository.RpeiOperations.cs`)
- Resolve change record references in-memory before COPY persistence (similar to Part C's approach for CSO attribute values)

## S8 OpenLDAP Medium Results

### Baseline (before Part C)

| # | Total | Mean | Calls | Category |
|---|-------|------|-------|----------|
| 1 | 4.4s | 274ms | 16 | COPY bulk insert (MVO attr values) |
| 2 | 4.1s | 0.01ms | 450K | **FK key share: ApiKeys** |
| 3 | **3.5s** | **315ms** | 11 | **Change record fixup (LOWER join)** |
| 4 | 1.8s | 0.03ms | 60K | EF update (MVO attr values by PK) |
| 5 | 1.7s | 72ms | 24 | COPY bulk insert (CSO attr values) |
| 19 | **451ms** | **226ms** | 2 | **CSO attr value fixup (LOWER join)** |

### After Part C

| # | Total | Mean | Calls | Category |
|---|-------|------|-------|----------|
| 1 | **4.1s** | **370ms** | 11 | **Change record fixup (LOWER join) — Part E target** |
| 2 | 2.1s | 129ms | 16 | COPY bulk insert (MVO attr values) |
| 5 | 1.8s | 0.004ms | 455K | FK key share: ApiKeys |
| — | — | — | — | CSO attr value fixup: **eliminated** |

Total DB time: 30.7s → **28.2s** (−8%)

## Known patterns to check

- ~~CSO cross-batch reference fixup (LOWER join)~~ — eliminated by Part C
- Change record cross-batch reference fixup (LOWER join) — Part E
- FK key share check volume on ApiKeys and ConnectorDefinitions — Part D
- EF-generated queries with `.Include()` chains on large tables
- Navigation property usage where scalar FK would suffice (#69)

## Impact

At MediumLarge scale (~5K CSOs, ~300K attribute values), query performance directly determines whether sync completes or times out. At production scale (100K+ objects), unindexed queries will be orders of magnitude worse.

## Related

- Commit `e0acd631`: Added functional indexes for cross-batch reference fixup (300s → 6s)
- Issue #426: Deferred export resolution overhead (fixed)
- ~~Issue #69~~: EF backlink optimisations (absorbed into this issue)
- ~~Issue #397~~: Cross-batch reference fixup container ordering (absorbed into this issue)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Systematic database index and query performance analysis #427

Problem

Scope

A. Systematic index analysis (original scope)

B. EF backlink optimisations (absorbed from #69)

C. CSO cross-batch reference fixup elimination (absorbed from #397)

D. ApiKeys FK key share check volume (new finding from Part A)

E. Change record cross-batch reference fixup

S8 OpenLDAP Medium Results

Baseline (before Part C)

After Part C

Known patterns to check

Impact

Related

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

#	Total	Mean	Calls	Category
1	4.4s	274ms	16	COPY bulk insert (MVO attr values)
2	4.1s	0.01ms	450K	FK key share: ApiKeys
3	3.5s	315ms	11	Change record fixup (LOWER join)
4	1.8s	0.03ms	60K	EF update (MVO attr values by PK)
5	1.7s	72ms	24	COPY bulk insert (CSO attr values)
19	451ms	226ms	2	CSO attr value fixup (LOWER join)

#	Total	Mean	Calls	Category
1	4.1s	370ms	11	Change record fixup (LOWER join) — Part E target
2	2.1s	129ms	16	COPY bulk insert (MVO attr values)
5	1.8s	0.004ms	455K	FK key share: ApiKeys
—	—	—	—	CSO attr value fixup: eliminated

Systematic database index and query performance analysis #427

Description

Problem

Scope

A. Systematic index analysis (original scope)

B. EF backlink optimisations (absorbed from #69)

C. CSO cross-batch reference fixup elimination (absorbed from #397)

D. ApiKeys FK key share check volume (new finding from Part A)

E. Change record cross-batch reference fixup

S8 OpenLDAP Medium Results

Baseline (before Part C)

After Part C

Known patterns to check

Impact

Related

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions