Redundant catalog indexing during content creation: notifyCreated calls _reindexWorkflowVariables unnecessarily #4275

@jensens

(this was a finding while creating zodb-pgjsonb)

Problem

When content is created, handleContentishEvent (CMFCatalogAware.py:272-276) does:

if IObjectAddedEvent.providedBy(event):
    wfaware.notifyWorkflowCreated()   # (a) initializes workflow + redundant reindex
    ob.indexObject()                    # (b) full INDEX (queued)

Step (a) calls WorkflowTool.notifyCreated which calls _reindexWorkflowVariables(ob). This method does two things:

  1. ob.reindexObject(idxs=vars) — queues a REINDEX for workflow variables (redundant, because indexObject() in step (b) covers all indexes)
  2. ob.reindexObjectSecurity() — calls unrestrictedSearchResults(path=path) which forces an immediate processQueue(), flushing all pending indexing operations
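The effect of the premature flush can be sketched with a toy model. Everything below (ToyCatalog, search_children, create_current, create_proposed) is a hypothetical stand-in written for illustration, not the CMFCore API; the only behaviour it assumes from the description above is that a catalog search processes the pending queue before returning results.

```python
class ToyCatalog:
    """Hypothetical stand-in for a catalog with a deferred indexing queue."""

    def __init__(self):
        self.queue = []       # pending (operation, path, attrs) entries
        self.indexed = set()  # paths the catalog already knows about
        self.flushes = 0      # how often a non-empty queue was processed

    def enqueue(self, op, path, attrs=None):
        self.queue.append((op, path, attrs))

    def process_queue(self):
        if not self.queue:
            return
        self.flushes += 1
        for _op, path, _attrs in self.queue:
            self.indexed.add(path)
        self.queue.clear()

    def search_children(self, path):
        # A search has to see current data, so it flushes the queue first.
        self.process_queue()
        prefix = path + "/"
        return [p for p in sorted(self.indexed) if p.startswith(prefix)]


def create_current(catalog, path):
    # Models _reindexWorkflowVariables followed by indexObject().
    catalog.enqueue("REINDEX", path, ["review_state"])
    children = catalog.search_children(path)  # flush in the middle of creation
    catalog.enqueue("INDEX", path)
    catalog.process_queue()                   # flush at the end of the transaction
    return children


def create_proposed(catalog, path):
    # Models indexObject() alone.
    catalog.enqueue("INDEX", path)
    catalog.process_queue()


current, proposed = ToyCatalog(), ToyCatalog()
children = create_current(current, "/site/bar")
create_proposed(proposed, "/site/bar")
print(current.flushes, proposed.flushes, children)
```

In this model both variants end with the same indexed paths, but the current flow processes the queue twice per created object and the child search finds nothing.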

Why this is expensive

cProfile data from creating 100 content objects (50 Documents + 50 News Items):

  • Total ADD time: 8.31s (83ms per object)
  • reindexObjectSecurity path: ~4.1s (49%). This includes the premature queue flush (unrestrictedSearchResults calling processQueue()) and the resulting catalog operations
  • _reindexWorkflowVariables overall: ~4.0s. Its reindexObject(idxs=vars) call queues a REINDEX that the queue optimizer later has to merge with the INDEX from indexObject()
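For anyone who wants to take a comparable measurement, here is a minimal sketch of collecting cumulative timings with the standard-library cProfile and pstats modules. The create_objects function is a placeholder workload assumed for this example; it is not the benchmark that produced the numbers above, and a real run would create content in a Plone site instead.

```python
import cProfile
import io
import pstats


def create_objects(count):
    # Placeholder workload (assumption); replace with real content creation.
    return [{"id": f"doc-{i}"} for i in range(count)]


def profile_creation(count=100):
    profiler = cProfile.Profile()
    profiler.enable()
    created = create_objects(count)
    profiler.disable()

    # Render the ten most expensive entries by cumulative time into a string.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)
    return created, out.getvalue()


created, report = profile_creation()
print(report)
```

print_stats also accepts a string restriction that is matched as a regular expression against the function names, which helps to narrow the report to a single call path.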

Why it is redundant for new objects

  • reindexObjectSecurity searches the catalog for child objects to cascade security changes to. A brand-new object has no children — the search always returns 0 results (or just the object itself, which hasn't been indexed yet anyway).
  • reindexObject(idxs=vars) queues a REINDEX for workflow variable indexes (review_state, etc.), but indexObject() called immediately after in handleContentishEvent indexes ALL fields including those same workflow variables.
  • The wf.notifyCreated(ob) call that initializes the workflow state (creates initial history, sets review_state attribute on the object) is NOT the problem — that part is needed. Only the subsequent _reindexWorkflowVariables call is redundant.
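The second point can be illustrated with a simplified model of queue merging. This is an assumption-laden sketch, not the optimizer shipped in Products.CMFCore: merge_operations and the numeric operation codes are made up for this example, and the rule that attrs=None means "all indexes" and takes precedence is the only behaviour being modelled.

```python
INDEX, REINDEX, UNINDEX = 1, 0, -1  # illustrative operation codes


def merge_operations(operations):
    """Collapse queued (op, path, attrs) entries to one entry per path.

    attrs=None stands for "all indexes" and wins over any partial list.
    """
    merged = {}
    for op, path, attrs in operations:
        if path not in merged:
            merged[path] = (op, attrs)
            continue
        prev_op, prev_attrs = merged[path]
        if prev_op == INDEX and op == UNINDEX:
            # Added and removed in the same batch: nothing left to do.
            del merged[path]
            continue
        new_op = max(UNINDEX, min(INDEX, prev_op + op))
        if prev_attrs is None or attrs is None:
            new_attrs = None
        else:
            new_attrs = sorted(set(prev_attrs) | set(attrs))
        merged[path] = (new_op, new_attrs)
    return merged


with_reindex = merge_operations([
    (REINDEX, "/site/bar", ["review_state"]),
    (INDEX, "/site/bar", None),
])
index_only = merge_operations([(INDEX, "/site/bar", None)])
print(with_reindex == index_only)
```

Under these assumptions the partial REINDEX contributes nothing: the merged result is the same single full INDEX that indexObject() queues on its own.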

Callers of _reindexWorkflowVariables

There are exactly two callers in Products.CMFCore.WorkflowTool:

  1. notifyCreated (line 296) — during object creation ← this is the redundant one
  2. _invokeWithNotification (line 538) — during workflow transitions ← this one is needed

Proposed Fix

Override notifyCreated in CMFPlone's WorkflowTool subclass (Products/CMFPlone/WorkflowTool.py) to skip the _reindexWorkflowVariables call:

@security.private
def notifyCreated(self, ob):
    """Notify all applicable workflows that an object has been created.

    Overrides CMFCore to skip _reindexWorkflowVariables.
    The caller (handleContentishEvent) always calls indexObject()
    immediately after, which indexes all fields including workflow
    variables and security. Calling _reindexWorkflowVariables here
    is redundant and expensive: reindexObjectSecurity forces an
    immediate queue flush via unrestrictedSearchResults for an object
    that has no children yet.
    """
    wfs = self.getWorkflowsFor(ob)
    for wf in wfs:
        if self.getHistoryOf(wf.getId(), ob):
            continue
        wf.notifyCreated(ob)

This preserves workflow state initialization while eliminating the redundant reindex. Workflow transitions continue to use _reindexWorkflowVariables via _invokeWithNotification on the base class, so those are unaffected.
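The intended behaviour can be checked in isolation with stand-in classes. BaseWorkflowTool and PloneWorkflowTool below are hypothetical simplifications that only mirror the call structure described in this issue (creation and transitions both reaching _reindexWorkflowVariables in the base class); they are not the real CMFCore or CMFPlone classes, and plain dicts stand in for content objects.

```python
from unittest import mock


class BaseWorkflowTool:
    """Hypothetical stand-in for the base class, not the real code."""

    def notifyCreated(self, ob):
        self._initialize_state(ob)
        self._reindexWorkflowVariables(ob)

    def _invokeWithNotification(self, ob):
        ob["review_state"] = "published"
        self._reindexWorkflowVariables(ob)

    def _initialize_state(self, ob):
        ob.setdefault("review_state", "private")

    def _reindexWorkflowVariables(self, ob):
        ob["reindexed"] = ob.get("reindexed", 0) + 1


class PloneWorkflowTool(BaseWorkflowTool):
    """Stand-in for the proposed override: initialize state, skip the reindex."""

    def notifyCreated(self, ob):
        self._initialize_state(ob)


tool = PloneWorkflowTool()
doc = {}
with mock.patch.object(tool, "_reindexWorkflowVariables") as reindex:
    tool.notifyCreated(doc)
    state_after_create = doc["review_state"]
    calls_on_create = reindex.call_count
    tool._invokeWithNotification(doc)
    calls_on_transition = reindex.call_count - calls_on_create

print(state_after_create, calls_on_create, calls_on_transition)
```

A test against the real classes would follow the same pattern: patch _reindexWorkflowVariables on the tool, create an object, and assert that the patched method was called only for transitions.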

Expected Impact

  • Content creation: ~49% less time per object (~4.1s saved per 100 objects, from 83ms to ~42ms per object)
  • Workflow transitions: No change
  • Content deletion: No change
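The creation estimate follows directly from the profile figures quoted earlier. The arithmetic below uses only those numbers (8.31s total, ~4.1s attributed to the reindexObjectSecurity path, 100 objects) and assumes the whole ~4.1s disappears with the fix, which is an estimate rather than a measurement.

```python
total_seconds = 8.31   # measured ADD time for 100 objects
saved_seconds = 4.1    # time attributed to the reindexObjectSecurity path
objects = 100

per_object_before_ms = total_seconds / objects * 1000
per_object_after_ms = (total_seconds - saved_seconds) / objects * 1000

# Share of the creation time that goes away.
time_reduction = saved_seconds / total_seconds

# The same saving expressed as additional objects created per second.
throughput_gain = total_seconds / (total_seconds - saved_seconds) - 1

print(f"{per_object_before_ms:.1f} ms -> {per_object_after_ms:.1f} ms")
print(f"time reduction: {time_reduction:.0%}, throughput gain: {throughput_gain:.0%}")
```

The 49% figure is a reduction in elapsed time; expressed as throughput, the same saving would roughly double the number of objects created per second.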

Risk Assessment

Low risk:

  • Only the creation path is affected
  • The CMFCore test test_object_indexed_after_adding already expects exactly one index call (['index /site/bar'])
  • Workflow state initialization is preserved
  • The fix should also be upstreamed to Products.CMFCore.WorkflowTool.notifyCreated in the long term
