Update on 2020-04-08:
The particular performance issue described below as Problem 1 has been resolved in #1012 by applying Quick fix (a) proposed below; however, the underlying Problem 2 remains.
I suggest keeping this issue around for context and proceeding with the long-term fixes (c) and (d) outlined below.
As reported by @woodruffw (thanks!) and reproduced in the profiling experiment below, the repository tool's delegate_hashed_bins method has massive performance issues when delegating to many bins (e.g. 16K), for the following reasons:
Problem 1:
delegate_hashed_bins calls delegate once for each new bin. delegate adds the new delegated role to roledb, but it also adds delegation info to the steadily growing roleinfo of the delegating role in roledb on every iteration.
Problem 2:
Operations on roledb are extremely expensive, because they perform two deepcopys of a roleinfo dictionary, like so:
1. roleinfo = roledb.get_roleinfo() # deep-copies roleinfo out of roledb
2. modify roleinfo
3. roledb.update_roleinfo(roleinfo) # deep-copies roleinfo back into roledb
Furthermore, update_roleinfo calls tuf.formats.ROLEDB_SCHEMA.check_match on the passed roleinfo, which recursively iterates over all of its elements to validate them against the schema.
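The cost of this read-modify-write pattern is easy to see in isolation. The sketch below is illustrative only (the roleinfo shape is simplified, not tuf's actual format): both deep copies walk every delegation dict, so with 16K bins each single update allocates tens of thousands of small objects:

```python
import copy

# A roleinfo holding many delegations, roughly as produced by
# delegate_hashed_bins with 16K bins (shape simplified for illustration).
roleinfo = {"delegations": {"roles": [{"name": f"{i:04x}"} for i in range(16384)]}}

# Step 1: read out of the role db (deep copy #1 walks all 16K dicts).
working = copy.deepcopy(roleinfo)

# Step 2: the actual change is tiny -- one appended delegation.
working["delegations"]["roles"].append({"name": "extra"})

# Step 3: write back (deep copy #2 walks all 16K dicts again).
stored = copy.deepcopy(working)
```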
I propose the following fixes:
- Quick fix (a):
Change delegate_hashed_bins to update the delegating role in the roledb only once, instead of once per bin. This means we can't call delegate as it is right now, but would need to replicate or, ideally, factor out the common functionality.
- Intermediate fix (b):
Stop deepcopying and instead update roledb by reference. The deepcopys are unnecessary overhead throughout the codebase (see the profiling experiment). This seems quite feasible in most cases, but it is also risky. It might be worth skipping the intermediate fix and moving directly from the quick fix to the long-term fix.
- Long-term fix (c):
Kill roledb with fire. There is absolutely no reason to have different ways of representing tuf metadata in memory and to constantly sync them. Right now:
- the updater has its own metadata dict store
- the repository tool has a class-based model of all tuf metadata
- the developer tool inherits the relevant data model from the repository tool
- repository lib might need a general refactor

Also see the secondary objective "Has only one internal structure ..." of #846.
- Orthogonal intermediate/long-term fix (d):
Stop distrusting your own code. There is really no need to call check_match in the internal API on data that we constructed ourselves, especially if it is that costly (see profiling results below and Revise schema and formats facility secure-systems-lab/securesystemslib#183).