Labels: Infrastructure and tools, Performance
Description
It would be very useful to have some simple microbenchmarks we could run from time to time. We've tried using ASV, but it's very heavyweight and collects masses of data that are never used.
Requirements:
- Make a script `perf_benchmark.py` which runs a standard set of microbenchmarks and outputs the results to a file (see the sketch after this list). The idea is that this file should be updated before every release, so that we can spot any perf regressions and also have a per-release record of the benchmarks in the repository history.
- Update the developer docs to describe how and when to run these benchmarks as part of the release process.
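
For concreteness, here is a minimal sketch of what `perf_benchmark.py` could look like; the JSON output format, the `--output` option, and the particular benchmarks timed here are illustrative assumptions rather than a spec:

```python
# Minimal sketch of a perf_benchmark.py harness (assumed structure, not a spec).
import argparse
import json
import time

import tskit


def timed(func, *args, repeats=5, **kwargs):
    """Run func several times and return the best (minimum) wall-clock time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def run_benchmarks(ts_path):
    ts = tskit.load(ts_path)
    return {
        "load": timed(tskit.load, ts_path),
        "tables": timed(lambda: ts.tables),
        "first_tree": timed(ts.first),
        "iterate_trees": timed(lambda: sum(1 for _ in ts.trees())),
    }


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="tskit microbenchmarks")
    parser.add_argument("input", help="path to the standard .trees file")
    parser.add_argument("-o", "--output", default="perf_results.json")
    args = parser.parse_args()
    with open(args.output, "w") as f:
        json.dump(run_benchmarks(args.input), f, indent=2, sort_keys=True)
```

It could then be run before a release (e.g. `python perf_benchmark.py standard.trees -o perf_results.json`) and the output file committed so the per-release history accumulates.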
CPU time performance benchmarks, given a standard file (a few of these are sketched in code after the list):
- time to load
- time to save
- time to access `ts.tables` in a loop
- time to access `tables.nodes`, `tables.individuals`, etc.
- time to access columns, `nodes.flags`, etc.
- time to get the first tree, `ts.first()`
- time to seek to the middle tree
- time to iterate over all trees
- time to access tree arrays, `tree.parent` etc.
- time to decode all variants
- time to iterate over all rows, with and without metadata
- time to write to VCF (writing to devnull)
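
A hedged sketch of a few of the benchmarks above, assuming a placeholder input file `standard.trees` and timing each operation directly with `time.perf_counter` (in the real script these would go through a shared timing helper):

```python
import os
import time

import tskit

ts = tskit.load("standard.trees")  # placeholder name for the standard file

# time to seek to the middle tree
start = time.perf_counter()
ts.at(ts.sequence_length / 2)
print("seek to middle tree:", time.perf_counter() - start)

# time to access tree arrays on the first tree
tree = ts.first()
start = time.perf_counter()
_ = tree.parent_array
print("tree.parent_array:", time.perf_counter() - start)

# time to decode all variants
start = time.perf_counter()
for _ in ts.variants():
    pass
print("decode all variants:", time.perf_counter() - start)

# time to write VCF output, discarding it
with open(os.devnull, "w") as devnull:
    start = time.perf_counter()
    ts.write_vcf(devnull)
    print("write VCF to devnull:", time.perf_counter() - start)
```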
Additional benchmarks as I thought of them (BJ), sketched briefly after the list:
- tree node arrays (`postorder` et al.)
- tree row accessors (`ts.node(42)`)
- tree accessors (`tree.left_sib(42)`)
- iterate over `tree.nodes`
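
A quick sketch of these accessor benchmarks; node id 42 is the arbitrary example from above, `standard.trees` is again a placeholder, and `timeit` is just one convenient way to time them:

```python
import timeit

import tskit

ts = tskit.load("standard.trees")  # placeholder for the standard benchmark file
tree = ts.first()

print("tree.postorder():", timeit.timeit(tree.postorder, number=100))
print("ts.node(42):", timeit.timeit(lambda: ts.node(42), number=100))
print("tree.left_sib(42):", timeit.timeit(lambda: tree.left_sib(42), number=100))
print("iterate tree.nodes():", timeit.timeit(lambda: list(tree.nodes()), number=100))
```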
It would be nice to track the memory usage here, but this may be much more difficult and not worth the effort.
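
If memory tracking were attempted despite that, one rough option (Unix-only, and only a sketch) is to sample peak RSS with `resource.getrusage` around an operation; Python-level tools such as `tracemalloc` generally miss allocations made directly in C extensions, which is part of why this is harder than the CPU timings:

```python
import resource

import tskit

# Note: ru_maxrss is reported in kilobytes on Linux and bytes on macOS.
before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
ts = tskit.load("standard.trees")  # placeholder for the standard benchmark file
after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print("peak RSS increase while loading:", after - before)
```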