Performance statistics report #2444

@jeromekelleher

Description

It would be very useful to have some simple microbenchmarks we could run from time to time. We've tried using ASV, but it's very heavyweight and collects masses of data that are never used.

Requirements:

  • Make a script perf_benchmark.py which runs a standard set of microbenchmarks and outputs the results to a file. The idea is that this file should be updated before every release, so that we can spot any perf regressions and also keep a per-release record of the benchmarks in the repository history. (A minimal harness sketch follows this list.)
  • Update the developer docs to include how and when to run these benchmarks as part of the release process.
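A minimal sketch of how such a script could be structured, assuming a JSON output file and a placeholder input file name (neither is specified in this issue):

```python
# perf_benchmark.py -- minimal sketch of a microbenchmark harness.
# "benchmark.trees" and the JSON output format are illustrative assumptions.
import json
import time

import tskit

STANDARD_FILE = "benchmark.trees"


def benchmark(func, repeats=5):
    # Best-of-N wall-clock time is less noisy than a single run or the mean.
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return min(times)


def main():
    ts = tskit.load(STANDARD_FILE)
    results = {
        "load": benchmark(lambda: tskit.load(STANDARD_FILE)),
        "iterate_trees": benchmark(lambda: sum(1 for _ in ts.trees())),
    }
    with open("perf_results.json", "w") as f:
        json.dump(results, f, indent=2)


if __name__ == "__main__":
    main()
```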

CPU time performance benchmarks:

Given a standard file (the operations themselves are sketched in the snippet after this list):

  • time to load
  • time to save
  • time to access ts.tables in a loop
  • time to access tables.nodes, tables.individuals, etc.
  • time to access columns, e.g. nodes.flags
  • time to get first tree, ts.first()
  • time to seek to middle tree
  • time to iterate over all trees
  • time to access tree arrays, tree.parent_array etc.
  • time to decode all variants
  • time to iterate over all rows, with and without metadata
  • time to write to VCF (writing to devnull)
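Roughly what the measured bodies could look like against the tskit Python API; the file name is an assumption, and the timing wrapper from the harness sketch above is omitted:

```python
import os

import tskit

ts = tskit.load("benchmark.trees")            # assumed standard file

ts.dump("/tmp/benchmark_copy.trees")          # save

for _ in range(10):                           # ts.tables in a loop
    tables = ts.tables

nodes = tables.nodes                          # table access
flags = nodes.flags                           # column access (numpy array)

first = ts.first()                            # first tree
middle = ts.at(ts.sequence_length / 2)        # seek to the middle tree
n_trees = sum(1 for _ in ts.trees())          # iterate over all trees

parents = first.parent_array                  # tree arrays

n_variants = sum(1 for _ in ts.variants())    # decode all variants

for node in ts.nodes():                       # rows, decoding metadata
    _ = node.metadata
for node in ts.nodes():                       # rows, without touching metadata
    pass

with open(os.devnull, "w") as devnull:        # VCF output to devnull
    ts.write_vcf(devnull)
```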

Additional ones as I thought of them (BJ), sketched below:

  • tree node arrays (postorder et al)
  • tree sequence row accessors (ts.node(42))
  • tree accessors (tree.left_sib(42))
  • iterate over tree.nodes
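Again only the measured bodies, assuming a reasonably recent tskit with Tree.postorder() and the same placeholder input file:

```python
import tskit

ts = tskit.load("benchmark.trees")   # assumed standard file
tree = ts.first()

order = tree.postorder()             # tree node arrays (postorder traversal)
node = ts.node(42)                   # tree sequence row accessor
sib = tree.left_sib(42)              # single-node tree accessor
n = sum(1 for _ in tree.nodes())     # iterate over tree.nodes
```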

It would be nice to track the memory usage here, but this may be much more difficult and not worth the effort.
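If it ever does seem worth doing, one low-effort option would be to record the process peak RSS from the standard library rather than anything finer grained; a sketch, not a commitment:

```python
# Peak RSS captures allocations made in the C library as well as in Python,
# but it is process-wide and monotone, so it only bounds usage from above.
# Note the resource module is not available on Windows.
import resource
import sys

import tskit


def peak_rss_bytes():
    # ru_maxrss is reported in kilobytes on Linux and bytes on macOS.
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss if sys.platform == "darwin" else rss * 1024


ts = tskit.load("benchmark.trees")   # assumed standard file
print(f"peak RSS after load: {peak_rss_bytes() / 1e6:.1f} MB")
```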

Labels

Infrastructure and tools, Performance
