Topiary Quite Slow on an HPC System #51

@AlexT97

Hello,

I'm a Research Computing Facilitator at a university HPC center. I've been working with some users who want to run Topiary on our systems, but we've been running into serious performance issues: their jobs take substantially longer than expected. One user requested a year of runtime for their data.

I've installed Topiary several times for our users in their home directories and compiled its GeneRax and RAxML dependencies for them as well. In general, we build with the following setup:

Python:  Anaconda with Python 3.12.4, with a Conda environment built according to the docs
Compiler:  GNU 8.5.0 or higher (I've tried 13.2.0)
MPI:  OpenMPI 4.1.0 (I've also tried 4.1.6) built against the GNU compilers above

RAxML Options:  AVX disabled and enabled (tried both), with MPI
GeneRax Options:  AVX disabled and enabled (tried both), with MPI

mpi4py:  I force a recompilation of mpi4py with pip's --no-cache-dir option so that it compiles against our OpenMPI.
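
In case it helps with checking that last point, here is a minimal sketch of how the MPI library that mpi4py actually linked against could be confirmed from inside the Conda environment. It assumes mpi4py is importable there; the helper names `first_banner_line` and `linked_mpi_banner` are made up for illustration and are not part of Topiary or mpi4py.

```python
import importlib.util
from typing import Optional


def first_banner_line(banner: str) -> str:
    """Return the first non-empty line of an MPI version banner (hypothetical helper)."""
    lines = banner.strip().splitlines()
    return lines[0].strip() if lines else ""


def linked_mpi_banner() -> Optional[str]:
    """Return a one-line description of the MPI library mpi4py is linked to.

    Returns None when mpi4py is not installed in the active environment.
    """
    if importlib.util.find_spec("mpi4py") is None:
        return None
    # Importing mpi4py.MPI initializes the MPI runtime.
    from mpi4py import MPI
    # Get_library_version() reports the banner of the MPI library in use.
    return first_banner_line(MPI.Get_library_version())
```

Calling `print(linked_mpi_banner())` in the same environment the jobs use should show whether the reported library is the site OpenMPI build or a different MPI pulled in by a prebuilt package.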

Also, if it helps, we have mostly mid-range nodes here (AMD EPYCs with 96 cores and a lot of older nodes with lower core counts). We also have GPFS as our parallel storage system, which is where all user files get stored.

  1. Have you experienced situations where Topiary gets this slow, or otherwise worryingly slow? If so, how did you get around it?

  2. Do you see anything in the setup I mentioned that I may be missing?

Thanks!
