buffer size issue #124

Hello

We have installed PartitionFinder on our cluster and noticed strange behavior when we increase the number of threads (not MPI) in combination with the --raxml option to process a huge dataset:

The whole PartitionFinder process stays frozen, waiting for raxml.linux subprocesses that are often marked as zombies.

We noticed the same behavior with the example nucleotide dataset, even with -p 8.

Using the debugging option and --save-phylofiles, we checked whether something was wrong with RAxML itself. Launched sequentially on its own, outside PartitionFinder and on the same data, every RAxML process runs without any problem.

We suspected a buffer size problem (the buffer size is not set in the subprocess.Popen call). We set a comfortably large one and could get further in the data processing, but the main process still ends up blocked in the same way.
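
For context on the buffer-size hypothesis: the `bufsize` argument of `subprocess.Popen` controls buffering of the Python-side pipe file objects, not the capacity of the kernel pipe (typically 64 KiB on Linux). A child that writes more than that to a pipe nobody reads will block regardless of `bufsize`. Whether this is what happens inside PartitionFinder has not been verified here. The sketch below only shows that draining both pipes with `communicate()` lets a child emit far more than a pipe's capacity. `run_and_drain` is a hypothetical helper name.

```python
import subprocess
import sys


def run_and_drain(args):
    """Run a child process and read its stdout/stderr until it exits."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() keeps reading both pipes while the child runs, so the
    # child never blocks on a full pipe, whatever bufsize is.
    stdout, stderr = p.communicate()
    return p.returncode, stdout, stderr


# Illustrative child: writes 1,000,000 bytes, well above a 64 KiB pipe.
child = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000000)"]
returncode, out, err = run_and_drain(child)
```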

I changed run_program in partfinder/util.py to the following, replacing subprocess.Popen with plain old os.system, and now everything works:

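import os
import uuid

# log and ExternalProgramError come from the surrounding partfinder/util.py
def run_program(binary, command):
    # Redirect stdout/stderr to uniquely named files instead of pipes
    unique_filename = uuid.uuid4()
    command = "\"%s\" %s 2> %s.err > %s.out" % (binary, command, unique_filename, unique_filename)
    log.debug("Running '%s'", command)
    # os.system blocks until the shell command has finished
    returncode = os.system(command)
    if returncode != 0:
        # Keep the .err/.out files so the failure can be inspected
        raise ExternalProgramError("Exit %s: %s" % (returncode, command), "see %s.err %s.out files in project folder" % (unique_filename, unique_filename))
    else:
        # Success: remove the temporary output files
        os.remove("%s.err" % unique_filename)
        os.remove("%s.out" % unique_filename)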
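
A side note on the value returned by os.system: on Unix it is the wait status, not the plain exit code, so an exit code of 1 is reported as 256 in the "Exit %s" message. A minimal sketch of how it could be decoded follows. `exit_code_from_system` is a hypothetical helper, not part of PartitionFinder, and it is Unix-only.

```python
import os
import sys


def exit_code_from_system(status):
    """Translate an os.system() return value into an exit code (Unix only).

    Returns the negated signal number if the command was killed by a signal.
    """
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return status


# Illustrative command: a Python child that exits with code 3.
status = os.system('"%s" -c "import sys; sys.exit(3)"' % sys.executable)
code = exit_code_from_system(status)
```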


I have not tested another, older solution found here: https://bugs.python.org/issue12739
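
For reference, here is a minimal sketch of what a Popen-based alternative might look like. It assumes the hang comes from pipe file descriptors leaking into sibling child processes started at the same time from other threads, which is the scenario that Python issue appears to describe. The names `run_program_popen` and `_popen_lock` are hypothetical, and this has not been tested with PartitionFinder or RAxML.

```python
import shlex
import subprocess
import threading

# Hypothetical module-level lock: only one thread creates a child at a time.
_popen_lock = threading.Lock()


def run_program_popen(binary, command):
    """Run `binary` with the arguments in `command`; return (stdout, stderr)."""
    args = [binary] + shlex.split(command)
    with _popen_lock:
        # close_fds=True stops the child from inheriting pipe ends that
        # belong to other children (the default is False on Python 2.7).
        p = subprocess.Popen(args, stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE, close_fds=True)
    # Drain both pipes outside the lock so children still run in parallel.
    stdout, stderr = p.communicate()
    if p.returncode != 0:
        raise RuntimeError("Exit %s: %s" % (p.returncode, " ".join(args)))
    return stdout, stderr
```

The lock covers only process creation, so the external programs themselves still run concurrently.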

The tests were performed with:

partitionfinder-2.1.1
python 2.7.13
RAxML 8.2.9 compiled with gcc 4.4.7 (tests also made with gcc 6.1.0)
CentOS release 6.5 (Final)

on a Dell PowerEdge C6220 (2 x Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz, 10 cores with Hyper-Threading, 256G RAM)

The initial command line was: python PartitionFinder.py -p 20 --raxml --no-ml-tree examples/nucleotide/

We also noticed that -p 2 gave roughly the same processing time as -p 20.

Yours faithfully


Patrice Déhais
