DNA sequence comparison tool implementing Approximate Pattern Matching (APM) algorithm using levenshtein's distance.
This project combines OpenMP and MPI parallelization for a cluster of multicore machines
If you just want to compare performances between the sequential and parallel implementations, you might want to use our test script:
# Basic script that test 3 patterns on a one-line DNA file with approximation distance of 0
./test.shIf you want to change the patterns, the approximation distance or the test file, just modify those lines in test.sh:
PATTERNS="CAG GTACAT GGG" # List of patterns to test
FILE="dna/line_chrY.fa" # DNA file
APPROXIMATION=0 # Approximation distanceOr if you want to execute one implementation, follow those steps :
# Generate the binaries
make- Sequential execution :
./apm approximation_factor dna_database pattern1 pattern2 ...- Parallel execution :
With OpenMP:
export OMP_NUM_THREAD=number_of_cores;
./apmOMP approximation_factor dna_database pattern1 pattern2 ...With MPI:
mpirun -np number_of_machines -f hosts ./apmMPI approximation_factor dna_database pattern1 pattern2 ...Hybrid (OpenMP + MPI) :
export OMP_NUM_THREAD=number_of_cores;
mpirun -np number_of_machines -f hosts ./apmParallel approximation_factor dna_database pattern1 pattern2 ...- OpenMP Benchmark :
./run.sh | tee run.data
gnuplot speedup.sh
display SpeedUpAPM-OMP.png- MPI Benchmark :
./mpi_speedup.sh | tee mpi_speedup.data
gnuplot mpi_speedup.sh
display SpeedUpAPM-MPI.pngDetailed benchmark and results analysis to be found at apm/results/Slides CSC5001.pdf