Smith-Waterman algorithm implementations for Introduction to Scientific Computing course

This repository contains code of some simple implementations and testing code for the Smith-Waterman algorithm. It was created as a final project for the Introduction to Scientific Computing course, taught in autumn 2024 at the Galielian School of Higher Education).

Introduction to the problem and the Smith-Waterman algorithm

Genome and protein alignment are fundamental bio-informatics techniques used to compare sequences of DNA, RNA, or proteins to identify regions of similarity. These alignments reveal evolutionary relationships, functional similarities, or conserved regions across species or within different genes and proteins of the same organism. Genomes can consist of billions of base pairs (the human genome has over 3 billion base pairs!). Aligning such large datasets requires significant computational resources.

The Smith-Waterman algorithm is a dynamic programming algorithm used specifically for local sequence alignment, that is to identify the most similar subsequences between two sequences.

This repository

We have experimented mainly with three different approaches using OpenMP:

Using locks, a block division using locks to syncronize.
Using atomic counters
Using tasks, a Divide&Conquer recursive approach.

The relative implementations are in the src/ folder, along with some testing code.

Future work

Some nice things could be tried in the future: A number of different approaches are possible:

MPI could be well suited to solve this problem: the communication between processes is much smaller than the computation done by each of them.
GPUs are widely used in the literature for this algorithm, though this is probably very tricky.

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
.github/workflows		.github/workflows
data		data
presentation		presentation
src		src
.gitignore		.gitignore
1M.fa		1M.fa
LICENSE		LICENSE
README.md		README.md
ReferenceDataGenerator.py		ReferenceDataGenerator.py
SmithWatermanTestGen.py		SmithWatermanTestGen.py
archive.tar.gz		archive.tar.gz
data.sh		data.sh
makefile		makefile
smith_waterman		smith_waterman
test.sh		test.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Smith-Waterman algorithm implementations for Introduction to Scientific Computing course

Introduction to the problem and the Smith-Waterman algorithm

This repository

Future work

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 3

Uh oh!

Languages

License

GiovaGa/smith-waterman_SciComp

Folders and files

Latest commit

History

Repository files navigation

Smith-Waterman algorithm implementations for Introduction to Scientific Computing course

Introduction to the problem and the Smith-Waterman algorithm

This repository

Future work

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

Packages