glhap

DOCUMENTATION

This software provides a way to determine haplogroups based on genotype likelihoods.

Dependencies

C:

htslib

https://github.com/samtools/htslib

Python

anytree 2.8.0
numpy 1.17.4
pysam 0.18.0

You may also need PhyloTree.org-parser to build the haplogrep tree in JSON format: https://github.com/alexeyshmelev/PhyloTree.org-parser/tree/main

How to run:

Calculate haplogroups with samtools likelihood model:

glhap.py

You can simply run:

./glhap.py array/array.json {vcf_file}.vcf

array.json is the provided phylotree17 tree in JSON format.

The program outputs the top 10 haplogroups with their log-likelihoods.
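The exact output format of glhap.py is not documented here, so the following is only a minimal sketch of what "top 10 by log-likelihood" means. The function name `top_haplogroups` and the scores are hypothetical and for illustration only; they are not taken from glhap.

```python
import heapq


def top_haplogroups(log_likelihoods, n=10):
    """Return the n (haplogroup, log-likelihood) pairs with the highest score.

    log_likelihoods: dict mapping haplogroup name -> log-likelihood.
    Higher (less negative) log-likelihood means a better fit.
    """
    return heapq.nlargest(n, log_likelihoods.items(), key=lambda item: item[1])


# Made-up scores, for illustration only (not real glhap output).
scores = {"H2a2a1": -12.5, "U5a1": -40.2, "K1a": -33.7}

for name, ll in top_haplogroups(scores, n=2):
    print(f"{name}\t{ll:.2f}")
```

Because log-likelihoods are compared on a log scale, the difference between two entries is the log of their likelihood ratio.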

Calculate haplogroups with hamming distance model:

How to use:

g++ hamgrep.c -lhts -o hamgrep

./hamgrep f1.fa

f1.fa is a FASTA file with the called DNA sequence. You must have a "filelist.txt" file in the same folder as the executable; it must list the FASTA files.
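For readers unfamiliar with the metric, here is a minimal sketch of a Hamming distance between two aligned sequences. It is not the hamgrep implementation; in particular, skipping positions that contain `N` is an assumption made here for illustration.

```python
def hamming_distance(seq_a, seq_b, ignore="N"):
    """Count positions where two equal-length sequences differ.

    Positions where either sequence has a character listed in `ignore`
    are skipped (an assumption of this sketch, not confirmed hamgrep behaviour).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


# One real mismatch (G vs C); the N position is skipped.
print(hamming_distance("ACGTN", "ACCTA"))
```

Under this model, the haplogroup whose reference sequence has the smallest distance to the sample would be the best match.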

Calculate haplogroups with gatk likelihood model:

g++ hamgrep.c -lhts -o hamgrep

./hamgrep {pileup}.pileup

{pileup}.pileup is a pileup file; you can generate it with:

samtools mpileup -f ref.fa -B -o pileup.pileup in.bam

You must have a "filelist.txt" file in the same folder as the executable; it must list the FASTA files.
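To inspect the pileup file before passing it on, the sketch below parses one line of single-sample `samtools mpileup` output into its six tab-separated columns (chromosome, position, reference base, depth, read bases, base qualities). The names `PileupRecord` and `parse_pileup_line` are hypothetical helpers, and the code assumes the default Phred+33 quality encoding.

```python
from typing import List, NamedTuple


class PileupRecord(NamedTuple):
    chrom: str
    pos: int          # 1-based position
    ref: str          # reference base
    depth: int        # number of reads covering the position
    bases: str        # read bases string, left undecoded here
    quals: List[int]  # Phred base qualities


def parse_pileup_line(line):
    """Parse one line of six-column, single-sample mpileup output."""
    chrom, pos, ref, depth, bases, quals = line.rstrip("\n").split("\t")[:6]
    # Qualities are stored as ASCII characters offset by 33.
    return PileupRecord(chrom, int(pos), ref, int(depth), bases,
                        [ord(c) - 33 for c in quals])


record = parse_pileup_line("chrM\t73\tA\t3\tG.,\tIII\n")
print(record.depth, record.quals)
```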

gl_cont

This software estimates contamination in DNA based on quality scores.
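The model used by gl_cont is not described in this README. As background only, the sketch below shows the standard Phred definition that links a base quality score Q to the probability that the base call is wrong, P = 10^(-Q/10). The function name is hypothetical.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the base-call error probability."""
    return 10 ** (-q / 10)


# Q20 corresponds to a 1% error rate, Q30 to 0.1%.
for q in (10, 20, 30):
    print(q, phred_to_error_prob(q))
```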

Dependencies

simlord 1.0.4
numpy 1.22.3
tqdm 4.64.0
biopython 1.78
cython 0.29.30
pysam 0.19.1
Picard
GATK

How to run:

Before the first use, build the extension:

python setup.py build_ext --inplace

then

./gl_cont.py bam.bam ref.ref contaminants.fa nIter

where nIter is the number of MCMC iterations.

Simulate data

You can generate data with a script based on simlord:

python contamsim.py hap1.fa prop1 [...] hapn.fa propn

Here prop1 is the number of reads divided by 100000.
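As a worked example of the arithmetic, the sketch below converts desired read counts into `prop` values and assembles the argument list. The helper names are hypothetical, and it assumes every `prop` value is computed the same way as prop1.

```python
READS_PER_UNIT = 100_000  # divisor stated in the README


def reads_to_prop(n_reads):
    """Convert a desired number of reads into a prop argument."""
    return n_reads / READS_PER_UNIT


def build_command(samples):
    """Build the contamsim.py argument list from (fasta, n_reads) pairs."""
    args = ["python", "contamsim.py"]
    for fasta, n_reads in samples:
        args += [fasta, str(reads_to_prop(n_reads))]
    return args


# 90000 reads from hap1.fa and 10000 reads from hap2.fa.
print(" ".join(build_command([("hap1.fa", 90000), ("hap2.fa", 10000)])))
```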
