Thyerlab/nanopore_amplicon_seq

Generate DMS sequence map

For the analysis, see <Nanopore_ampliconseq_intphoCKOD_analysis.ipynb>.

  • The DMS oligo library is generated with the SPINE tool as modified by the Thyer lab, published at https://github.com/chillwei/SPINE_Thyer.

  • The deep mutational scanning (DMS) tool splits the gene into fragments so that single amino acid mutations (insertions, deletions, and replacement with any of the 20 amino acids other than the wild-type residue) can be introduced in synthetic oligos. Each oligo covers one fragment of the gene and carries one amino acid mutation. The fragment size depends on the gene length and the synthetic oligo length.

  • The DMS sequence map is generated from the sequences in the DMS synthetic oligo ordering sheet.

  • From the ID and DNA sequence of each DMS synthetic oligo, we can extract the mutation type and mutation location, and recover the mutant DNA sequence and protein sequence.
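As an illustration of the last point, the mutation type and location can also be recovered by comparing a mutant protein sequence with the wild type. The sketch below is an assumption, not the notebook's method: the function name `classify_mutation` and the returned fields are hypothetical, and the oligo ID format used in the ordering sheet is not parsed here. It trims the shared prefix and suffix and reports what is left as the change. Positions are 1-based, and within repeated residues an insertion or deletion is reported at the right-most equivalent position.

```python
def classify_mutation(wt: str, mut: str) -> dict:
    """Describe the change between a wild-type and a mutant protein sequence.

    Hypothetical helper: assumes the mutant differs from the wild type
    by one contiguous change (substitution, insertion, or deletion).
    """
    if wt == mut:
        return {"type": "wild_type", "position": None, "wt": "", "mut": ""}

    shortest = min(len(wt), len(mut))

    # Length of the shared prefix.
    p = 0
    while p < shortest and wt[p] == mut[p]:
        p += 1

    # Length of the shared suffix, not overlapping the prefix.
    s = 0
    while s < shortest - p and wt[-1 - s] == mut[-1 - s]:
        s += 1

    if len(mut) == len(wt):
        kind = "substitution"
    elif len(mut) > len(wt):
        kind = "insertion"
    else:
        kind = "deletion"

    return {
        "type": kind,
        "position": p + 1,              # 1-based position of the first changed residue
        "wt": wt[p:len(wt) - s],        # residues lost from the wild type
        "mut": mut[p:len(mut) - s],     # residues gained in the mutant
    }


if __name__ == "__main__":
    wild_type = "MKTAYIAK"  # made-up example sequence
    for mutant in ("MKTWYIAK", "MKTAGYIAK", "MKTYIAK"):
        print(mutant, classify_mutation(wild_type, mutant))
```

A sequence with several separate changes is not handled specially: the whole span between the first and last difference is reported as a single change.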

Profiling refined nanopore amplicon seq data against the DMS oligo info map

Workflow for data refinement

  • Raw nanopore sequencing data are quality-filtered with fastp.

  • Each amplicon seq submission may contain several different libraries. vfind (https://github.com/nsbuitrago/vfind) is used to align the barcodes to a specific library and pull out that library. Because library extraction relies on aligning the 2 barcodes in each nanopore read, the alignment threshold is adjustable. At present this step can return a mixed library population, which is not a concern because further filtering refines the data.
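To make the role of the adjustable threshold concrete, here is a minimal sketch of assigning a read to a library by its two barcodes. It does not use vfind or reproduce its alignment algorithm: it substitutes a plain sliding-window mismatch count (Hamming distance), and the names `best_hamming`, `assign_library`, `max_mismatches`, and the barcode sequences are all hypothetical.

```python
def best_hamming(read: str, barcode: str) -> int:
    """Fewest mismatches between the barcode and any same-length window of the read."""
    k = len(barcode)
    if len(read) < k:
        return k  # read too short: treat as a complete mismatch
    return min(
        sum(a != b for a, b in zip(read[i:i + k], barcode))
        for i in range(len(read) - k + 1)
    )


def assign_library(read: str, libraries: dict, max_mismatches: int = 1) -> list:
    """Return the names of all libraries whose two barcodes are both found in the read.

    `libraries` maps a library name to a (barcode_1, barcode_2) tuple.
    More than one name can come back when the threshold is loose.
    """
    return [
        name
        for name, (bc1, bc2) in libraries.items()
        if best_hamming(read, bc1) <= max_mismatches
        and best_hamming(read, bc2) <= max_mismatches
    ]


if __name__ == "__main__":
    # Made-up barcodes for illustration only.
    libraries = {"libA": ("ACGTAC", "TTGGCC"), "libB": ("GGATCC", "CATGCA")}
    read = "NNACGAACAAAATTGGCCNN"  # one sequencing error inside the first barcode
    print(assign_library(read, libraries, max_mismatches=0))
    print(assign_library(read, libraries, max_mismatches=1))
```

Raising the threshold recovers more reads with sequencing errors at the cost of letting more off-target reads through, which is why the downstream filters matter.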

Further filtering pipeline:

  • Length filter; remove sequences containing stop codons.

  • Map the filtered data to the DMS oligo info sheet.

    The current mapping is based on protein sequence, but the mutant DNA sequence is also available for other purposes.

This filtering discards a fair share of the mutant population, namely variants that picked up multiple mutations during library prep and selection. A separate pipeline for this population is in progress.
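The filtering and mapping steps above can be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: reads are assumed to be already trimmed to the in-frame coding region without the terminal stop codon, the DMS map is assumed to be a dictionary from mutant protein sequence to a variant label, and `translate`, `map_reads`, `min_len`, and `max_len` are hypothetical names. The standard genetic code is built inline so the example has no dependencies.

```python
from collections import Counter

# Standard genetic code in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate in frame 1; unknown codons become 'X', trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def map_reads(reads, dms_map: dict, min_len: int, max_len: int) -> Counter:
    """Count reads per DMS variant after the length and stop-codon filters."""
    counts = Counter()
    for read in reads:
        if not (min_len <= len(read) <= max_len):
            continue                      # length filter
        protein = translate(read)
        if "*" in protein:
            continue                      # drop reads containing a stop codon
        variant = dms_map.get(protein)    # exact match on protein sequence
        if variant is not None:
            counts[variant] += 1
    return counts


if __name__ == "__main__":
    # Toy map and reads for illustration only.
    dms_map = {"MKT": "WT", "MAT": "K2A"}
    reads = ["ATGAAAACC", "ATGGCTACC", "ATGTAAACC", "ATGAAA", "ATGGGGACC"]
    print(map_reads(reads, dms_map, min_len=9, max_len=9))
```

Because the lookup is an exact match on the protein sequence, reads carrying more than one mutation find no entry in the map and are left uncounted.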

Python Dependencies

biopython==1.78

numpy==1.26.4

pandas==2.2.1

pyfastx==2.1.0

python-dateutil==2.9.0.post0

pytz==2024.1

six==1.16.0

tzdata==2024.1
