For the analysis, see the notebook <Nanopore_ampliconseq_intphoCKOD_analysis.ipynb>.
-
The DMS oligo library is generated with the SPINE tool, as modified by the Thyer lab. The modified version is published at https://github.com/chillwei/SPINE_Thyer.
-
The deep mutational scanning (DMS) tool fragments the gene so that single amino acid mutations (insertions, deletions, and replacement with each of the 20 amino acids except the wild-type residue) can be introduced through synthetic oligos. Each oligo covers one chunk of the gene and carries one amino acid mutation. The fragment size depends on the gene length and the Synoligo length.
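To make the fragment-and-mutate idea concrete, here is a minimal sketch at the protein level. It is not the SPINE implementation: the function names, the fixed chunk length, and the mutation labels are assumptions made for illustration, and insertions are left out because the inserted sequence is not specified here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def split_into_chunks(protein, chunk_len):
    """Split a protein into consecutive chunks of at most chunk_len residues.

    Returns (start_index, chunk) pairs; chunk_len is a hypothetical stand-in
    for the fragment size derived from gene length and oligo length.
    """
    return [
        (start, protein[start:start + chunk_len])
        for start in range(0, len(protein), chunk_len)
    ]


def single_substitutions(protein):
    """Yield (label, mutant) for every non-wild-type substitution."""
    for i, wt in enumerate(protein):
        for aa in AMINO_ACIDS:
            if aa != wt:
                yield f"{wt}{i + 1}{aa}", protein[:i] + aa + protein[i + 1:]


def single_deletions(protein):
    """Yield (label, mutant) for every single-residue deletion."""
    for i, wt in enumerate(protein):
        yield f"{wt}{i + 1}del", protein[:i] + protein[i + 1:]
```

Each position contributes 19 substitutions and one deletion, so a protein of N residues yields 19 x N substitution variants in this sketch.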
-
The DMS sequence map is generated from the sequences in the DMS synthetic oligo ordering sheet.
-
From the ID and DNA sequence of each DMS Synoligo, we can extract the mutation type and mutation location, and recover the mutant DNA sequence and protein sequence.
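The oligo ID format is not described here, so the following sketch takes a different route and infers the mutation type and position by comparing the translated mutant against the wild type. It uses a hand-built standard codon table to stay self-contained; in the actual environment Biopython's `Seq.translate` could be used instead. The function names and return values are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def describe_mutation(wt, mut):
    """Classify a mutant protein relative to wild type.

    Returns (kind, position) with a 1-based position, where kind is one of
    "wild_type", "substitution", "deletion", "insertion", or "multiple".
    """
    if wt == mut:
        return ("wild_type", None)
    p = 0  # length of the shared prefix
    while p < min(len(wt), len(mut)) and wt[p] == mut[p]:
        p += 1
    if len(mut) == len(wt) and wt[p + 1:] == mut[p + 1:]:
        return ("substitution", p + 1)
    if len(mut) == len(wt) - 1 and wt[p + 1:] == mut[p:]:
        return ("deletion", p + 1)
    if len(mut) == len(wt) + 1 and wt[p:] == mut[p + 1:]:
        return ("insertion", p + 1)
    return ("multiple", None)
```

For example, `describe_mutation(translate("ATGAAAACC"), translate("ATGGCAACC"))` compares `MKT` with `MAT` and reports a substitution at position 2.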
-
Raw nanopore sequencing data are quality filtered with fastp.
-
Each amplicon sequencing submission may contain multiple libraries. vfind (https://github.com/nsbuitrago/vfind) is used to align the barcodes to a specific library and pull out that library's reads. Because library extraction relies on aligning the 2 barcodes in each nanopore read, the alignment threshold is adjustable. At present this step returns a mixed library population, which is not a concern because further filtering refines the data.
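The effect of an adjustable threshold can be illustrated with a small stand-alone sketch. This is not how vfind works internally and does not use its API: it simply slides each barcode along the read, counts mismatches (ungapped), and accepts a library when both barcodes fall within a hypothetical `max_mismatches` limit. The library names and barcode sequences in the usage note are made up.

```python
def best_match(read, barcode):
    """Return (mismatches, start) for the best ungapped placement of barcode.

    If the read is shorter than the barcode, start is -1.
    """
    best = (len(barcode) + 1, -1)
    for start in range(len(read) - len(barcode) + 1):
        window = read[start:start + len(barcode)]
        mismatches = sum(a != b for a, b in zip(window, barcode))
        if mismatches < best[0]:
            best = (mismatches, start)
    return best


def assign_library(read, libraries, max_mismatches=1):
    """Assign a read to the first library whose two barcodes both match.

    libraries maps a library name to (upstream_barcode, downstream_barcode).
    Returns the library name, or None if no library passes the threshold.
    """
    for name, (bc_up, bc_down) in libraries.items():
        mm_up, pos_up = best_match(read, bc_up)
        mm_down, pos_down = best_match(read, bc_down)
        in_order = 0 <= pos_up < pos_down
        if mm_up <= max_mismatches and mm_down <= max_mismatches and in_order:
            return name
    return None
```

With `libraries = {"libA": ("ACGTAC", "TTGGCC")}`, a read carrying one error in the upstream barcode is assigned to `libA` at `max_mismatches=1` and rejected at `max_mismatches=0`. A looser threshold recovers more reads at the cost of a more mixed population, which the later filters then clean up.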
-
Filter reads by length and remove sequences that contain stop codons.
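A minimal sketch of this step, assuming it operates on translated reads and that "get rid of stop codon" means discarding any sequence whose translation contains a stop (`*`). The length bounds are hypothetical parameters; suitable values depend on the expected amplicon size.

```python
def passes_filters(protein, min_len, max_len):
    """True if the translated read is within the length window and has no stop."""
    return min_len <= len(protein) <= max_len and "*" not in protein


def filter_proteins(proteins, min_len, max_len):
    """Keep only translated reads that pass both filters, preserving order."""
    return [p for p in proteins if passes_filters(p, min_len, max_len)]
```

For instance, `filter_proteins(["MKT", "M*T", "MKTAYG"], 2, 4)` keeps only `"MKT"`: the second read contains a stop and the third is too long.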
-
Map the filtered data to the DMS oligo info sheet.
The current mapping is based on protein sequence, but the mutant DNA sequence is also available for other purposes.
This filtering discards a fair amount of the mutant population that acquired multiple mutations during library prep and selection. A further pipeline will be set up for this population (in progress).
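The protein-level mapping can be sketched as an exact-match lookup. Here the DMS oligo info sheet is assumed to have been loaded into a dictionary from mutant protein sequence to variant ID; the names are hypothetical, and a pandas merge on the protein-sequence column would be an equivalent approach. Reads that match no designed variant, such as those with multiple mutations, are returned separately so they can be examined later.

```python
from collections import Counter


def map_reads_to_variants(read_proteins, variant_table):
    """Count reads per designed variant by exact protein-sequence match.

    variant_table maps a mutant protein sequence to its variant ID.
    Returns (counts, unmapped): a Counter keyed by variant ID and the list
    of protein sequences that matched no designed variant.
    """
    counts = Counter()
    unmapped = []
    for protein in read_proteins:
        variant_id = variant_table.get(protein)
        if variant_id is None:
            unmapped.append(protein)
        else:
            counts[variant_id] += 1
    return counts, unmapped
```

Keeping the unmapped reads rather than dropping them makes it straightforward to feed them into a separate multi-mutation analysis.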
biopython==1.78
numpy==1.26.4
pandas==2.2.1
pyfastx==2.1.0
python-dateutil==2.9.0.post0
pytz==2024.1
six==1.16.0
tzdata==2024.1