<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<title>GPS :: Global Pneumococcal Sequencing Project</title>
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://netdna.bootstrapcdn.com/bootstrap/3.1.0/css/bootstrap.min.css">
<link href='https://fonts.googleapis.com/css?family=Open+Sans:400italic,600italic,400,300,600' rel='stylesheet'
type='text/css'>
<!-- fonts, icons for bootstrap -->
<link href="https://netdna.bootstrapcdn.com/font-awesome/4.0.3/css/font-awesome.css" rel="stylesheet">
<link href="css/styles.min.css" rel="stylesheet">
<!-- HTML5 Support for IE -->
<!--[if lt IE 9]>
<script src="https://cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.2/html5shiv.min.js"></script>
<![endif]-->
<!-- Favicon -->
<link rel="shortcut icon" href="gps.ico">
</head>
<body>
<!-- Header Starts -->
<header>
<div class="container">
<div class="row">
<div class="logo">
<h1><i class=""></i><a href="index.html"><span class="color">GPS</span></a></h1>
<div class="hmeta">Global Pneumococcal Sequencing Project</div>
</div>
</div>
</div>
</header>
<!-- Navigation bar starts -->
<div class="navbar navbar-default bs-docs-nav top-menu" role="banner">
<div class="container">
<div class="navbar-header">
<button class="navbar-toggle" type="button" data-toggle="collapse" data-target=".bs-navbar-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<nav class="collapse navbar-collapse bs-navbar-collapse" role="navigation">
<ul class="nav navbar-nav">
<li>
<a id="home" href="index.html">
<!-- <i class="fa fa-home blue"> </i> --> Home
</a>
</li>
<li class="dropdown">
<a class="dropdown-toggle" data-toggle="dropdown" href="#">
About <b class="caret"></b>
</a>
<ul class="dropdown-menu">
<li class=""><a id="summary" href="project_outline.html">Project Outline</a></li>
<li class=""><a id="summary" href="the_team.html">The Team</a></li>
<li class=""><a id="founders" href="partners.html">Project Partners </a></li>
<li class=""><a id="substudies" href="substudies.html">Sub-studies</a></li>
</ul>
</li>
<li class="dropdown">
<a class="dropdown-toggle" data-toggle="dropdown" href="#">Resources <b class="caret"></b></a>
<ul class="dropdown-menu">
<li class=""><a id="resources_overview" href="resources_overview.html">Overview</a></li>
<li class=""><a id="sampling_map" href="sampling_map.html">Countries</a></li>
<li class=""><a id="GPSC_lineages" href="GPSC_lineages.html">Strains</a></li>
<li class=""><a id="serotypes" href="serotypes.html">Serotypes</a></li>
<!-- <li class=""><a id="public_database" href="/public_database.html">Database (public)</a></li> -->
<li class=""><a id="data_downloads" target="_blank" href="/dataviewer/gps/data">Database</a></li>
</ul>
</li>
<li><a id="publications" href="publications.html">Publications</a></li>
<li class="dropdown active">
<a class="dropdown-toggle" data-toggle="dropdown" href="#">Training <b class="caret"></b></a>
<ul class="dropdown-menu">
<li class=""><a id="training_browser_based" href="training_drag_and_drop.html">Drag and Drop Tools</a>
</li>
<li class=""><a id="training_command_line" href="training_command_line.html">Command Line</a></li>
</ul>
</li>
<li><a id="contact" href="contact.html">Contact</a></li>
</ul>
</nav>
</div>
</div>
<!-- Navigation bar ends -->
<div id="content">
<!-- Here goes the content for 'summary' menu item in the top navigation bar -->
<section>
<div class="container content justify">
<header>
<h1><i class="fa fa-pencil-square-o blue"></i> Training: Command Line</h1>
</header>
<div class="desc">
<!-- TODO: add training command line -->
<h3>In silico serotyping</h3>
<p>Install SeroBA (<a target="_blank"
href="https://www.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.000186">Epping et al.
2018</a>) by following the instructions at <a target="_blank"
href="https://github.com/sanger-pathogens/seroba#installation">https://github.com/sanger-pathogens/seroba#installation</a>,
then obtain the database by cloning the repository with git from <a target="_blank"
href="https://github.com/sanger-pathogens/seroba.git">https://github.com/sanger-pathogens/seroba.git</a>.
</p>
<p>Files required to run serotyping using SeroBA:</p>
<ol>
<li>paired-end fastq files</li>
<li>database</li>
<li>sample list (only for running on multiple samples)</li>
</ol>
<p>Run in silico serotyping on a single sample:</p>
<pre>seroba runSerotyping &lt;full path to the database&gt; &lt;read 1&gt; &lt;read 2&gt; &lt;output folder prefix&gt;</pre>
<p>Run in silico serotyping on multiple samples:</p>
<ol>
<li>
Create a list of sample names, one per line, and save it as samplelist (e.g. the sample name for
<em>24371_8#283_1.fastq.gz</em> is <em>24371_8#283</em>).
</li>
<li>
<pre>for f in $(cat samplelist); do seroba runSerotyping &lt;path to the database&gt; ${f}_1.fastq.gz ${f}_2.fastq.gz ${f}; done</pre>
</li>
<li>
<pre>seroba summary ./</pre>
</li>
</ol>
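<p>The sample list in step 1 can be generated from the read files instead of being typed by hand. The following is a minimal sketch, assuming all reads sit in one directory and follow the <em>_1.fastq.gz</em> / <em>_2.fastq.gz</em> naming shown in the example above. The function name <em>make_samplelist</em> and its directory argument are illustrative and are not part of SeroBA.</p>

```shell
# Sketch: build "samplelist" from paired-end read files named
# <sample>_1.fastq.gz and <sample>_2.fastq.gz (naming assumed from the example above).
# make_samplelist is a hypothetical helper, not a SeroBA command.
make_samplelist() {
    local dir="${1:-.}"
    local r1 sample
    for r1 in "$dir"/*_1.fastq.gz; do
        # Skip the unexpanded pattern when no file matches
        [ -e "$r1" ] || continue
        sample="$(basename "$r1" _1.fastq.gz)"
        # Only list samples whose read 2 file is also present
        if [ -e "$dir/${sample}_2.fastq.gz" ]; then
            printf '%s\n' "$sample"
        else
            printf 'warning: no read 2 file for %s\n' "$sample" >&2
        fi
    done
}
```

<p>Possible usage: <em>make_samplelist /path/to/reads &gt; samplelist</em>. Samples with a missing read 2 file are reported on standard error and left out of the list.</p>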
<p>
<strong>Output:</strong>
<br>
summary.tsv
</p>
<p>
<!-- TODO: link to instructions -->
These instructions are available to download here: <a href="in-silico-serotyping.txt"
download>Instructions</a>
</p>
<h3>GPSC assignment</h3>
<p>Install popPUNK (<a target="_blank" href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6360808/">Lees et al.
2018</a>) by following the instructions at <a target="_blank"
href="https://poppunk.readthedocs.io/en/latest/installation.html">https://poppunk.readthedocs.io/en/latest/installation.html</a>.
Then download the GPS reference database “GPS_query.tar.bz2” (<a target="_blank"
href="https://www.pneumogen.net/gps/GPS_query.tar.bz2">Database</a>) and the GPSC designations
“gpsc_definitive.csv” (<a target="_blank"
href="https://www.pneumogen.net/gps/gpsc_definitive.csv">CSV</a>).</p>
<p>Files required to run GPSC assignment using popPUNK:</p>
<ol>
<li>queries.txt: a list of paths to assemblies you wish to assign GPSCs to</li>
<li>GPS_query: the GPS reference database, obtained by uncompressing GPS_query.tar.bz2</li>
<li>gpsc_definitive.csv: Published GPSC designations for the references</li>
</ol>
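<p>The first two inputs can be prepared from the command line. The following is a minimal sketch, assuming the assemblies sit in one directory and use a <em>.fasta</em> extension (an assumption; adjust the pattern to your files). The helper names <em>make_queries</em> and <em>extract_db</em> are illustrative and are not part of popPUNK.</p>

```shell
# Sketch: prepare queries.txt and the GPS_query database.
# make_queries and extract_db are hypothetical helpers; the .fasta
# extension is an assumption about how the assemblies are named.
make_queries() {
    local dir="$1"
    local f
    for f in "$dir"/*.fasta; do
        # Skip the unexpanded pattern when no file matches
        [ -e "$f" ] || continue
        # One assembly path per line
        printf '%s\n' "$f"
    done
}

# Uncompress the downloaded archive in the current directory.
# This is expected to produce the GPS_query database directory.
extract_db() {
    local archive="${1:-GPS_query.tar.bz2}"
    [ -f "$archive" ] || { printf 'missing %s\n' "$archive" >&2; return 1; }
    tar -xjf "$archive"
}
```

<p>Possible usage: <em>make_queries /path/to/assemblies &gt; queries.txt</em>, followed by <em>extract_db GPS_query.tar.bz2</em>.</p>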
<p>
The output directory name is set with <em>--output</em>.
<br>
The number of threads can be changed with <em>--threads</em>.
</p>
<p>Run GPSC assignment:</p>
<pre>poppunk --assign-query --ref-db GPS_query --distances GPS_query/GPS_query.dists --model-dir GPS_query --q-files queries.txt --output GPSC_assignment --threads 8 --full-db --external-clustering gpsc_definitive.csv</pre>
<p>
<strong>Outputs:</strong>
<br>
<em>_clusters.csv</em>: popPUNK clusters with dataset specific nomenclature
<br>
<em>_external_clusters.csv</em>: GPSC v2 scheme designations
</p>
<p>Novel clusters: isolates in novel clusters are assigned NA in <em>_external_clusters.csv</em> because they
were not seen in the v2 dataset used to designate the GPSCs. The popPUNK <em>_clusters.csv</em> file can be
used to determine whether NA isolates belong to the same cluster.</p>
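<p>One way to check which NA isolates share a popPUNK cluster is sketched below. It assumes, without confirmation from the popPUNK documentation, that both files are comma-separated, start with a single header line, and hold the isolate name in column 1 and the cluster in column 2. The helper name <em>na_isolates</em> is illustrative.</p>

```shell
# Sketch: list isolates whose GPSC is NA together with their popPUNK cluster,
# so that NA isolates sharing a cluster can be spotted.
# Assumed layout (not confirmed): comma-separated, one header line,
# isolate name in column 1, cluster in column 2.
# na_isolates is a hypothetical helper name.
na_isolates() {
    local clusters="$1" external="$2"
    awk -F',' '
        FNR == 1 { next }                         # skip the header line of each file
        NR == FNR { cluster[$1] = $2; next }      # first file: remember each popPUNK cluster
        $2 == "NA" { print cluster[$1] "\t" $1 }  # second file: report NA isolates
    ' "$clusters" "$external" | sort
}
```

<p>Possible usage: pass the <em>_clusters.csv</em> file first and the <em>_external_clusters.csv</em> file second. Each output line holds a popPUNK cluster and an NA isolate, separated by a tab, so isolates from the same cluster appear on adjacent lines.</p>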
<p>To have novel clusters added to the database and a GPSC cluster name assigned, please email
<strong>globalpneumoseq@gmail.com</strong>. Before doing so, check for low-level contamination, which may
contribute to biased accessory distances.</p>
<p>Merged clusters: previously unsampled diversity may represent the missing variation linking two clusters,
in which case the GPSCs are merged. For example, if GPSC23 and GPSC362 merge, the GPSC is reported as GPSC23,
with a merge history of GPSC23;362.</p>
<p>These instructions are available to download here: <a download
href="https://www.pneumogen.net/gps/GPSC_README.rtf">Instructions</a></p>
</div>
</div>
</section>
<div id="modal"></div>
<footer>
<div class="container">
<div class="row copy">
<div class="col-lg-7 col-md-7 col-sm-12 copy-left">
<span>GPS © 2016</span> -
<a href="cookiespolicy.html" onclick="">Cookies policy</a> | <a href="legal.html">Terms &amp;
Conditions.</a>
</div>
<div class="col-lg-5 col-md-5 col-sm-12 copy-right">
This site is hosted by the <a href="https://www.sanger.ac.uk/">Wellcome Trust Sanger Institute</a>
</div>
</div>
</div>
<div class="clearfix"></div>
</footer>
<script type="text/javascript" src="https://code.jquery.com/jquery-latest.min.js"></script>
<script src="https://netdna.bootstrapcdn.com/bootstrap/3.1.0/js/bootstrap.min.js"></script>
<script src="js/custom.min.js"></script>
<script type="text/javascript" src="/zxtm/piwik2.js"></script>
<script type="text/javascript" src="https://www.google.com/jsapi"></script>
<script type="text/javascript">
/***** To show chart in the summary page ****/
google.load('visualization', '1', { packages: ['corechart'] });
google.setOnLoadCallback(drawChart);
</script>
</body>
</html>