-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathORIGINAL-README.txt
More file actions
90 lines (71 loc) · 4.97 KB
/
ORIGINAL-README.txt
File metadata and controls
90 lines (71 loc) · 4.97 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
This file is part of GPCCA.
Copyright (c) 2018, 2017 Bernhard Reuter
with contributions of Marcus Weber and Susanna Röblitz
If you use this code or parts of it, cite the following reference:
Reuter, B., Weber, M., Fackeldey, K., Röblitz, S., & Garcia, M. E. (2018). Generalized
Markov State Modeling Method for Nonequilibrium Biomolecular Dynamics: Exemplified on
Amyloid β Conformational Dynamics Driven by an Oscillating Electric Field. Journal of
Chemical Theory and Computation, 14(7), 3579–3594. https://doi.org/10.1021/acs.jctc.8b00079
GPCCA is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published
by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU Lesser General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
------------------------------------------------------------------------------------------
For questions or support contact
Bernhard Reuter
bernhard-reuter@gmx.de
or fill a pull request.
------------------------------------------------------------------------------------------
Download the code at: https://github.com/msmdev/gpcca
------------------------------------------------------------------------------------------
How to use GPCCA:
Firstly, you will need an installation of MATLAB. Since GPCCA was developed and tested
with MATLAB version 2015b, it is advised to use this version but later versions should
work as well.
To execute GPCCA, start MATLAB and run the script main.m. This script contains the
relevant input parameters, which are preset for the example of butane (the count matrix of
butane is stored in the file count.txt). The script executes the main program gpcca.m,
which will ask you to supply the name of a input file containing the count matrix
characterizing your system/process. This count matrix needs to be generated in advance
from your data (e.g. molecular dynamics time series data) as described in detail in the
above reference. There you can also find details about using GPCCA on protein
conformational dynamics data gained from molecular dynamics simulations. You might also
take a look at the flowchart supplied in the file Algorithm.png.
The program is pleasant to use interactively and automatically stores a whole range of
interesting quantities and plots. The following procedure is advised:
1. If interactively selected: Use the minChi criterion for an interval of cluster numbers
(e.g., 2 to 10) specified in the main.m script to obtain a graph of the minChi's.
2. Based on the minChi graph the interval of cluster numbers (e.g., 2 to 5), within which
optimization is performed for each cluster number using Gauss-Newton, Nelder-Mead, or
Levenberg-Marquardt, is selected and inputted. Here it is advisable to use Gauss-Newton,
since even if it does not converge to the minimum it will come very close to it and is
extremely fast. A graph of the crispness is outputted, which serves to identify the
optimal cluster number for a final optimization, if one is desired.
The optimization loop can be executed in parallel (via parfor, by setting the option
wk.parallel = 1) or serial (wk.parallel = 0). This saves hours and days if you have
multiple cores and MATLABs Parallel Programming Toolbox.
3. If specified in the main.m script: Interactively, an predetermined optimal cluster
number (or an interval of predetermined cluster numbers) is entered and the final
optimization takes place - Nelder-Mead with a ( possibly very) large number of iterations
has proven to be a good choice here.
The optimization loop can be executed in parallel (via parfor, by setting the option
iopt.parallel = 1) or serial (iopt.parallel = 0).
There are two test suites that are very easy to use:
The unitTests Testsuite checks the functionality of each individual function, the
regressionTest Testsuite checks the correct functioning of the entire program (i.e. of
gpcca.m, which calls all other functions) based on known results for the butane count
matrix. Extensive test results and reports are stored in folders created by the tests -
these do not have to be checked manually and are only for archiving (can be deleted).
The test suites are called by run(unitTests) and run(regressionTest) respectively. Then
the desired precision is chosen in which the test is performed: 'mp' (multiprecision) is
only selectable if you have installed the Multiprecision Toolbox of Advanpix, 'double' is
the Matlab standard.
Also for the overall program the precision can and must be determined. This happens in the
main.m script and is by default set to 'double' there.
WARNING: Don't set a precision bellow 'double'!!! NEVER use single precision!!!