#
# '$RCSfile: ERROR_CORRECTION,v $'
# '$Author: mlee $'
# '$Date: 2009-05-21 17:33:15 $'
# '$Revision: 1.3 $'
#
When things go wrong with Vegbank: What to do when something doesn't work
-------------------------------------------------------------------------
Detecting and evaluating errors
-------------------------------
There are numerous strange errors that could be produced by vegbank. These
errors can be delivered in several ways. One way is for the server to
notify the user directly via the web site. This usually comes in the form
of an error page and hopefully gives the user some clue as to what went
wrong.
Another way is that the server may recover from an error, but
write a message to the log file. The log file is located in the
/usr/vegbank/logs directory. The most current one will be called vegbank.log.
It is helpful to 'tail' the log with the following command:
    tail -f /usr/vegbank/logs/vegbank.log
This command will show you in real time as entries are added to the log file.
A good way to debug is to have the log tailing in one window, then perform
the action that caused the error and watch what happens to the log file.
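When the log is busy, it can help to filter the tail down to lines that look
like problems. The following is a minimal sketch, not part of the VegBank
utilities: the function name recent_errors and the pattern 'error|exception'
are assumptions, so adjust the pattern to whatever the vegbank.log entries
actually contain.

```shell
#!/bin/sh
# Hypothetical helper: print the lines among the last N lines of a log
# that mention an error or exception (case-insensitive).
# Usage: recent_errors LOGFILE [N]
recent_errors() {
    log_file=$1
    line_count=${2:-200}
    # grep exits non-zero when nothing matches; treat that as "no errors".
    tail -n "$line_count" "$log_file" | grep -iE 'error|exception' || true
}

# Example (path taken from this document):
#   recent_errors /usr/vegbank/logs/vegbank.log 500
# To filter the live stream instead, GNU grep can be line-buffered:
#   tail -f /usr/vegbank/logs/vegbank.log | grep --line-buffered -iE 'error|exception'
```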
Another way the system can provide error information is via the tomcat log,
which contains any messages sent to the servlet container itself. This log
is in /usr/local/devtools/jakarta-tomcat/logs and the most recent is called
catalina.out. This log can also be tailed and watched for errors.
Recovering from errors
----------------------
The most serious problems can occur when tomcat runs out of memory. This
usually happens when a file in memory gets too big (such as when you're
downloading a ton of plots in an xml file). Usually tomcat will recover
itself, but if it doesn't and the vegbank webapp becomes unresponsive, tomcat
will need to be restarted (see Restarting Vegbank Subsystems below).
Memory issues can be debugged through the use of a few standard commands. The
command 'free' tells you how much free memory is left on the system. The
output will look something like this:
    [berkley@vegbank vegbank]$ free
                 total       used       free     shared    buffers     cached
    Mem:       3035248    3019660      15588          0      99380     988428
    -/+ buffers/cache:    1931852    1103396
    Swap:      2048248    1447180     601068
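In this example, only about 15 MB (15588 KB) of memory is completely unused.
Linux counts memory used for buffers and cache as "used", so the
'-/+ buffers/cache' line is the better guide: it shows about 1.1 GB
(1103396 KB) that could still be made available to programs. If that figure
is close to 0, usually tomcat will need to be restarted. A way to see if java
is the culprit is to use the 'top' command: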
    [berkley@vegbank vegbank]$ top
     11:59:49  up 671 days, 18:02,  7 users,  load average: 0.00, 0.06, 0.05
    205 processes: 204 sleeping, 1 running, 0 zombie, 0 stopped
    CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
               total    0.0%    0.0%    0.4%   0.0%     0.0%    0.0%   99.5%
               cpu00    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%  100.0%
               cpu01    0.0%    0.0%    0.9%   0.0%     0.0%    0.0%   99.0%
               cpu02    0.0%    0.0%    0.9%   0.0%     0.0%    0.0%   99.0%
               cpu03    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%  100.0%
    Mem:  3035248k av, 3026796k used,    8452k free,       0k shrd,   99432k buff
                       2235720k actv,  425716k in_d,   64072k in_c
    Swap: 2048248k av, 1447176k used,  601072k free                  988820k cached

      PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
    26850 tomcat    25   0 1585M 1.5G 26884 S     0.0 53.1   0:06   0 java
    26852 tomcat    15   0 1585M 1.5G 26884 S     0.0 53.1  67:15   3 java
    26853 tomcat    15   0 1585M 1.5G 26884 S     0.0 53.1   1:10   0 java
Top will give you a running total of what processes are using which resources.
You can sort by CPU usage (type 'P' while top is running) or memory usage
(type 'M' while top is running). In this example, java is the number one memory
user with 53.1% of the total system memory. It is normal for there to be
many java processes running under the tomcat username. If the system
is locked up and you see that any one of the java processes is constantly
using 100% of a CPU for an extended period of time, that process is usually
locked up and tomcat needs to be restarted.
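The checks described above can be scripted roughly as follows. This is a
sketch under assumptions: the helper names are hypothetical, mem_free_kb
expects the older 'free' layout shown in this document (a "Mem:" row whose
fourth column is free memory in KB), and busy_java expects lines of
"PID USER %CPU COMMAND" such as those printed by 'ps -eo pid,user,pcpu,comm'.

```shell
#!/bin/sh
# Hypothetical helper: read `free` output on stdin and print the "free"
# column (4th field) of the Mem: row, in KB.
mem_free_kb() {
    awk '$1 == "Mem:" { print $4 }'
}

# Hypothetical helper: read "PID USER %CPU COMMAND" lines on stdin and print
# the PIDs of java processes at or above a CPU threshold (default 95).
busy_java() {
    threshold=${1:-95}
    awk -v limit="$threshold" '$4 == "java" && $3 + 0 >= limit { print $1 }'
}

# Example usage on a live system:
#   free | mem_free_kb
#   ps -eo pid,user,pcpu,comm | busy_java 95
# Note: ps reports CPU use averaged over the life of the process, so top
# remains the better tool for spotting a process that has only recently
# started spinning.
```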
Other not-so-serious errors should be handled by tomcat or vegbank and may
do something innocuous such as terminate a user's download. If an error does
not fix itself, restarting tomcat is usually the best way to deal with it.
If there are any data issues, you may see that PostgreSQL has somehow locked up
or is stuck on a process. Top will indicate this to you if you see the
process 'postmaster' hogging 100% of one or more CPUs for a prolonged period.
This can be fixed by restarting postgres.
Processes can be killed (with caution) by typing kill [PID] where PID is the process ID
listed above. Sometimes Java or Postmaster may get stuck and need to be killed.
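One cautious way to do this is to ask the process to exit first and force it
only if it does not. The sketch below assumes a hypothetical helper name,
stop_pid, and an arbitrary default wait of 10 seconds; neither comes from the
VegBank scripts.

```shell
#!/bin/sh
# Hypothetical helper: send SIGTERM, wait up to TIMEOUT seconds for the
# process to disappear, then fall back to SIGKILL.
# Usage: stop_pid PID [TIMEOUT_SECONDS]
stop_pid() {
    pid=$1
    timeout=${2:-10}

    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    kill -0 "$pid" 2>/dev/null || return 0

    kill -TERM "$pid" 2>/dev/null || true
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        sleep 1
        kill -0 "$pid" 2>/dev/null || return 0
        waited=$((waited + 1))
    done

    # Last resort: SIGKILL cannot be caught, so the process gets no chance
    # to clean up.
    kill -KILL "$pid" 2>/dev/null || true
    return 0
}

# Example: stop_pid 26852 15
```

Trying SIGTERM first gives Java or postmaster a chance to shut down cleanly;
SIGKILL is kept as the fallback for a process that is truly stuck.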
Restarting Vegbank Subsystems
-----------------------------
TOMCAT
1) sudo systemctl restart tomcat9
-- you should restart tomcat if you see OutOfMemory Errors, or if stuff generally isn't working.
-- if you can't see any data on VegBank, this may be a tomcat error, or also postgres.
-- if the system is really slow, there could be a stuck process (might have to kill it, see above)
POSTGRES
1) sudo systemctl restart postgresql@10-main.service
-- do this if you've already restarted tomcat and data still isn't viewable on VegBank
APACHE (HTTPD)
1) sudo systemctl restart apache2
-- do this if the website doesn't display anything at all (rare)
*Note that if you are bringing the system up and none of these subsystems are
running, bring them up in this order:
1) postgres
2) tomcat
3) apache
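The start-up order can be captured in a small wrapper. This is a sketch only:
the function name and the DRY_RUN switch are assumptions, and the unit names
are copied from the restart commands above, so they may differ on another
host.

```shell
#!/bin/sh
# Hypothetical wrapper: start the three services in dependency order
# (database first, then the servlet container, then the web server).
# With DRY_RUN=1 the commands are printed instead of executed.
start_vegbank_stack() {
    for unit in postgresql@10-main.service tomcat9 apache2; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo systemctl start $unit"
        else
            # Stop at the first failure so later services are not started
            # against a missing dependency.
            sudo systemctl start "$unit" || return 1
        fi
    done
}

# Example: DRY_RUN=1 start_vegbank_stack
```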
Utilities
---------
There are several utilities located in the /usr/vegbank/bin directory. The
main ones that deserve comment are:
1) vbdump -- the backup script. This is run from /etc/cron.daily every night.
It creates the backup files in /usr/vegbank/backup and rotates them as needed.
It also calls vacuum analyze on the database, which reclaims space left by
deleted or updated rows and refreshes query statistics so the DB runs faster.
2) runAccGen -- runs the accession code generator
This may be necessary to run after plot data are loaded if the accession
code generation fails after plot load. runDenorms.sh will also be necessary.
Example: /usr/vegbank/bin/runAccGen vegbank
(the name of the database is the required parameter)
3) runKwGen -- creates keywords for the quick search
4) runDenorms.sh -- runs the denormalization for the quick search
This may be necessary if plants aren't showing up for a plot. This is due
to denormalized fields protecting plots by default until the plot is
marked as not having an embargo.
Usage: runDenorms.sh [table to update | all] [type of update|all]
Example: /usr/vegbank/bin/runDenorms.sh observation
Example: /usr/vegbank/bin/runDenorms.sh all all
**this will take a while.**
5) daily_maintenance.sh -- runs every night at 3 AM from a cron job set up on
the vegbank user's crontab. This updates the info on the home page (various
plot counts, etc) and also removes any loose leaf datasets that are
no longer referenced.
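For the case described under runAccGen, where accession code generation
failed after a plot load, the two scripts could be chained as in this sketch.
The wrapper name post_load_fixup and the VEGBANK_BIN override are assumptions
made for illustration; the script names, the database-name argument, and the
"all all" arguments come from the entries above.

```shell
#!/bin/sh
# Hypothetical wrapper: regenerate accession codes, then rerun the
# denormalizations, stopping if the first step fails.
# Usage: post_load_fixup DATABASE_NAME
post_load_fixup() {
    db_name=${1:-}
    if [ -z "$db_name" ]; then
        echo "usage: post_load_fixup DATABASE_NAME" >&2
        return 2
    fi
    # VEGBANK_BIN is an assumed override; the default is the documented path.
    bin_dir=${VEGBANK_BIN:-/usr/vegbank/bin}

    "$bin_dir/runAccGen" "$db_name" || return 1
    # "all all" updates every table with every update type; this is slow.
    "$bin_dir/runDenorms.sh" all all
}

# Example: post_load_fixup vegbank
```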
There are also some utilities that are not in the bin directory, but rather
in the ant build.xml file. These targets include:
1) generateDBSQL -- generate all of the SQL needed for the db in the
vegbank/build/sql dir
2) generateBEANS -- generate all of the beans from the data model and put
them in the build dir
3) dbVerify -- checks that the database matches the currently checked in
data model
4) runxmlcache -- caches all beans in the database as xml. This shouldn't
need to be run unless you're building a db from scratch
5) createdb -- creates an empty vegbank database
6) binbuild -- copies all of the bin files to build/bin and does token
substitution
7) bin -- copies all of the bin files to /usr/vegbank/bin and does token
substitution
Misc examples of things having gone wrong (and hopefully how they were fixed):
------------------------------------------------------------------------------
10-Sep-2006: when you type in http://vegbank.org in a web browser
it states that vegbank.org refused the connection. This was fixed by Colby and was because:
"Apache was not loading cause it could not find the directory that it was trying to write the error log file
to for the vegbank.org virtual host. It was trying to write to /usr/www/vegbank/logs/error.log and the
logs directory did not exist there. I created it and started apache. It all seems to be working now."
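A pre-flight check along these lines would catch the same problem before
Apache is started. It is a sketch: the function name is hypothetical, and the
path in the example is the one quoted above.

```shell
#!/bin/sh
# Hypothetical helper: make sure the directory that will hold a log file
# exists, creating it (and any missing parents) if necessary.
# Usage: ensure_log_dir /path/to/error.log
ensure_log_dir() {
    log_dir=$(dirname "$1")
    [ -d "$log_dir" ] || mkdir -p "$log_dir"
}

# Example (run with sufficient privileges to create the directory):
#   ensure_log_dir /usr/www/vegbank/logs/error.log
```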
29-SEP-2009: we were getting an error on the overnight cron stating:
/etc/cron.daily/vbdump:
NOTICE: number of page slots needed (226496) exceeds max_fsm_pages (35000)
HINT: Consider increasing the configuration parameter "max_fsm_pages" to a value over 226496.
VACUUM
This cron does two things: 1) backup the database and 2) vacuum the database. I tried vacuuming the database manually and got the same error message.
I went to:
http://varlena.com/GeneralBits/Tidbits/perf.html#maxfsmp
and discovered that you can try a "vacuum full", which I have done; it may solve the problem.
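To see how far the setting is from what the database wants, the two numbers
can be pulled out of the NOTICE text. This sketch assumes a hypothetical
helper name and the exact message wording shown above. max_fsm_pages exists
only in PostgreSQL 8.3 and earlier (later releases manage the free space map
automatically), and "vacuum full" takes an exclusive lock on each table while
it runs, so it is best scheduled for a quiet period.

```shell
#!/bin/sh
# Hypothetical helper: read vacuum output on stdin and print
# "<slots needed> <current max_fsm_pages>" for each matching NOTICE.
fsm_shortfall() {
    sed -n 's/.*number of page slots needed (\([0-9][0-9]*\)) exceeds max_fsm_pages (\([0-9][0-9]*\)).*/\1 \2/p'
}

# Example:
#   /etc/cron.daily/vbdump 2>&1 | fsm_shortfall
# For the message above this prints "226496 35000", which suggests a
# postgresql.conf line such as
#   max_fsm_pages = 250000    # illustrative value above 226496
# The setting is read only at server start, so postgres must be restarted.
```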