submit command results in memory errors and rejects with <EOFError: end of file reached> invalid SIP descriptor errors and defunct processes #777

@lydiam

I submitted the following command on fclnx30:

   sudo -b -u daitss submit --username lydiam --password fclad00d2 --source FTPDL --path /var/daitss/ops/incoming/ftpdl/FTPDL_20151212T090803 --batch FTPDL_20151212T090803

Only 6-8 of about 144 SIPs were submitted successfully; the remainder were rejected with the following error:

   sip path: /var/daitss/ops/incoming/ftpdl/FTPDL_20151212T090803/AA00036241_00001; 
   Invalid SIP descriptor. XML validation errors:
   line:unknown: msg:#<EOFError: end of file reached>

The submit log contained the "standard" log messages, with most packages rejected and a few submitted:

   2015-12-14 12:57:47 INFO 2015-12-14 12:57:47 -0500 -- AA00036194_00001 -- rejected: E14HY8Q8F_RM36BJ
   2015-12-14 12:57:50 INFO 2015-12-14 12:57:50 -0500 -- AA00036192_00001 -- submitted successfully: EIQJ0H9KI_IKIHUJ
   2015-12-14 12:57:50 INFO 2015-12-14 12:57:50 -0500 -- AA00037821_00001 -- rejected: E30TCY4OV_ONW141
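
To get exact counts rather than an estimate, the submit log can be tallied with a short script. This is a minimal sketch that assumes every outcome line has the same shape as the three lines above; the function name and the regular expression are illustrative and are not part of DAITSS.

```python
import re
from collections import Counter

# Matches outcome lines shaped like the excerpt above, e.g.
# "... -0500 -- AA00036194_00001 -- rejected: E14HY8Q8F_RM36BJ"
# Assumption: all outcome lines in the submit log follow this layout.
OUTCOME_RE = re.compile(
    r"-- (?P<package>\S+) -- "
    r"(?P<outcome>rejected|submitted successfully): (?P<identifier>\S+)"
)


def tally_submit_log(lines):
    """Return (counts per outcome, list of rejected package names)."""
    counts = Counter()
    rejected = []
    for line in lines:
        match = OUTCOME_RE.search(line)
        if match is None:
            continue  # not an outcome line
        counts[match.group("outcome")] += 1
        if match.group("outcome") == "rejected":
            rejected.append(match.group("package"))
    return counts, rejected
```

Passing an open file handle for the submit log to `tally_submit_log` returns the totals and the names of the rejected packages, which is the list that would need to be resubmitted.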

At the same time I got memory errors in STDOUT, along with logs named hs_err_pid*.log with the following contents:
[lydiam@fclnx30 ftpdl]$ more hs_err_pid15104.log
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 32744 bytes for ChunkPool::allocate
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (allocation.cpp:214), pid=15104, tid=47784611424576
#
# JRE version: OpenJDK Runtime Environment (7.0_85-b01) (build 1.7.0_85-mockbuild_2015_07_13_18_00-b00)
# Java VM: OpenJDK 64-Bit Server VM (24.85-b03 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.6.1
# Distribution: Built on Red Hat Enterprise Linux Server release 5.11 (Tikanga) (Mon Jul 13 18:00:16 EDT 2015)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

--------------- T H R E A D ---------------

Current thread (0x000000001a667000): JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=15130, stack(0x00002b75b8ba2000,0x00002b75b8ca3000)]

Stack: [0x00002b75b8ba2000,0x00002b75b8ca3000], sp=0x00002b75b8c9d440, free space=1005k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x9e3ece]
V [libjvm.so+0x9e45fb]
V [libjvm.so+0x4dcf85]
V [libjvm.so+0x2fdea0]
....

These logs can be found in: /var/daitss/ops/incoming/ftpdl
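
If there are several hs_err_pid*.log files, it may help to pull the key facts out of each one. The sketch below assumes each file contains lines shaped like the excerpt above; the function name and patterns are illustrative. In the excerpt, the failed request was a 32744-byte native allocation on a JIT compiler thread, which is consistent with the first "possible reason" the log itself lists (the system running out of RAM or swap), though the excerpt alone does not prove it.

```python
import re

# Patterns taken from the hs_err excerpt above; other JVM versions may
# word these lines differently (assumption).
ALLOC_RE = re.compile(r"failed to allocate (\d+) bytes for (\S+)")
PID_RE = re.compile(r"pid=(\d+), tid=(\d+)")
THREAD_RE = re.compile(r'JavaThread "([^"]+)"')


def summarize_hs_err(text):
    """Extract allocation size, allocator, pid and thread name from a log."""
    alloc = ALLOC_RE.search(text)
    ids = PID_RE.search(text)
    thread = THREAD_RE.search(text)
    return {
        "bytes": int(alloc.group(1)) if alloc else None,
        "allocator": alloc.group(2) if alloc else None,
        "pid": int(ids.group(1)) if ids else None,
        "thread": thread.group(1) if thread else None,
    }
```

Calling `summarize_hs_err(open(path).read())` for each log gives one small dictionary per crash, which makes it easy to see whether every failure looks the same.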

There are also lots of defunct processes:

daitss 27706 27690 0 12:53 pts/0 00:00:00 [ruby]
daitss 27727 27690 0 12:53 pts/0 00:00:00 [ruby]
daitss 27745 27690 0 12:53 pts/0 00:00:00 [ruby]
daitss 27766 27690 0 12:53 pts/0 00:00:00 [ruby]
daitss 27787 27690 0 12:53 pts/0 00:00:00 [ruby]
daitss 27809 27690 0 12:53 pts/0 00:00:00 [ruby]
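
All six entries above share the parent PID 27690. A defunct process has already exited and only stays in the process table until its parent collects its exit status (or the parent itself exits), so the parent is usually the process worth inspecting. The following is a minimal sketch that groups defunct processes by parent; it assumes input in the format of `ps -eo pid,ppid,stat,comm`, where a state starting with `Z` marks a defunct process, and the function name is illustrative.

```python
from collections import defaultdict


def zombies_by_parent(ps_output):
    """Group defunct PIDs by parent PID.

    Expects the text output of `ps -eo pid,ppid,stat,comm`.
    """
    groups = defaultdict(list)
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # header line or malformed row
        pid, ppid, stat, _command = fields
        if stat.startswith("Z"):  # Z = defunct ("zombie")
            groups[int(ppid)].append(int(pid))
    return dict(groups)
```

One way to feed it live data is `zombies_by_parent(subprocess.check_output(["ps", "-eo", "pid,ppid,stat,comm"], text=True))`, after importing `subprocess`.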

And other daitss processes from Dec. 13:

daitss 23705 1 0 Dec13 ? 00:03:44 Rack: /opt/web-services/sites/core/current
daitss 23713 1 0 Dec13 ? 00:00:00 Rack: /opt/web-services/sites/core/current
daitss 23719 1 0 Dec13 ? 00:00:00 Rack: /opt/web-services/sites/core/current
daitss 23725 1 0 Dec13 ? 00:00:21 Rack: /opt/web-services/sites/core/current
daitss 23736 1 0 Dec13 ? 00:00:49 Rack: /opt/web-services/sites/actionplan/current
daitss 23743 1 0 Dec13 ? 00:00:00 Rack: /opt/web-services/sites/actionplan/current
daitss 23749 1 0 Dec13 ? 00:00:00 Rack: /opt/web-services/sites/actionplan/current
daitss 23755 1 0 Dec13 ? 00:00:17 Rack: /opt/web-services/sites/actionplan/current
daitss 23763 1 0 Dec13 ? 00:00:02 Rack: /opt/web-services/sites/storage-master/current
daitss 23773 1 0 Dec13 ? 00:00:02 Rack: /opt/web-services/sites/storage-master/current
daitss 23787 1 0 Dec13 ? 00:00:00 Rack: /opt/web-services/sites/transform/current

I will kill the defunct processes and attempt to restart DAITSS.

Gerald suggested attempting submission of one package at a time.
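
One way to try that suggestion is to generate one submit command per SIP directory and run them one after another. This is only a sketch: it reuses the flags from the command at the top of this report, and it assumes that `--path` can point at a single SIP directory instead of the whole batch directory, which has not been confirmed here. The function name and parameters are hypothetical, and nothing is executed.

```python
import os


def per_package_commands(batch_dir, username, password, source):
    """Build one submit command per SIP directory without running anything.

    Assumption: `submit --path` accepts a single SIP directory.
    """
    batch = os.path.basename(os.path.normpath(batch_dir))
    commands = []
    for name in sorted(os.listdir(batch_dir)):
        sip_path = os.path.join(batch_dir, name)
        if not os.path.isdir(sip_path):
            continue  # skip stray files such as hs_err_pid*.log
        commands.append([
            "submit",
            "--username", username,
            "--password", password,
            "--source", source,
            "--path", sip_path,
            "--batch", batch,
        ])
    return commands
```

Each returned list could then be passed to `subprocess.run` in turn, waiting for one package to finish before starting the next, so that memory use can be checked between submissions.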
