Description
We've been able to reliably reproduce a crash on CentOS 7.9.2009 amd64 running kernel 3.10.0-1160.42.2.el7.x86_64. The crash appears to be triggered by resource starvation.
The primary test system was a "DD-Premium-Intel" 4-core Xeon system on DigitalOcean, which cpuinfo reports at 2494.102 MHz.
The test system had 50 concurrency ports licensed.
We tested with a dialplan context that reads as follows:
[stress-test]
exten => 1,1,Answer()
exten => 1,n,Set(SWIFT_VOICE=Allison-8kHz)
exten => 1,n,Swift(This is Swift talking in the Allison Smith voice)
exten => 1,n,Hangup()
A stress test callfile was created, thusly:
Channel: Local/1@stress-test
MaxRetries: 2
RetryTime: 60
WaitTime: 30
Context: stress-test
Extension: 1
The following Bash snippet was used to spawn 250 simultaneous test calls:
while true; do for i in {1..250}; do echo $i && cp stress-test.call "/var/spool/asterisk/outgoing/$i.call"; done; done
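For anyone who wants a bounded reproduction, here is a minimal sketch of a single pass over the same 250 call files. It is not the snippet we ran: the loop above repeats forever, while this runs once. It also stages each file next to the spool directory and moves it into place, since moving a finished call file is generally recommended over writing it in the spool directory directly. The function name spawn_calls, its arguments, and the staging directory name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: spawn_calls and the ".staging" directory are hypothetical names.
# Usage (assumed paths): spawn_calls stress-test.call /var/spool/asterisk/outgoing 250

spawn_calls() {
    local template="$1" spool="$2" count="${3:-250}"
    local staging i

    # Stage beside the spool directory so that mv is a same-filesystem rename.
    staging="$(mktemp -d "${spool%/}.staging.XXXXXX")" || return 1

    for ((i = 1; i <= count; i++)); do
        # Write the complete file first, then move it into the spool directory.
        cp "$template" "$staging/$i.call" || return 1
        mv "$staging/$i.call" "$spool/$i.call" || return 1
    done

    # The staging directory is empty at this point.
    rmdir "$staging"
}
```

Whether the crash still reproduces with a single bounded pass is untested; the endless loop above is what we actually used.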
We ran core set verbose 7 and core set debug 7 in the Asterisk CLI to get granular output. After starting the snippet above, Asterisk began processing test calls for approximately 10-15 seconds. It then printed many instances of the following message, and shortly afterwards the Asterisk CLI disconnected and the Asterisk daemon terminated abnormally:
[Oct 15 13:48:15] DEBUG[19793][C-00000002] app_swift.c: Whoops, writer starved for audio
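To quantify how many of these messages precede the crash, a small helper along these lines could count them in a log file. This is a sketch under assumptions: count_starved is a hypothetical name, and it only helps if logger.conf is configured to write debug output to a file; the path in the usage comment is an example, not something we verified on the test system.

```shell
#!/usr/bin/env bash
# Sketch only: count_starved is a hypothetical helper.
# Usage (assumed path): count_starved /var/log/asterisk/full

count_starved() {
    local logfile="$1"
    # -F matches the message as a fixed string; -c prints the number of matching lines.
    # grep exits non-zero when there are no matches, so "|| true" keeps the exit status at 0.
    grep -cF 'app_swift.c: Whoops, writer starved for audio' "$logfile" || true
}
```

Comparing the count against the number of queued calls may help show whether the crash follows a particular level of starvation.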
The bottleneck was CPU time: all 4 cores were fully saturated when Asterisk crashed. The Asterisk daemon was not OOM-killed, and physical memory usage for the whole system stayed below 300 MiB of the 8 GiB available up to the time of the crash.
Resource starvation is sometimes unavoidable. However, it should not cause Asterisk to terminate abnormally, which could make a production PBX system completely unavailable.
Please let me know if any additional information or guidance on this issue is needed.