resolve Heroku startup issues #1
Description
TL;DR: On Heroku's daily restart, this app sometimes crashes because it doesn't start up quickly enough.
Quick background on how OpenNews etherpad works
Overview: We run an etherpad-lite instance on Heroku, which is publicly available at https://etherpad.opennews.org. The instance uses a Postgres db and a Standard 1X dyno (512MB RAM, 1x CPU share). In testing via etherpad-load-test, these resources are more than enough to handle our normal traffic.
Deployment details: We use SSL for our etherpad-lite instance, which means that to run on Heroku, we need our own forked version of etherpad-lite that includes heroku-ssl-redirect. That means we also use a forked version of etherpad-lite-heroku, which pulls in our version of the etherpad software as a submodule.
The etherpad-lite-heroku wrapper is what actually gets deployed to Heroku, where it runs a launch script that does some config and starts the etherpad service.
The reboot problem
Heroku restarts your app dynos once a day for maintenance, which is normally a fine thing. However, etherpad-lite occasionally takes too long to start back up, and the app crashes.
During startup, etherpad-lite checks in with npm on a whole list of dependencies. A number of them appear to be outdated, which I think might be the root of the problem here. The software seems to work fine once it's actually running, but sometimes the reboot itself takes long enough that Heroku throws a boot timeout and the app crashes again.
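To test the theory that the dependency checks are what eats the time, it might help to log how long each startup phase takes and compare the total against Heroku's boot window (by default a web process has to bind to `$PORT` within 60 seconds, otherwise Heroku logs an R10 boot timeout). Below is a minimal sketch of such a timer. The helper name `createStartupTimer` and the phase names are hypothetical and are not part of etherpad-lite; the calls would have to be placed by hand around the real startup steps.

```javascript
// Hypothetical helper (not part of etherpad-lite): records how long each
// startup phase takes so slow phases show up in the Heroku logs.

// Heroku's default boot timeout for web processes (R10) is 60 seconds.
const BOOT_TIMEOUT_MS = 60000;

// `now` is injectable so the timer can be exercised with a fake clock.
function createStartupTimer(now = () => Date.now()) {
  const bootStart = now();
  let last = bootStart;
  const phases = [];

  // Call mark() right after a phase finishes; it records the time elapsed
  // since the previous mark (or since the timer was created).
  function mark(name) {
    const t = now();
    phases.push({ name, ms: t - last });
    last = t;
  }

  // Summarize: total time up to the last mark, the slowest phase, and
  // whether the total exceeded the boot window.
  function report() {
    const totalMs = last - bootStart;
    let slowest = null;
    for (const phase of phases) {
      if (slowest === null || phase.ms > slowest.ms) slowest = phase;
    }
    return {
      totalMs,
      slowest,
      overBudget: totalMs > BOOT_TIMEOUT_MS,
      phases: phases.slice(),
    };
  }

  return { mark, report };
}

// Usage sketch (phase names are placeholders):
//   const timer = createStartupTimer();
//   ... dependency checks run ...
//   timer.mark('dependency checks');
//   ... server binds to process.env.PORT ...
//   timer.mark('bind port');
//   console.log(JSON.stringify(timer.report()));
```

If the dependency phase dominates the report on the mornings when the app crashes, that would support the outdated-dependencies theory; if not, the slow step is somewhere else.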
The short-term fix is manually restarting our Heroku instance. Usually this requires only 1 or 2 restarts, but occasionally it takes 10 or so. The most problematic times are mid-morning and midday, when there's more overall web traffic, which is what makes me think those dependency checks are the problem. The long-term fix, of course, involves updating the etherpad-lite software.
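Until the long-term fix lands, the manual restart loop could in principle be scripted. The following is only a sketch under several assumptions: Node 18+ (for the global `fetch`), a Heroku API token supplied by the caller, and the Platform API's "restart all dynos" endpoint (`DELETE /apps/{app}/dynos`). The function names, the retry counts, and the wait times are all made up for illustration and would need tuning.

```javascript
// Sketch only: restart all dynos through the Heroku Platform API, then poll
// the site until it answers, repeating up to maxRestarts times.
// All names and default values here are hypothetical.

// Build the "restart all dynos" request for the given app.
function buildRestartRequest(appName, token) {
  return {
    url: `https://api.heroku.com/apps/${encodeURIComponent(appName)}/dynos`,
    options: {
      method: 'DELETE', // Platform API v3: DELETE on /dynos restarts all dynos
      headers: {
        Accept: 'application/vnd.heroku+json; version=3',
        Authorization: `Bearer ${token}`,
      },
    },
  };
}

// Resolves with the number of restarts it took; rejects if the site is
// still down after maxRestarts. fetchImpl and sleep are injectable so the
// loop can be exercised without touching the network.
async function restartUntilHealthy({
  appName,
  token,
  healthUrl,
  maxRestarts = 10,
  checksPerRestart = 6,
  waitMs = 15000,
  fetchImpl = globalThis.fetch,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
}) {
  for (let attempt = 1; attempt <= maxRestarts; attempt++) {
    const { url, options } = buildRestartRequest(appName, token);
    const res = await fetchImpl(url, options);
    if (!res.ok) {
      throw new Error(`restart request failed: HTTP ${res.status}`);
    }

    // Give the dyno time to boot, checking the public URL between waits.
    for (let check = 0; check < checksPerRestart; check++) {
      await sleep(waitMs);
      try {
        const health = await fetchImpl(healthUrl);
        if (health.ok) return attempt;
      } catch (err) {
        // A network error is treated the same as "still down".
      }
    }
  }
  throw new Error(`still unhealthy after ${maxRestarts} restarts`);
}
```

A call might look like `restartUntilHealthy({ appName: process.env.HEROKU_APP, token: process.env.HEROKU_API_TOKEN, healthUrl: 'https://etherpad.opennews.org' })`, where both environment variable names are placeholders. This only papers over the problem, so it is a stopgap rather than a substitute for updating etherpad-lite.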
Some logs
I've been able to fork and modify these etherpad apps, get them running on Heroku, and do a certain amount of troubleshooting, but I'm about at my limit of feeling comfortable ripping into node software. Here are a few logs that hopefully tell some tales:
