After merging #51, I did some more general sanity checks and noticed some oddities, which I talked with @pingometer support about. Our calculations are now somewhat similar to Pingometer’s, but not the same—and they never will be:
It turns out Pingometer calculates uptime based on each check of a monitor. That is, every check is factored into Pingometer’s “uptime” calculation, regardless of whether it was part of a failure streak long enough to trigger an event. A monitor’s sensitivity setting determines how many consecutive failed checks result in an event. There is no sensitivity level that results in a single failed check leading to an incident: http://support.pingometer.com/knowledge_base/topics/what-is-the-sensitivity
We’re now calculating uptime based on events, which leaves us with slightly different results. That’s not a bad thing—whether uptime is calculated from the times we actually classify the service as down or from every individual unsuccessful check is pretty subjective. In some contexts or in some philosophies, what we’re now doing is more correct. In others, Pingometer’s approach is.
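To make the difference concrete, here’s a minimal sketch (not our actual implementation, and not Pingometer’s API; the data and sensitivity value are made up) of how the two approaches can diverge over the same series of checks:

```python
# A toy series of check results: True = check passed, False = check failed.
# One isolated failure, then a run of three consecutive failures.
checks = [True, True, False, True, False, False, False, True, True, True]

# Check-based uptime (Pingometer-style): every failed check counts.
check_based_uptime = sum(checks) / len(checks)  # 6 passes / 10 checks = 0.6

# Event-based uptime (our current approach, simplified to check counts):
# only failures belonging to a run long enough to trigger an event count.
sensitivity = 3  # hypothetical: 3 consecutive failures trigger an event
down_checks = 0
run = 0
for ok in checks:
    run = 0 if ok else run + 1
    if run == sensitivity:
        # threshold crossed: the whole run retroactively counts as downtime
        down_checks += sensitivity
    elif run > sensitivity:
        down_checks += 1

event_based_uptime = (len(checks) - down_checks) / len(checks)  # 7/10 = 0.7
```

The isolated failure drags down check-based uptime (0.6) but never becomes an event, so event-based uptime stays higher (0.7). Neither number is wrong; they just answer different questions.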
Either way, Pingometer gives us all the data we need to choose our approach. HOWEVER, because checks are frequent across their platform, Pingometer (quite reasonably) only stores individual check data for a few days. So if we want to change the way we calculate things, we can do it going forward, but can’t get historical data.
This probably means I should also be capturing checks in addition to events, but we should also figure out the appropriate approach to calculations here.
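If we do start capturing checks, something like the following might be the minimal shape to store (field names are assumptions, not Pingometer’s actual API or our schema); keeping raw checks ourselves means we could recalculate uptime either way later, even after Pingometer discards its per-check data:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class CheckRecord:
    """One raw monitor check, captured before Pingometer expires it."""
    monitor_id: str
    checked_at: datetime
    succeeded: bool
    response_time_ms: Optional[float] = None  # may be absent for failed checks


def uptime_from_checks(records: List[CheckRecord]) -> float:
    """Check-based uptime: fraction of all stored checks that succeeded."""
    return sum(r.succeeded for r in records) / len(records)
```

With both checks and events stored, switching calculation methods becomes a reporting decision rather than a data-availability problem.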