This is a modified fork of Luminol for use with Skyline.
Luminol is a lightweight Python library for time series data analysis. The two major functionalities it supports are anomaly detection and correlation. It can be used to investigate possible causes of an anomaly. You collect time series data and Luminol can:
- Given a time series, detect whether the data contains any anomalies and give you back the time window in which each anomaly happened, the timestamp at which the anomaly reaches its peak severity, and a score indicating how severe the anomaly is compared to others in the time series.
- Given two time series, help find their correlation coefficient. Since the correlation mechanism allows some room for a time shift, you are able to correlate two peaks that are slightly apart in time.
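To illustrate the idea of correlating two peaks that are slightly apart in time, here is a minimal standard-library sketch. It is not Luminol's implementation: the helper names `pearson` and `best_shifted_correlation`, the `max_shift` parameter, and the sample data are all assumptions made for this example. It simply tries a few small shifts and keeps the one with the highest Pearson coefficient.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)


def best_shifted_correlation(a, b, max_shift=2):
    """Hypothetical helper: try each shift in [-max_shift, max_shift].

    A positive shift means `b` lags `a` by that many samples.
    Returns (best coefficient, shift that produced it).
    """
    best = (0.0, 0)
    for shift in range(-max_shift, max_shift + 1):
        if shift >= 0:
            xs, ys = a[:len(a) - shift], b[shift:]
        else:
            xs, ys = a[-shift:], b[:len(b) + shift]
        r = pearson(xs, ys)
        if r > best[0]:
            best = (r, shift)
    return best


# Made-up data: the second series has the same peak one sample later.
latency = [1, 1, 1, 9, 1, 1, 1, 1]
gc_time = [1, 1, 1, 1, 9, 1, 1, 1]
coefficient, shift = best_shifted_correlation(latency, gc_time)
print(coefficient, shift)
```

Without the shift allowance, the two series above would look uncorrelated because their peaks never line up sample for sample.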
Luminol is configurable in the sense that you can choose which specific algorithm to use for anomaly detection or correlation. In addition, the library does not rely on any predefined threshold on the values of a time series. Instead, it assigns each data point an anomaly score and identifies anomalies using the scores.
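The score-based approach can be sketched as follows. This is only an illustration of scoring points instead of thresholding raw values, not one of Luminol's algorithms: the functions `anomaly_scores` and `find_anomalies`, the `score_fraction` parameter, and the sample data are hypothetical, and a plain deviation-from-mean score stands in for whatever algorithm you would actually select.

```python
from statistics import mean, pstdev


def anomaly_scores(series):
    """Score each point by its absolute deviation from the mean, in standard deviations."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [abs(x - mu) / sigma for x in series]


def find_anomalies(series, score_fraction=0.5):
    """Hypothetical helper: flag points whose score is at least
    `score_fraction` of the highest score in the series.

    The cutoff is relative to the scores, not to the raw values.
    """
    scores = anomaly_scores(series)
    top = max(scores)
    if top == 0:
        return []
    return [(i, s) for i, s in enumerate(scores) if s >= score_fraction * top]


# Made-up latency samples with a single spike at index 4.
latency_ms = [20, 21, 19, 20, 95, 22, 20, 21]
anomalies = find_anomalies(latency_ms)
print(anomalies)
```

Because the cutoff is derived from the scores themselves, the same code works unchanged on a metric measured in milliseconds, bytes, or percent.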
By using the library, we can establish a logic flow for root cause analysis. For example, suppose there is a spike in network latency:
- Anomaly detection discovers the spike in the network latency time series.
- Get the anomaly period of the spike, and correlate it with other system metrics (GC, IO, CPU, etc.) in the same time range.
- Get a ranked list of correlated metrics; the root cause candidates are likely to be at the top.
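The steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's API: `pearson` and `rank_candidates` are hypothetical helpers, the metric names and values are made up, and the anomaly window is hard-coded where a real flow would take it from the anomaly detection step.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    return mean((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)


def rank_candidates(target, candidates, window):
    """Hypothetical helper: correlate each candidate metric with the target
    inside the anomaly window and return (name, coefficient) pairs,
    most correlated first."""
    start, end = window
    scored = [
        (name, pearson(target[start:end], series[start:end]))
        for name, series in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up data: latency spikes around indices 4-6.
latency = [20, 21, 19, 20, 60, 95, 58, 21, 20, 19]
metrics = {
    "gc_pause": [5, 5, 6, 5, 30, 48, 29, 6, 5, 5],
    "disk_io": [40, 42, 39, 41, 40, 43, 41, 40, 42, 41],
    "cpu": [50, 48, 51, 49, 47, 45, 48, 50, 52, 51],
}
window = (3, 8)  # assumed output of the anomaly detection step

ranking = rank_candidates(latency, metrics, window)
for name, coefficient in ranking:
    print(name, round(coefficient, 3))
```

In this made-up data set, `gc_pause` rises and falls with the latency spike, so it lands at the top of the ranking as the first candidate to investigate.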
Investigating possible ways to automate root cause analysis is one of the main reasons we developed this library, and it will be a fundamental part of future work.