EO4EU/anomaly-detection-preprocessor

Funded by the EU.

anomaly-detection-preprocessor

This component starts by collecting Prometheus metrics from a historic window to form the training dataset. The training dataset is then uploaded to an S3 bucket, and a message is sent to the next component of the anomaly detection workflow with the information needed to locate the uploaded dataset in S3. The preprocessor then enters an infinite loop in which it collects the current metric values at a fixed interval and sends them to the next component, embedded in the Kafka message itself.
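The two phases described above can be sketched as follows. This is only an illustrative outline, not the component's actual code: the function name `run_preprocessor`, the callables it receives, the message fields (`type`, `location`, `metrics`), and the `max_iterations` parameter are all hypothetical and exist only to make the control flow visible and testable.

```python
import itertools
import time


def run_preprocessor(collect_history, upload_to_s3, send_message, collect_current,
                     interval_seconds=60.0, max_iterations=None, sleep=time.sleep):
    """Hypothetical outline of the preprocessor's two phases."""
    # Phase 1: build the training dataset from a historic window, upload it,
    # and tell the next component where to find it.
    training_rows = collect_history()
    location = upload_to_s3(training_rows)
    send_message({"type": "training", "location": location})

    # Phase 2: loop forever (or max_iterations times, for testing), sending
    # the current metric values inside the message itself.
    steps = itertools.count() if max_iterations is None else range(max_iterations)
    for _ in steps:
        send_message({"type": "sample", "metrics": collect_current()})
        sleep(interval_seconds)
```

Passing the collaborators in as callables keeps the sketch independent of any particular Prometheus, S3, or Kafka client library.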

The anomaly detection preprocessor collects all counter and gauge cAdvisor metrics from all pods that constitute a particular microservice. A microservice is defined by the namespace it resides in and a list of the deployments that constitute it. It is assumed that a single microservice is not spread across multiple namespaces. The term deployment is used here to refer to either a Kubernetes Deployment or a StatefulSet. Each deployment consists of one or more pods that are subject to the auto-scaling, load balancing, and self-healing features of Kubernetes. Pods created by the auto-scaling mechanism are called replicas and are copies of each other. Since the number of pod replicas is bound to change in such a dynamic environment, the data gathering reflects that fact: the metrics of the pod replicas are aggregated and divided by the number of replicas present at the time of metric collection. The main assumption is that pod replicas are created or destroyed by an autoscaling mechanism whose purpose is to keep resources shared equally among them.
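The replica averaging described above might look like the following minimal sketch. It assumes, for illustration only, that a pod belongs to a deployment when its name starts with the deployment name followed by a hyphen; the function name `average_per_deployment` and its input shape (a mapping from pod name to metric value) are hypothetical and not taken from the component's code.

```python
def average_per_deployment(pod_values, deployments):
    """Sum each deployment's per-pod values and divide by its replica count.

    pod_values:  dict mapping pod name -> metric value at collection time
    deployments: list of deployment names that make up the microservice
    """
    totals = {name: 0.0 for name in deployments}
    counts = {name: 0 for name in deployments}
    for pod, value in pod_values.items():
        # Assumption: pod names look like "<deployment>-<suffix>".
        owners = [name for name in deployments if pod.startswith(name + "-")]
        if not owners:
            continue  # pod is not part of this microservice
        owner = max(owners, key=len)  # prefer the most specific match
        totals[owner] += value
        counts[owner] += 1
    # Deployments with no running replicas are left out of the result.
    return {name: totals[name] / counts[name]
            for name in deployments if counts[name] > 0}
```

Dividing by the replica count observed at collection time keeps the per-deployment value comparable when the autoscaler adds or removes replicas.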

Gauge metrics are used as they are, while for counter metrics their rate is used instead. The Prometheus rate function calculates the per-second average rate of increase of the time series in the range vector and automatically handles breaks in monotonicity, such as counter resets. The rate period should not be smaller than the scraping interval of Prometheus. The queries sent to Prometheus are as follows:

Counter metrics: sum by (pod) (rate(metric{namespace="...",pod=~"..."}[5m]))
Gauge metrics: sum by (pod) (metric{namespace="...",pod=~"..."})

The namespace field represents the namespace of the microservice, while the pod field represents the deployments that make up the microservice. The =~ operator used in the pod field selects labels that regex-match the provided string.
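As an illustration, the two query templates above could be filled in with a small helper like the one below. The function name `build_query`, its parameters, and the way the pod regex is derived from the deployment names (`(name1|name2)-.*`) are assumptions made for this sketch; the source does not state the exact regex the component uses. Prometheus regex matchers are fully anchored, so the pattern must cover the whole pod name.

```python
def build_query(metric, namespace, deployments, kind, rate_window="5m"):
    """Build the PromQL query for one metric (hypothetical helper).

    kind is "counter" or "gauge"; rate_window applies to counters only.
    """
    # Assumption: pod names start with the deployment name plus a hyphen,
    # and deployment names contain no regex metacharacters.
    pod_regex = "(" + "|".join(deployments) + ")-.*"
    selector = f'{metric}{{namespace="{namespace}",pod=~"{pod_regex}"}}'
    if kind == "counter":
        return f"sum by (pod) (rate({selector}[{rate_window}]))"
    if kind == "gauge":
        return f"sum by (pod) ({selector})"
    raise ValueError(f"unsupported metric kind: {kind!r}")
```

For example, `build_query("container_memory_usage_bytes", "shop", ["frontend"], "gauge")` yields a gauge query restricted to the `shop` namespace and to pods whose names start with `frontend-`; the namespace and deployment names here are placeholders.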
