anomaly-detection-preprocessor

Funded by the EU.

This component first collects Prometheus metrics over a historic window to form the training dataset. It uploads the training dataset to an S3 bucket and sends a message to the next component of the anomaly detection workflow with the information needed to locate the uploaded dataset in S3. The preprocessor then enters an infinite loop in which it collects the current metric values at a fixed interval and sends them to the next component, embedded in the Kafka message itself.
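The two phases described above can be sketched as follows. This is a minimal illustration, not the component's actual code: the function name `run_preprocessor`, the injected callables (`collect_window`, `collect_current`, `upload`, `send`), the `max_iterations` parameter, and the message fields `type`, `location`, and `metrics` are all hypothetical. The Prometheus, S3, and Kafka clients are passed in as plain callables so that the sketch does not assume any particular client library.

```python
import json
import time


def run_preprocessor(collect_window, collect_current, upload, send,
                     interval_s, max_iterations=None, sleep=time.sleep):
    """Sketch of the preprocessor workflow (all names are hypothetical).

    collect_window():  returns the training dataset from the historic window.
    collect_current(): returns the current metric values.
    upload(dataset):   stores the dataset (e.g. in S3) and returns its location.
    send(message):     publishes a string message (e.g. to a Kafka topic).
    """
    # Phase 1: build the training dataset, upload it, and announce its location.
    dataset = collect_window()
    location = upload(dataset)
    send(json.dumps({"type": "training", "location": location}))

    # Phase 2: stream current metrics at a fixed interval.
    # max_iterations=None means loop forever; a finite value is only for testing.
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        # The metric values travel inside the message itself.
        send(json.dumps({"type": "sample", "metrics": collect_current()}))
        iterations += 1
        sleep(interval_s)
```

In a real deployment `upload` and `send` would wrap the S3 and Kafka clients in use. Injecting them keeps the control flow testable without external services.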

The anomaly detection preprocessor collects all counter and gauge cAdvisor metrics from all pods that constitute a particular microservice. A microservice is defined by the namespace it resides in and the list of deployments that constitute it. A single microservice is assumed not to span multiple namespaces. The term deployment here refers to either a Kubernetes Deployment or a StatefulSet. Each deployment consists of one or more pods that are subject to the auto-scaling, load-balancing, and self-healing features of Kubernetes. Pods created by the auto-scaling mechanism are called replicas and are copies of each other. Because the number of pod replicas changes over time in such a dynamic environment, data gathering accounts for it: the metrics of the pod replicas are summed and divided by the number of replicas present at the time of collection. The underlying assumption is that replicas are created or destroyed by an autoscaling mechanism whose purpose is to keep resources shared equally among them.
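The replica averaging described above can be illustrated with a short sketch. It assumes that a pod belongs to the deployment whose name, followed by a hyphen, prefixes the pod name. That matching rule, the function name `average_per_deployment`, and the sample pod names are assumptions made for illustration and are not taken from the component's code.

```python
from collections import defaultdict


def average_per_deployment(pod_values, deployments):
    """Average one metric over the replicas of each deployment.

    pod_values:  mapping of pod name -> metric value at collection time.
    deployments: names of the deployments that make up the microservice.
    """
    # Try longer names first so that "cart-db-0" is assigned to "cart-db"
    # rather than to "cart" (assumed prefix-based matching).
    ordered = sorted(deployments, key=len, reverse=True)

    totals = defaultdict(float)
    counts = defaultdict(int)
    for pod, value in pod_values.items():
        for deployment in ordered:
            if pod.startswith(deployment + "-"):
                totals[deployment] += value
                counts[deployment] += 1
                break

    # Sum over replicas divided by the number of replicas seen right now.
    return {d: totals[d] / counts[d] for d in counts}
```

Because the divisor is the replica count observed at collection time, the resulting value stays comparable when the autoscaler adds or removes replicas, provided the load is shared evenly among them.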

Gauge metrics are used as they are, whereas counter metrics are replaced by their rate. The Prometheus rate function calculates the per-second average rate of increase of the time series in the range vector and automatically handles breaks in monotonicity, such as counter resets. The rate window should not be smaller than the Prometheus scrape interval. The queries sent to Prometheus are as follows:

Counter metrics: sum by (pod) (rate(metric{namespace="...",pod=~"..."}[5m]))
Gauge metrics: sum by (pod) (metric{namespace="...",pod=~"..."})

The namespace field holds the namespace of the microservice, while the pod field matches the pods of the deployments that make up the microservice. The =~ operator used in the pod field selects labels that regex-match the provided string.
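As an illustration of how such query strings could be assembled, the sketch below builds both forms from a metric name, a namespace, and a list of deployments. The source elides the exact pod regex, so the pattern `<deployment>-.*` used here is an assumption, as are the helper names `pod_regex` and `build_query`. Deployment names are assumed to contain only lowercase alphanumerics and hyphens, so no regex escaping is applied.

```python
def pod_regex(deployments):
    """Regex matching the pods of the given deployments (assumed pattern)."""
    # Prometheus regex matchers are fully anchored, so "cart-.*" only matches
    # pod names that start with "cart-".
    return "|".join(f"{name}-.*" for name in deployments)


def build_query(metric, namespace, deployments, kind, rate_window="5m"):
    """Build the PromQL query for a counter or gauge metric."""
    selector = (
        f'{metric}{{namespace="{namespace}",'
        f'pod=~"{pod_regex(deployments)}"}}'
    )
    if kind == "counter":
        # Counters are queried through their per-second rate.
        return f"sum by (pod) (rate({selector}[{rate_window}]))"
    if kind == "gauge":
        # Gauges are used as they are.
        return f"sum by (pod) ({selector})"
    raise ValueError(f"unsupported metric kind: {kind!r}")
```

For example, `build_query("container_memory_working_set_bytes", "shop", ["cart"], "gauge")` yields `sum by (pod) (container_memory_working_set_bytes{namespace="shop",pod=~"cart-.*"})`. The namespace `shop` and the deployment `cart` are made-up values.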
