30 changes: 30 additions & 0 deletions docs/docs/dev-manuals/other-repositories/chaostoolkit-executor.mdx
@@ -0,0 +1,30 @@
# Chaos Toolkit Executor Service

The Chaos Toolkit Executor is a wrapper container that contains a web server for communication with the Experiment Executor.
The web server is written in Python, the same technology that Chaos Toolkit itself is built on.
In addition, the container has the Chaos Toolkit CLI and all common Chaos Toolkit extensions installed, including our custom extension for Docker.
This allows for executing a wide range of chaos experiments using the Chaos Toolkit.

## Technology Stack

- **Language**: Python
- **Framework**: Flask
- **Chaos Engineering Tool**: [Chaos Toolkit](https://chaostoolkit.org/)

## API

- `POST /start-experiment` - Start the execution of a Chaos Toolkit chaos experiment
- `POST /stop-experiment` - Stop the execution

## Repository Structure

<div className="repository-structure" >

- `/misarch_chaostoolkit`: Python package containing the web server and the Chaos Toolkit extension that injects failures into local Docker containers

</div>

## Functionality Overview

The Chaos Toolkit Executor exposes a REST API to start and stop chaos experiments.
When an experiment is started, the Chaos Toolkit Executor uses the Chaos Toolkit CLI to execute the provided experiment configuration.
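
The hand-off from the REST API to the CLI described above can be sketched as follows. This is a minimal illustration, not the service's actual code: the helper names (`write_experiment`, `build_command`, `start_experiment`, `stop_experiment`) and the temporary working directory are assumptions. Only `chaos run <file>` is the documented Chaos Toolkit CLI command for executing an experiment.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def write_experiment(config: dict, directory: Path) -> Path:
    """Persist the received experiment configuration as a JSON file."""
    path = directory / "experiment.json"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


def build_command(experiment_path: Path) -> list[str]:
    """Build the CLI call; `chaos run <file>` executes an experiment file."""
    return ["chaos", "run", str(experiment_path)]


def start_experiment(config: dict) -> subprocess.Popen:
    """Start the experiment in a child process so it can be stopped later.

    Hypothetical helper: requires the `chaos` CLI to be on the PATH.
    """
    workdir = Path(tempfile.mkdtemp(prefix="chaos-"))
    command = build_command(write_experiment(config, workdir))
    return subprocess.Popen(command, cwd=workdir)


def stop_experiment(process: subprocess.Popen) -> None:
    """Terminate the experiment process if it is still running."""
    if process.poll() is None:
        process.terminate()
        process.wait(timeout=10)
```

Running the CLI in a child process, rather than blocking the request handler, is one way to make a later call to `/stop-experiment` possible.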
@@ -0,0 +1,80 @@
# Experiment Executor Frontend Service

The Experiment Executor Frontend is the web frontend used to create, modify, and execute experiments.
It holds minimal state: changes are kept only in the local browser state until they are saved or the website is closed.
Experiment management itself is performed by the Experiment Executor.
Therefore, the frontend dynamically loads and stores the different parts of the experiment configurations via the API of the Experiment Executor.

## Technology Stack

- **Language**: JavaScript
- **Framework**: Vue.js
- **Package Manager**: Yarn
- **Builder**: Vite

## Repository Structure

The repository is structured as follows:

<div className="repository-structure">

- `/src/`: The source code
  - `components/`: Reusable [Single-File Components](https://vuejs.org/glossary/#single-file-component), which effectively form an internal UI component library.
  - `model/`: Data models used in the frontend.
  - `types/`: TypeScript types.
  - `util/`: Utility source code.

</div>

## Important Components

The most important components are:

- The Graph / Experiment Overview
- The MiSArch Experiment Config Editor
- The Chaos Toolkit Editor
- The Gatling Work Editor
- The Goal Editor
- Overlay

### Graph / Experiment Overview

- The graph component is the main visualization component of the frontend.
- It uses [chart.js](https://www.chartjs.org/) to visualize the different experiment components.

### MiSArch Experiment Config Editor and Chaos Toolkit Editor

- The editors are based on [Monaco Editor](https://microsoft.github.io/monaco-editor/).
- Both can be switched between a code view and a form view.
- The form view is dynamically generated based on the JSON schema of the respective configuration.
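
The idea of deriving a form from a JSON schema can be illustrated with a small sketch. The frontend itself is written with Vue; the Python below only shows the principle, and the `FormField` type, the `WIDGETS` mapping, and the widget names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FormField:
    name: str
    widget: str
    required: bool


# Hypothetical mapping from JSON Schema types to form widgets.
WIDGETS = {
    "string": "text",
    "integer": "number",
    "number": "number",
    "boolean": "checkbox",
}


def fields_from_schema(schema: dict) -> list[FormField]:
    """Derive one form field per top-level property of an object schema."""
    required = set(schema.get("required", []))
    fields = []
    for name, prop in schema.get("properties", {}).items():
        # An enum becomes a drop-down; unknown types fall back to a text input.
        if "enum" in prop:
            widget = "select"
        else:
            widget = WIDGETS.get(prop.get("type"), "text")
        fields.append(FormField(name, widget, name in required))
    return fields
```

Because the fields are computed from the schema, a change to the schema changes the form without any change to the form code.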

### Gatling Work Editor

- The Gatling Work Editor is a Monaco Editor that allows the user to create and modify Gatling Work scenarios.
- It lets users write Kotlin code for the Gatling scenarios.

### Goal Editor

- The Goal Editor is a simple visual area that allows the user to define the goals of the experiment.

### Overlay

- The overlay component is used to display the

## Known Issues / Open Bugs

There are several known issues and open bugs in the experiment executor frontend:

- Help texts are often displayed as black text on a black background on operating systems other than macOS.
- For large experiments, the graph can become slow and unresponsive.
- Technical debt and code smells.

## Technical Debt

:::info
This section is written from the personal perspective of the experiment tool developer during the time of the Master Thesis.
:::
Sadly, there is already some technical debt. Why? Because at the beginning of the thesis I was still a complete beginner when it came to frontend
development with Vue, TypeScript, and JavaScript in general.
Some things I only understood and learned to use properly over time, issue by issue.
As we did not have the time to refactor everything right away, we had to prioritize implementing additional features to ensure that the frontend would be reasonably feature-complete at the end of the project.
132 changes: 132 additions & 0 deletions docs/docs/dev-manuals/other-repositories/experiment-executor.mdx
@@ -0,0 +1,132 @@
# Experiment Executor Service

The Experiment Executor is the core component of the MiSArch Experiment Tool.
It is responsible for (1) the creation and storage of experiments, (2) the execution of stored experiments by calling the executor components at
the correctly scheduled time, (3) the collection and transformation of the Gatling metrics, and (4) the creation of the final Grafana dashboard and the report.

## API

The Experiment Executor exposes the following REST API endpoints to manage and execute experiments.

#### Experiment Management

- `POST /experiment/generate` - Generate a new experiment with a new UUID
- `GET /experiment/list` - List all experiments with their versions
- `GET /experiment/{testUUID}/versions` - List all versions of a specific experiment
- `POST /experiment/{testUUID}/{testVersion}/newVersion` - Create a new version of an existing experiment
- `DELETE /experiment/{testUUID}` - Delete an experiment with all its versions
- `DELETE /experiment/{testUUID}/{testVersion}` - Delete a specific version of an experiment

#### Configuration

- `GET /experiment/{testUUID}/{testVersion}/chaosToolkitConfig` - Get the Chaos Toolkit configuration of a specific experiment version
- `PUT /experiment/{testUUID}/{testVersion}/chaosToolkitConfig` - Update the Chaos Toolkit configuration of a specific experiment version
- `GET /experiment/{testUUID}/{testVersion}/misarchExperimentConfig` - Get the MiSArch Experiment Config of a specific experiment version
- `PUT /experiment/{testUUID}/{testVersion}/misarchExperimentConfig` - Update the MiSArch Experiment Config of a specific experiment version
- `GET /experiment/{testUUID}/{testVersion}/gatlingConfig` - Get the Gatling configuration of a specific experiment version
- `PUT /experiment/{testUUID}/{testVersion}/gatlingConfig` - Update the Gatling configuration of a specific experiment version
- `GET /experiment/{testUUID}/{testVersion}/config` - Get the global experiment configuration of a specific experiment version
- `PUT /experiment/{testUUID}/{testVersion}/config` - Update the global experiment configuration of a specific experiment version

#### Execution

- `POST /experiment/start`
- `POST /experiment/{testUUID}/{testVersion}/start` - Start the execution of a specific experiment version
- `POST /experiment/{testUUID}/{testVersion}/stop` - Stop the execution of a specific experiment version
- `GET /experiment/{testUUID}/{testVersion}/events` - Register for server-sent events to get experiment execution updates

#### Synchronization & Metrics

- `POST /trigger/{testUUID}/{testVersion}` - Register a component (Gatling Executor, Chaos Toolkit Executor, MiSArch Experiment Config) as ready
- `GET /trigger/{testUUID}/{testVersion}` - Poll if the experiment can start
- `POST /experiment/{testUUID}/{testVersion}/gatling/metrics/steadyState` - Forward steady-state metrics from Gatling Executor
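
A client for the version-scoped endpoints above could build its requests roughly as follows. This is a sketch under assumptions: the base URL is whatever address the Experiment Executor is deployed at, the helper names are made up, and sending the configuration as a JSON body is an assumption about the request format.

```python
import json
import urllib.request
from urllib.parse import quote


def version_url(base_url: str, test_uuid: str, test_version: str, resource: str) -> str:
    """Build the URL /experiment/{testUUID}/{testVersion}/{resource}."""
    parts = [
        base_url.rstrip("/"),
        "experiment",
        quote(test_uuid, safe=""),
        quote(str(test_version), safe=""),
        resource,
    ]
    return "/".join(parts)


def start_request(base_url: str, test_uuid: str, test_version: str) -> urllib.request.Request:
    """Prepare the POST request that starts a specific experiment version."""
    url = version_url(base_url, test_uuid, test_version, "start")
    return urllib.request.Request(url, data=b"", method="POST")


def update_config_request(
    base_url: str, test_uuid: str, test_version: str, config: dict
) -> urllib.request.Request:
    """Prepare the PUT request that updates the global experiment configuration."""
    url = version_url(base_url, test_uuid, test_version, "config")
    body = json.dumps(config).encode("utf-8")  # assumption: JSON request body
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, method="PUT", headers=headers)
```

The prepared requests can be sent with `urllib.request.urlopen(request)` against a running Experiment Executor.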

## Technology Stack

- **Language**: Kotlin
- **Framework**: Spring Boot
- **Asynchronous Processing**: Spring WebFlux + Kotlin Coroutines

## Repository Structure

The repository is structured as follows:

<div className="repository-structure" >

- `/src/`: Source code of the service
  - `config`: Package that includes several configuration classes
  - `controller/`: Package that includes all REST controllers
    - `experiment/`: Different controllers for the experiment lifecycle
  - `service/`: Package for all service classes containing the actual business logic
  - `model/`: Package that includes the main data model
    - `ExperimentConfig`: The global experiment configuration schema
  - `plugin`: Package that includes all plugin classes for the different technologies
    - `export`: Plugins for Grafana export, LLM export, and report generation
    - `failure`: Plugins for failure execution with Chaos Toolkit Executor and MiSArch Experiment Config
    - `workload`: Plugin for Gatling Executor workload execution
    - `metrics`: Plugins for metrics transformation and storage from Prometheus and Gatling

</div>

## Experiment Execution Process

The following steps describe the workflow that is executed when an experiment is started.

### Starting an Experiment

- Initiation:
  - Start via API call or UI button.
  - Loads experiment configuration from persistent storage.
  - If no execution for the same version is running, a temporary state is created in memory.
- Component Preparation:
  - Sends HTTP requests:
    - Failure configuration → Chaos Toolkit Executor
    - Workload configuration → Gatling Executor
    - Reset failures → MiSArch Experiment Config
  - Waits for all components to be ready.
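
The rule that a temporary in-memory state is only created when no execution of the same version is running can be sketched as a small registry. The Experiment Executor itself is written in Kotlin; the Python below is an illustration only, and `ExecutionState` and `ExecutionRegistry` are hypothetical names.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ExecutionState:
    """Temporary in-memory state of one running experiment version."""
    test_uuid: str
    test_version: str
    ready_components: set = field(default_factory=set)


class ExecutionRegistry:
    """Keeps at most one running execution per (UUID, version) pair."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running: dict = {}

    def try_start(self, test_uuid: str, test_version: str):
        """Create the state, or return None if this version is already running."""
        key = (test_uuid, test_version)
        with self._lock:
            if key in self._running:
                return None
            state = ExecutionState(test_uuid, test_version)
            self._running[key] = state
            return state

    def clear(self, test_uuid: str, test_version: str) -> None:
        """Drop the state after completion or after a stop request."""
        with self._lock:
            self._running.pop((test_uuid, test_version), None)
```

Checking and inserting under one lock avoids two concurrent start requests both creating a state for the same version.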

### Synchronization & Registration

TODO mermaid

- Endpoints Provided:
  1. Register component as ready.
  2. Poll if experiment can start.
- Readiness:
  - All three components (Gatling Executor, Chaos Toolkit Executor, MiSArch Experiment Configuration) must register.
  - Polling every 100 ms; experiment starts when all are ready (max 300 ms diff).
- Scheduling:
  - Failure scheduling is handled by Experiment Executor.
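
The register-and-poll handshake can be sketched as follows. The 100 ms polling interval comes from the description above; the component identifiers, the timeout, and the class and function names are assumptions made for this illustration.

```python
import time

# Hypothetical identifiers for the three components that must register.
REQUIRED_COMPONENTS = frozenset({"gatling", "chaostoolkit", "misarchExperimentConfig"})


class ReadinessTracker:
    """Server side: records which components have registered as ready."""

    def __init__(self, required=REQUIRED_COMPONENTS) -> None:
        self._required = set(required)
        self._ready = set()

    def register(self, component: str) -> None:
        if component not in self._required:
            raise ValueError(f"unknown component: {component}")
        self._ready.add(component)

    def can_start(self) -> bool:
        return self._ready == self._required


def wait_until_ready(can_start, interval_s=0.1, timeout_s=30.0,
                     sleep=time.sleep, clock=time.monotonic) -> bool:
    """Client side: poll every `interval_s` seconds until the start is allowed.

    Returns False if `timeout_s` (an assumed safety limit) elapses first.
    """
    deadline = clock() + timeout_s
    while not can_start():
        if clock() >= deadline:
            return False
        sleep(interval_s)
    return True
```

In the real system `can_start` would be an HTTP call to `GET /trigger/{testUUID}/{testVersion}`; here it is passed in as a plain callable so that the loop can be exercised without a server.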

### Special Handling

- Warm-up / Steady-State Hypothesis:
  - Gatling Executor runs these before registering as ready.
  - Metrics are forwarded to Experiment Executor for threshold calculation and goal storage.

### Execution Phase

- Start:
  - Timestamp is marked.
  - Failure injection is managed by Experiment Executor.
- Component Responsibilities:
  - Chaos Toolkit Executor and Gatling Executor handle their respective tasks.

### Completion & Reporting

- Metrics Collection:
  - Gatling Executor sends HTML and JS metrics files.
  - Experiment Executor marks completion, clears state, and transforms metrics for InfluxDB.
- Report Generation:
  - Stores execution timestamps, goals, and threshold violations.
  - Optionally stores raw metrics and queries Prometheus for additional data.
- Dashboard Creation:
  - Grafana dashboard is parameterized and forwarded.
  - Frontend is notified via server-sent events with dashboard URL.
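
The final notification travels over the events endpoint as a server-sent event. The wire format (`event:` and `data:` fields, terminated by a blank line) is defined by the server-sent events standard; the event name `dashboard` and the payload field `url` in the sketch below are assumptions, not the names the service actually uses.

```python
import json


def format_sse(event: str, data: dict) -> str:
    """Serialize one server-sent event with a JSON payload.

    json.dumps emits a single line, so one `data:` field is sufficient.
    """
    payload = json.dumps(data)
    return f"event: {event}\ndata: {payload}\n\n"


# Hypothetical event name and payload for the dashboard notification.
message = format_sse("dashboard", {"url": "http://grafana.example/d/abc"})
```

A browser client registered via `EventSource` would receive the payload as the `data` property of the corresponding event.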

### Stopping an Experiment

- Stop Endpoint:
  - Terminates MiSArch Experiment Configuration thread.
  - Calls stop endpoints for Gatling Executor and Chaos Toolkit Executor.
  - Clears experiment state.
48 changes: 48 additions & 0 deletions docs/docs/dev-manuals/other-repositories/gatling-executor.mdx
@@ -0,0 +1,48 @@
# Gatling Executor Service

The Gatling Executor is a wrapper container with two parts: a simple web server and web client to communicate with the Experiment
Executor, and the Gatling source files, which are compiled and executed dynamically at runtime using Gradle.

The details of a specific experiment are forwarded from the Experiment Executor
and executed immediately.
Finally, the results that Gatling collects during the execution are forwarded back to the Experiment Executor for finalization.
Because load generation can consume significant resources, the Gatling Executor should be deployed to a dedicated, isolated
infrastructure environment.

## Technology Stack

- **Language**: Kotlin
- **Framework**: Spring Boot
- **Build Tool**: Gradle
- **Load Testing Tool**: [Gatling](https://gatling.io/)
- **Kotlin DSL for Gatling**: [Kotlin DSL](https://gatling.io/docs/gatling/reference/current/extensions/kotlin-dsl/)

## API

The Gatling Executor exposes the following REST API endpoints to manage and execute Gatling load tests.

- `POST /start-experiment` - Start the execution of a Gatling load test
- `POST /stop-experiment` - Stop the execution

## Repository Structure

<div className="repository-structure" >

- `/gatling-server/src`: Source code of the web server Spring Boot component
  - `controller/`: Package that includes the REST controller
  - `service/`: Package for all service classes containing the actual business logic

- `/gatling-test/src`: Source code of the Gatling load testing component
  - `kotlin`: Kotlin source files that are compiled by Gradle on experiment start and extended by the dynamic Kotlin files received from the API
  - `resources`: Template files for reference

</div>

## Functionality Overview

Note that the Gatling Executor creates a new Gradle job for each experiment execution.
This job compiles the Gatling test source code, which consists of static template files and dynamic files received from the Experiment Executor via the API.
After the compilation, the Gradle job executes the Gatling test, which generates a report.
Once the execution has finished, this report is sent back to the Experiment Executor in another dedicated Gradle job.

If a steady-state hypothesis or warm-up is configured, these are executed first, in dedicated Gatling jobs, before the actual load test starts.
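
The combination of static template sources with dynamic files before each Gradle job can be sketched as follows. The Gatling Executor itself is a Kotlin service; this Python sketch only illustrates the flow. The function names and directory layout are hypothetical, and the task name `gatlingRun` (the run task of the Gatling Gradle plugin) is an assumption about how the build is configured.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_sources(template_dir: Path, dynamic_files: dict[str, str], workdir: Path) -> Path:
    """Copy the static template sources and add the dynamic Kotlin files."""
    target = workdir / "kotlin"
    shutil.copytree(template_dir, target)
    for name, content in dynamic_files.items():
        # Keep only the file name so received files stay inside the target directory.
        (target / Path(name).name).write_text(content, encoding="utf-8")
    return target


def gradle_command(task: str = "gatlingRun") -> list[str]:
    """Build the Gradle call for one job (assumed task name)."""
    return ["./gradlew", task]
```

Using a fresh working directory per execution keeps the dynamic files of one experiment from leaking into the next compilation.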
26 changes: 24 additions & 2 deletions docs/docs/dev-manuals/other-repositories/infrastructure-k8s.mdx
@@ -16,22 +16,44 @@ The repository is structured as follows:

- `/configmaps.tf`: Defines ConfigMaps used inside the cluster
- `/dapr.tf`: Sets up Dapr
- `/dbs.tf`: Sets up Postgres DBs for the cluster
- `/dbs-mongodb.tf`: Sets up Mongo DBs for the cluster
- `/dbs-postgres.tf`: Sets up Postgres DBs for the cluster
- `/influxdb.tf`: Sets up the InfluxDB used for storing Gatling Metrics of Experiments
- `/ingress.tf`: Sets up a central cluster ingress to access the cluster
- `/keycloak`: Keycloak submodule to grab the up-to-date MiSArch realm file
- `/keycloak.tf`: Sets up Keycloak
- `/latest-deployment.tfvars`: (optional, must be enabled (`-var-file="latest-deployment.tfvars"`)) Sets all used containers to the latest available version
- `/main.tf`: Sets up Terraform itself
- `/minio.tf`: Sets up minio
- `/misarch-address.tf`: Sets up the Address deployment
- `/misarch-catalog.tf`: Sets up the Catalog deployment
- `/misarch-chaostoolkit-executor.tf`: Sets up the Chaos Toolkit Executor deployment
- `/misarch-discount.tf`: Sets up the Discount deployment
- `/misarch-experiment-config-frontend.tf`: Sets up the Experiment Config Frontend deployment
- `/misarch-experiment-config.tf`: Sets up the Experiment Config deployment
- `/misarch-experiment-exectuor.tf`: Sets up the Experiment Executor deployment
- `/misarch-experiment-executor-frontend.tf`: Sets up the Experiment Executor Frontend deployment
- `/misarch-frontend.tf`: Sets up the Frontend
- `/misarch-gateway.tf`: Sets up the Gateway
- `/misarch-gatling-executor.tf`: Sets up the Gatling Executor deployment
- `/misarch-inventory.tf`: Sets up the Inventory deployment
- `/misarch-invoice.tf`: Sets up the Invoice deployment
- `/misarch-media.tf`: Sets up the Media deployment
- `/misarch-notification.tf`: Sets up the Notification deployment
- `/misarch-order.tf`: Sets up the Order deployment
- `/misarch-payment.tf`: Sets up the Payment deployment
- `/misarch-return.tf`: Sets up the Return deployment
- `/misarch-review.tf`: Sets up the Review deployment
- `/misarch-shipment.tf`: Sets up the Shipment deployment
- `/misarch-shoppingcart.tf`: Sets up the Shopping Cart deployment
- `/misarch-simulation.tf`: Sets up the Simulation deployment
- `/misarch-tax.tf`: Sets up the Tax deployment
- `/misarch-user.tf`: Sets up the User deployment
- `/otel.tf`: Sets up an Opentelemetry collector
- `/misarch-wishlist.tf`: Sets up the Wishlist deployment
- `/otel.tf`: Sets up an OpenTelemetry collector
- `/passwords-mongodb.tf`: Defines passwords for MongoDBs used by the cluster
- `/passwords.tf`: Defines passwords used by everything
- `/rabbitmq.tf`: Sets up RabbitMQ
- `/test-script.sh`: Script to avoid specifying variables prior to running `terraform apply`
- `/variables-annotations.tf`: Defines annotations for the platform and for each deployment
- `/variables-labels.tf`: Defines labels for the platform and for each deployment