The objective of this project was to analyse a Google traffic trace and to build a dispatcher and scheduler pair that minimizes job response time, then compare it against a baseline (LWL, Least Work Left, dispatcher and FCFS, First Come First Served, scheduler).
The dataset was provided by our professor and consists of one month of a Google traffic trace.
We divided the jobs into 4 categories:
Note: a job is formed by one or more tasks.
We then looked at the CPU utilization box plot of each category:
Our scheduling strategy prioritizes jobs with a smaller number of tasks, because the traffic is imbalanced. To avoid possible starvation, we opted for a pre-emptive scheduling policy with an additional implementation of aging.
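The idea can be illustrated with a minimal sketch. It is not the code from `notebook.ipynb`: the class name, the `aging_rate` parameter and the linear aging rule are assumptions made only for this example. Since every waiting task ages at the same rate, the relative order of two tasks never changes, so a static heap key is enough.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedTask:
    key: float                            # lower key is served first
    seq: int                              # tie-breaker: submission order
    task_id: str = field(compare=False)


class AgingScheduler:
    """Hypothetical pre-emptive scheduler favouring tasks of small jobs."""

    def __init__(self, aging_rate: float = 0.1):
        self.aging_rate = aging_rate      # assumed tuning knob, not from the report
        self._heap = []
        self._seq = itertools.count()
        self.running = None

    def submit(self, task_id: str, job_size: int, arrival: float) -> None:
        # Effective priority at time t: job_size - aging_rate * (t - arrival).
        # The term -aging_rate * t is common to all tasks, so ordering by
        # job_size + aging_rate * arrival gives the same result.
        key = job_size + self.aging_rate * arrival
        task = QueuedTask(key, next(self._seq), task_id)
        if self.running is None:
            self.running = task
        elif task < self.running:
            # Pre-emption: the newcomer outranks the running task.
            heapq.heappush(self._heap, self.running)
            self.running = task
        else:
            heapq.heappush(self._heap, task)

    def complete(self):
        """Finish the running task and start the best waiting one, if any."""
        done = self.running
        self.running = heapq.heappop(self._heap) if self._heap else None
        return done
```

With `aging_rate=0.1`, a task of a 10-task job submitted at time 0 is served before a task of an 8-task job submitted at time 100, which is how aging protects large jobs from starvation in this sketch.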
We wanted to keep an LWL-type dispatcher, so we modified it to work properly with our scheduler: when sending a task, the dispatcher prioritizes the servers with the fewest working tasks left, then the ones with the least work left. Ties are broken at random.
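The selection rule described above can be sketched as follows. The `ServerState` fields and the `pick_server` function are hypothetical names chosen for this example and do not necessarily match the notebook.

```python
import random
from dataclasses import dataclass


@dataclass
class ServerState:
    name: str
    tasks_left: int      # number of tasks still queued or running on the server
    work_left: float     # total remaining service time on the server


def pick_server(servers, rng=random):
    """Return the server with the fewest tasks left, then the least work left."""
    # Tuples compare element by element, which encodes the two-level rule.
    best = min((s.tasks_left, s.work_left) for s in servers)
    candidates = [s for s in servers if (s.tasks_left, s.work_left) == best]
    # Ties are broken at random.
    return rng.choice(candidates)
```

For example, among servers with `(tasks_left, work_left)` equal to `(2, 1.0)`, `(1, 9.0)` and `(1, 3.0)`, the last one is chosen: the task count is compared first, and the remaining work only decides among servers with the same count.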
| Model | $\overline{R}$ | $\overline{S}$ | $\rho$ | $\overline{L}$ |
|---|---|---|---|---|
| Baseline | 27603 seconds | 1241601 | 0.5382 | 129 messages |
| Ours | 7225 seconds | 444314 | 0.5382 | 66 messages |
Comparison between the two models in terms of mean response time $(\overline{R})$, mean job slowdown $(\overline{S})$, mean utilization coefficient $(\rho)$ and mean message load $(\overline{L})$.
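As a quick check of the table, the relative reduction of each metric can be computed directly from the reported values. The snippet below only copies the numbers from the table; the dictionary keys and the helper name are chosen for this example.

```python
# Values copied from the comparison table.
baseline = {"response_time_s": 27603, "slowdown": 1241601, "messages": 129}
ours = {"response_time_s": 7225, "slowdown": 444314, "messages": 66}


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a metric decreased with respect to the baseline."""
    return (before - after) / before


reductions = {k: relative_reduction(baseline[k], ours[k]) for k in baseline}
for name, value in reductions.items():
    print(f"{name}: {value:.1%}")
```

This gives reductions of roughly 74% for the response time, 64% for the job slowdown and 49% for the message load, while the utilization coefficient is the same in both rows.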
Our pair of dispatcher and scheduler manages to outclass the baseline by decreasing the
From the eCCDF plots of
./root
|_ src/
| |_ notebook.ipynb
|_ img/
| |_ cpu_utilization.png
| |_ response_time_jobslowdown.png
| |_ packets_category.png
| |_ server_utilization.png
|_ report.pdf
|_ requirements.txt
Some Python modules are required: to install them, run `pip install -r requirements.txt` in a terminal from the root directory of the project.
It is strongly suggested to create and activate a Python virtual environment in the project root folder, and then to set that environment as the kernel for notebook.ipynb (here is a guide).



