Refer to the client for the latest version.
@simonedavico I compared your implementation with the latest working solution we used for the last experiments, and I have some remarks and questions. Some of them might be obsolete because we might have decided to do things differently, but it is better to document the choices. If we decide to implement some of the following, please also copy the comments as part of the code, and feel free to improve the code quality, for example by logging what happens.
Start and Stop of Monitors and Collectors (see also benchflow/monitors#23)
The operations to stop the collectors were in `@PostRun`.
Only the operations to interact with the monitors (monitors API) were in the Driver; the ones for interacting with the collectors were in the Benchmark.
I guess, if we didn't decide differently, that we can have:
- the collectors' start call in the `Benchmark`, in the method annotated with `@StartRun`, because it is the closest point of the lifecycle to the start of the issuing of the workload, and all the drivers are instantiated by this method and ready to issue the workload. It has to be called before the call to `super.start()`, because otherwise it would not work when the drivers start to generate the load as soon as they get started: if we performed the call after `super.start()`, we would only start collecting after all the drivers are started. An alternative would be to call it in the `Driver`, in the method annotated with `@OnceBefore`, because it is the closest point of the lifecycle to the start of the issuing of the workload of the first started driver, but it makes more sense to have it in the Benchmark lifecycle. (See the sketch after this list.)
- the collectors' stop calls in the `Driver`, in the method annotated with `@OnceAfter`, because it is the closest point of the lifecycle to the end of the issuing of the workload. The alternative would be the `Benchmark` method annotated with `@EndRun`, but that is called later in the benchmark lifecycle, after the drivers get undeployed, which might take some time on the Faban infrastructure, and we don't want to collect metrics during that period.
- the monitors' start/monitor/stop calls in the `Driver`s, in the respective annotated methods.
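As a reference for the placement discussed above, here is a minimal sketch. It is not the actual template: CollectorsClient and its start()/stop() methods, as well as the base class, are hypothetical placeholders; only where the calls sit in the Faban lifecycle reflects the list above (Faban imports omitted).
public class MyBenchmark extends BaseBenchmark { // hypothetical base class providing start(), as the super.start() call in the existing code suggests

    private final CollectorsClient collectorsClient = new CollectorsClient(); // hypothetical collectors API client

    @Override
    @StartRun
    public void start() throws Exception {
        // Start the collectors BEFORE super.start(): drivers may begin issuing load
        // as soon as they are started, so collection must already be running.
        collectorsClient.start();
        super.start();
    }
}

public class MyDriver { // driver class; the Faban driver annotations other than @OnceAfter are omitted

    private final CollectorsClient collectorsClient = new CollectorsClient(); // hypothetical collectors API client

    @OnceAfter
    public void stopCollectors() throws Exception {
        // Closest lifecycle point to the end of the workload: stop the collectors here,
        // before the drivers get undeployed (@EndRun/@PostRun would run later and
        // include the undeploy period in the collected metrics).
        collectorsClient.stop();
    }
}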
This means, for the monitors, that according to the life cycles discussed in benchflow/monitors#23, the sequence of calls should be (see the sketch after this list):
- Start of load: all the calls should be in the benchmark's method annotated with `@StartRun`
- End of load: all the calls should be in the driver's method annotated with `@OnceAfter`. They could also be in the benchmark's method annotated with `@PostRun`, but according to the Faban lifecycle the most suitable place is in the driver, because the End of load monitors monitor the completion of the workload, which is something logically belonging to the drivers.
- Entire load: `monitor:start` should be as for Start of load, `monitor:monitor` and `monitor:stop` as for End of load
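A rough sketch of that sequence follows. monitorsClient, the RunPhase constants, and the method names are hypothetical placeholders for the monitors API described in benchflow/monitors#23; only the placement of the calls in the Faban lifecycle reflects the list above.
// In the Benchmark:
@Override
@StartRun
public void start() throws Exception {
    // Start of load monitors: the whole monitor:start / monitor:monitor / monitor:stop sequence runs here.
    monitorsClient.start(RunPhase.START_OF_LOAD);
    monitorsClient.monitor(RunPhase.START_OF_LOAD);
    monitorsClient.stop(RunPhase.START_OF_LOAD);
    // Entire load monitors: only monitor:start runs here, as for Start of load.
    monitorsClient.start(RunPhase.ENTIRE_LOAD);
    super.start();
}

// In the Driver:
@OnceAfter
public void endOfLoadMonitors() throws Exception {
    // End of load monitors: the whole sequence runs here, since they monitor the
    // completion of the workload, which logically belongs to the drivers.
    monitorsClient.start(RunPhase.END_OF_LOAD);
    monitorsClient.monitor(RunPhase.END_OF_LOAD);
    monitorsClient.stop(RunPhase.END_OF_LOAD);
    // Entire load monitors: monitor:monitor and monitor:stop run here, as for End of load.
    monitorsClient.monitor(RunPhase.ENTIRE_LOAD);
    monitorsClient.stop(RunPhase.ENTIRE_LOAD);
}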
@simonedavico, please update benchflow/monitors#23 when this is modified.
We also had the following commented code in this respect:
// An @OnceBefore operation gets called by global thread 0 of the driver, only once and before any other driver thread gets instantiated. This guarantees that no operation is called while the @OnceBefore operation is running.
// This also means that starting stats there would collect a lot of points during the initialisation of the other agents.
// NOTE: it was in the @OnceBefore method of the drivers, but this point is more suitable,
// because it is just when the ramp up starts.
/**
 * Start monitors and collectors.
 * Only thread 0 does this, and it can share data with onceAfter.
 */
@Override
@StartRun
public void start() throws Exception {
    super.start();
    // You will need to ensure all the driver processes on all driver systems get started and, if feasible, enter the ramp-up state before returning from this method, as the tools timer gets started immediately after this method terminates.
    // This means that starting the stats here might miss some data points at the beginning of the run,
    // but we anyway drop part of the run at the beginning as part of the warm-up period, so it is fine.
}
Missing as Part of the Template for the WfMS Driver (mock model)
if (isStarted()) {
    //SPOON HERE
} else {
    String startURL = wfms.startProcessDefinition("mock.bpmn");
    logger.info("Response: " + startURL);
}
simonedavico: it is already done, but it is generated for each generated operation and not defined in the template.
Timeout for MySQL Monitor
The following code was in the MySQL monitor example, to put a time limit on the check so that we avoid a possible deadlock. It is really risky because it is a mere heuristic and can lead to stopping the benchmark before the workload is actually completed.
//TODO: we for sure want a better way to achieve the same.
//The point, for now, is that it is not possible to throw an exception from the run method.
try {
    // wait a maximum of 2 minutes (heuristically set)
    boolean processingCompleteWithin2Minutes = done.await(120, TimeUnit.SECONDS);
} catch (InterruptedException ex) {
    Thread t = Thread.currentThread();
    t.getUncaughtExceptionHandler().uncaughtException(t, ex);
    return;
}
We should evaluate how to improve it, for example:
- by limiting, if needed, the accepted query to a `COUNT` query compared against an expected value, so that we can then use the change of that value over time to calibrate the frequency of the checks and to decide about killing the check (for example, if the value does not change anymore). See the sketch below.
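A rough sketch of that idea follows. runCountQuery(), expectedCount, and the thresholds are hypothetical; the point is to replace the fixed 2-minute await with a check that only gives up when the count stops changing.
// Hypothetical helper, called from the monitor instead of the fixed await above.
private void awaitWorkloadCompletion(long expectedCount) throws InterruptedException {
    final int maxUnchangedChecks = 5;        // heuristic: give up after 5 checks without progress
    final long pollIntervalMillis = 5_000;   // heuristic: could be calibrated from the observed rate of change
    long previousCount = -1;
    int unchangedChecks = 0;
    while (true) {
        long currentCount = runCountQuery(); // hypothetical: executes the configured COUNT query
        if (currentCount >= expectedCount) {
            return;                          // workload completed
        }
        if (currentCount == previousCount) {
            if (++unchangedChecks >= maxUnchangedChecks) {
                return;                      // count stopped changing: kill the check instead of blocking the benchmark forever
            }
        } else {
            unchangedChecks = 0;             // progress observed: reset the counter
        }
        previousCount = currentCount;
        Thread.sleep(pollIntervalMillis);
    }
}
The InterruptedException would still have to be handled in run(), as in the snippet above, since run() cannot declare checked exceptions.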
The BenchFlow Services should be Started/Stopped in Parallel
In the starting benchmark you had, for example for starting the BenchFlow services:
final CountDownLatch done = new CountDownLatch(1);
new Thread(new Runnable() {
    @Override
    public void run() {
        // Monitor Business Logic
        done.countDown();
    }
}).start();
and the following to stop the services:
ExecutorService es = Executors.newSingleThreadExecutor();
Future<String> resDBFuture = es.submit(new BenchFlowServicesAsynchInteraction(statsDBStart));
String resDB = resDBFuture.get();
This was wrong, because we actually need the services to be started/stopped concurrently. In particular:
- at start: the monitors with runPhase `start` or `entire load` must finish all of their lifecycle that should occur at the start of the benchmark, in their dependency order, before the collectors are all started in parallel.
- at stop: the monitors with runPhase `stop` or `entire load` must finish all of their lifecycle that should occur at the stop of the benchmark, in their dependency order, before the collectors are all stopped in parallel.
This is very important because some of the services might take more time than the others to start/stop, and we must avoid these differences impacting the precision of the timeliness with which we collect the data (e.g., the data collected by stats). This impacts the stop more, because the start is designed to return very quickly. A sketch of the intended structure follows.
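A minimal sketch for the start side (the stop side is symmetric). Monitor, Collector, startMonitorLifecycle() and startCollector() are hypothetical placeholders for the BenchFlow services interactions; what matters is that the monitors run sequentially in dependency order and the collectors are then started all in parallel.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

void startBenchFlowServices(List<Monitor> monitorsInDependencyOrder,
                            List<Collector> collectors) throws Exception {
    // 1) monitors with runPhase "start" or "entire load": sequentially, in their dependency order
    for (Monitor monitor : monitorsInDependencyOrder) {
        startMonitorLifecycle(monitor); // hypothetical: runs the part of the monitor lifecycle due at benchmark start
    }

    // 2) then all the collectors, in parallel, one task per collector
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, collectors.size()));
    try {
        List<Callable<String>> startTasks = new ArrayList<>();
        for (Collector collector : collectors) {
            startTasks.add(() -> startCollector(collector)); // hypothetical start call, returns the service response
        }
        // invokeAll blocks until every collector start has completed
        for (Future<String> result : pool.invokeAll(startTasks)) {
            result.get(); // propagate failures from the individual starts
        }
    } finally {
        pool.shutdown();
    }
}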