Help operators to safely perform maintenance on portions of the Apache Mesos cluster by:
- Providing a cluster state where resource offers are prevented for a given
agent - Providing manipulation of this state via an api endpoint
/allocator/filters
Note that this module allows you to temporarily block offers for individual agents. This is a (hopefully) temporary alternative to the maintenance primitives mechanism in Mesos which only works when frameworks adopt it.
Use-case
- Isolate an agent from any new tasks, but let existing tasks on the agent keep running (to completion)
- Install the module library (example:
libofferfilterallocator-1.0.1-vXXX.so) on your masters- location is your choice--you'll point to it in configuration
- pre-built binaries can be found on the releases tab which consist of the module (
.sofile) and amodules.jsonfile which assumes the module itself will be placed at/usr/local/lib/; modify as needed
- Configure the
--modulescommand-line arg, orMESOS_MODULESenv variable to use the module- example:
--modules=file:///path-to-modules.json - example:
MESOS_MODULES=file:///path-to-modules.json
{ "libraries": [ { "file": "/path/to/libofferfilterallocator-1.0.1-vXXX.so", "modules": [ { "name": "im_getty_mesos_OfferFilteringAllocator", "parameters": [ { "key": "zk_url", "value": "zk://127.0.0.1:2181/mesos-allocator-filters" } } ] } ] }- Alternatively, place the (updated to your environment)
modules.jsonin the directory configured byMESOS_MODULES_DIRenv varaible or--modules-dircommand-line arg. - Note: merge as needed to incoporate other modules
- example:
- Configure the
--allocatorcommand-line arg, orMESOS_ALLOCATORenv variable to use the custom allocator- example:
--allocator=im_getty_mesos_OfferFilteringAllocator - example:
MESOS_ALLOCATOR=im_getty_mesos_OfferFilteringAllocator
- example:
- Start up
mesosand view the documentation of the API at:http(s)://your-mesos-master:5050/help/allocator/filters
Offer filters are accessed via the exposed API endpoints (relative to mesos master(s))
GET /allocator/filters
List/Get the active allocator filters
-
example response when no filters are present:
{"filters": []} -
example response with filters present:
{ "filters": [ { "agentId": "7532e174-d91a-49c4-85e6-389ea9fd73c3-S0", "hostname": "some-host.some-domain" } ] }
POST /allocator/filters
Content-Type: application/json
- body:
or:
{ "agentId": "VALUE"}{ "hostname": "VALUE"}
Add/Create an allocator filter for the specified agent
PUT /allocator/filters
Content-Type: application/json
- body:
{ "filters": [ { "agentId": "7532e174-d91a-49c4-85e6-389ea9fd73c3-S0", "hostname": "some-host.some-domain" }, ... ] }
Set the current state of all filters; either of agentId or hostname may be
omitted on an individual filter; set filters to an empty array to clear all filters
DELETE /allocator/filters
-
parameters:
agentIdid of the agent for which to remove filters ",hostnamehostname of the agent for which to remove filters ",
(one of
agentIdorhostnameis required)
Remove/Delete an allocator filter
All endpoints return:
200 OKwhen the operation performed successfully, along with the current state of the allocator filters (seeGET /allocator/filters).307 TEMPORARY_REDIRECTredirect to the leading master when" current master is not the leader.503 SERVICE_UNAVAILABLEif the leading master cannot be found.
This example will demonstrate the application of how a filter interacts with existing tasks and new tasks using a simple cli application in Marathon.
-
Spin up the test cluster, and view the Marathon UI
make run && while ! curl -s "http://$(make run ip):8080/" >/dev/null; do sleep 1; done && open "http://$(make run ip):8080/" -
create a task in Marathon with 2 instances (this would normally result in one on each agent)
curl -X POST "http://$(make run ip):8080/v2/apps" \ -H 'Content-Type: application/json' -d ' { "id": "/test1", "cmd": "while true; do echo \"hello\"; sleep 5; done", "cpus": 0.1, "mem": 32, "disk": 0, "instances": 2 } '- confirm that there's one instance on each agent (i.e., offers created normally for all agents)
-
create a filter for one of the agents (
agent-one) so that no new offers are sentcurl -L -X POST "http://$(make run ip):5050/allocator/filters" \ -H 'Content-Type: application/json' -d ' { "hostname": "agent-one" } ' -
verify that tasks targeted at agent-one are not initiated
curl -X POST "http://$(make run ip):8080/v2/apps" \ -H 'Content-Type: application/json' -d ' { "id": "/test2", "cmd": "while true; do echo \"hello\"; sleep 5; done", "cpus": 0.1, "mem": 32, "disk": 0, "instances": 1, "container": null, "constraints": [ [ "hostname", "LIKE", "agent-one" ] ] } '- view the Marathon UI and note that the task is in a
Waitingstatus - note that the original task
/test1still has an instance running onagent-one
- view the Marathon UI and note that the task is in a
-
remove the filter for agent-one
curl -L -X DELETE "http://$(make run ip):5050/allocator/filters?hostname=agent-one" -
confirm that the waiting tasks targeted for agent-one are now fulfilled/scheduled
This project has been configured to build and test using docker, as it can be quite complicated to set up your C++ development environment.
It is assumed that you have a few bare essentials (docker, docker-machine), and uses docker to handle the rest.
docker [1.9+]docker-machine [0.8+](if running on OSX or Windows)docker-for-macanddocker-for-windowsare not easily configured with thelibprocessinternals- a machine named
mesos-moduleswill be created (if not found); configure this name in./docker-machine-env.sh
The important configuration points are:
- change module, library name and
VERSIONin the top section of./CMakeLists.txt - change the name of the docker machine at top of
docker/docker-machine-env.sh - change the templated
#defineconstants intemplates/config.h.in
make build
To build for an alternate target version (than the default of 1.0.1), export the MESOS_VERSION env variable.
export MESOS_VERSION=0.28.2
Then all other commands will target that version. Note that specific docker-compose setups and build images have to be created to support a new version.
make run
- runs a
docker-compose-based mesos cluster on a single docker machine - view the live help docs:
open "http://$(make run ip):5050/help/allocator/filters" - view the Marathon UI:
open "http://$(make run ip):8080/"
make dev
- builds the module from source
- restarts the local cluster with new module
- tails logs of the masters