- Usage
- Methods
  - General
  - Packaging
  - Cluster
    - get_resources()
    - resources_needed()
    - get_used_resources()
    - get_unreserved_resources()
    - get_reserved_resources()
    - available_resources()
    - dcos_canonical_version()
    - dcos_version_less_than()
    - required_cpus()
    - required_mem()
    - bootstrap_metadata()
    - ui_config_metadata()
    - dcos_version_metadata()
    - ee_version()
    - mesos_logging_strategy()
    - dcos_1_7
    - dcos_1_8
    - dcos_1_9
    - dcos_1_10
    - strict
    - permissive
    - disabled
  - Command execution
  - File operations
  - Services
    - get_service()
    - delete_persistent_data()
    - destroy_volumes()
    - destroy_volume()
    - unreserve_resources()
    - unreserve_resource()
    - get_service_framework_id()
    - get_service_task()
    - get_service_tasks()
    - get_service_ips()
    - get_marathon_task()
    - get_marathon_tasks()
    - service_healthy()
    - wait_for_service_endpoint()
    - wait_for_service_endpoint_removal()
  - Spinner
  - Tasks
  - ZooKeeper
  - Marathon
  - Masters
  - Agents
  - Network
from shakedown import *
Authenticate against an EE DC/OS cluster using a username and password.
| parameter | description | type | default |
|---|---|---|---|
| username | the username used for DC/OS authentication | str | |
| password | the password used for DC/OS authentication | str |
# Authenticate against DC/OS, receive an ACS token
token = authenticate('root', 's3cret')
The URL to the DC/OS cluster under test.
None.
# Print the DC/OS dashboard URL.
url = dcos_url()
print("Dashboard located at: " + url)
The URL to the Mesos master on the DC/OS cluster under test.
None.
url = master_url()
print("Master located at: " + url)
The URL to the agents endpoint for the master on the DC/OS cluster under test.
None.
url = agents_url()
print("Agent state.json is located at: " + url)
The URI to a named service.
| parameter | description | type | default |
|---|---|---|---|
| service | the name of the service | str |
# Print the location of the Jenkins service's dashboard
jenkins_url = dcos_service_url('jenkins')
print("Jenkins dashboard located at: " + jenkins_url)
A JSON hash containing DC/OS state information.
None.
# Print state information of DC/OS slaves.
state_json = json.loads(dcos_json_state())
print(state_json['slaves'])
A JSON hash containing DC/OS state information for the agents.
None.
# Print state information of DC/OS slaves.
state_json = dcos_agents_state()
print(state_json['slaves'])
The DC/OS version number.
None.
# Print the DC/OS version.
version = dcos_version()
print("Cluster is running DC/OS version " + version)
The DC/OS ACS token (if authenticated).
None.
# Print the DC/OS ACS token.
token = dcos_acs_token()
print("Using token " + token)
Provides a DC/OS URL for the provided path.
| parameter | description | type | default |
|---|---|---|---|
| url_path | the url path | str |
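As a rough illustration of what building a cluster URL from a relative path involves, here is a minimal sketch. The base URL and the `join_url_path` helper are hypothetical and are not part of Shakedown; the real function resolves the cluster URL itself.

```python
from urllib.parse import urljoin

def join_url_path(base_url, url_path):
    """Join a cluster base URL and a relative path (hypothetical helper)."""
    # Normalise the slashes so that exactly one separates base and path.
    return urljoin(base_url.rstrip('/') + '/', url_path.lstrip('/'))

# Hypothetical cluster address, used only for illustration.
print(join_url_path('https://dcos.example.com', 'marathon/v2/apps'))
```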
url = shakedown.dcos_url_path('marathon/v2/apps')
response = dcos.http.request('get', url)
The current Mesos master's IP address.
None.
# What's our Mesos master's IP?
ip = master_ip()
print("Current Mesos master: " + ip)
Install a package.
| parameter | description | type | default |
|---|---|---|---|
| package_name | the name of the package to install | str | |
| package_version | the version of the package to install | str | latest |
| service_name | custom service name | str | None |
| options_file | a file containing options in JSON format | str | None |
| options_json | a dict containing options in JSON format | dict | None |
| wait_for_completion | wait for service to become healthy before completing? | bool | False |
| timeout_sec | how long in seconds to wait before timing out | int | 600 |
# Install the 'jenkins' package; don't wait for the service to register
install_package('jenkins')
Install a package, and wait for the service to register.
This method uses the same parameters as install_package()
Uninstall a package.
| parameter | description | type | default |
|---|---|---|---|
| package_name | the name of the package to uninstall | str | |
| service_name | custom service name | str | None |
| all_instances | uninstall all instances? | bool | False |
| wait_for_completion | wait for service to become healthy before completing? | bool | False |
| timeout_sec | how long in seconds to wait before timing out | int | 600 |
# Uninstall the 'jenkins' package; don't wait for the service to unregister
uninstall_package('jenkins')
Uninstall a package, and wait for the service to unregister. This cleans up the reserved resources, the reserved disk, and the ZooKeeper entry associated with the service.
| parameter | description | type | default |
|---|---|---|---|
| package_name | the name of the package to uninstall | str | |
| service_name | custom service name | str | None |
| role | role for the service, if it is not the service name followed by `-role` | str | None |
| principal | principal for the service, if it is not the service name followed by `-principal` | str | None |
| zk_node | ZooKeeper node to delete for the service | str | None |
| timeout_sec | how long in seconds to wait before timing out | int | 600 |
uninstall_package_and_data('confluent-kafka', zk_node='/dcos-service-confluent-kafka')
Uninstall a package, and wait for the service to unregister.
This method uses the same parameters as uninstall_package()
Check whether a specified package is currently installed.
| parameter | description | type | default |
|---|---|---|---|
| package_name | the name of the package to check | str | |
| service_name | custom service name | str | None |
# Is the 'jenkins' package installed?
if package_installed('jenkins'):
    print('Jenkins is installed!')
Add a repository to the list of package sources.
| parameter | description | type | default |
|---|---|---|---|
| repo_name | the name of the repository | str | |
| repo_url | the location of the repository | str | |
| index | the repository index order | int | -1 |
# Search the Multiverse before any other repositories
add_package_repo('Multiverse', 'https://github.com/mesosphere/multiverse/archive/version-2.x.zip', 0)
Remove a repository from the list of package sources.
| parameter | description | type | default |
|---|---|---|---|
| repo_name | the name of the repository | str |
# No longer search the Multiverse
remove_package_repo('Multiverse')
Retrieve a dictionary describing the configured package source repositories.
None
# Which repository am I searching through first?
repos = get_package_repos()
print("First searching " + repos['repositories'][0]['name'])
Gets a Resource object which includes the current cpu and memory of the cluster.
None.
resources = get_resources()
if resources.cpus > 2:
    pass  # do stuff
Gets a Resource object which includes the amount of cpu and memory being used in the cluster.
None.
resources = get_used_resources()
if resources.cpus > 2:
    pass  # do stuff
Gets a Resource object which includes the amount of cpu and memory that is not currently reserved.
None.
resources = get_unreserved_resources()
if resources.cpus > 2:
    pass  # do stuff
Gets a Resource object which includes the amount of cpu and memory that is currently reserved.
None.
resources = get_reserved_resources()
if resources.cpus > 2:
    pass  # do stuff
Gets a Resource object which includes the amount of cpu and memory that is currently available. This equates to (get_resources() - get_used_resources()).
None.
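The relationship between total, used, and available resources can be illustrated with a minimal sketch. The `Resources` tuple and the `subtract` helper below are hypothetical stand-ins, not Shakedown's own Resource type; only the `cpus` attribute appears in this document, and the `mem` field name is assumed.

```python
from collections import namedtuple

# Hypothetical stand-in for the object returned by get_resources();
# the field names are assumptions made for illustration.
Resources = namedtuple('Resources', ['cpus', 'mem'])

def subtract(total, used):
    """Return what is left after removing used resources from the total."""
    return Resources(total.cpus - used.cpus, total.mem - used.mem)

total = Resources(cpus=8.0, mem=16384.0)
used = Resources(cpus=2.5, mem=4096.0)
available = subtract(total, used)
print(available.cpus, available.mem)
```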
resources = available_resources()
if resources.cpus > 2:
    pass  # do stuff
Gets the resources needed to run the given number of tasks.
| parameter | description | type | default |
|---|---|---|---|
| total_tasks | number of tasks | int | 1 |
| per_task_cpu | cpu per task requirement | float | 0.01 |
| per_task_mem | memory per task requirement | float | 1 |
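The parameters above describe a number of tasks and a per-task cpu and memory requirement. Assuming the total is simply the per-task figure multiplied by the number of tasks (an assumption, since the calculation is not spelled out here), the arithmetic could be sketched as follows. `estimate_needed` is a hypothetical name.

```python
def estimate_needed(total_tasks=1, per_task_cpu=0.01, per_task_mem=1):
    """Return (cpus, mem) needed, assuming a simple per-task multiplication."""
    return (total_tasks * per_task_cpu, total_tasks * per_task_mem)

cpus, mem = estimate_needed(total_tasks=100, per_task_cpu=0.5, per_task_mem=32)
print(cpus, mem)
```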
Provides a canonical version number. dcos_version returns a version string with a
few variations such as 1.9-dev. dcos_canonical_version returns a distutils.version.LooseVersion
and will strip -dev if present. It can be used to determine if the DC/OS cluster version
is correct for the test or if it should be skipped.
None.
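To illustrate the idea of stripping a `-dev` suffix before comparing versions, here is a minimal sketch. It uses a plain tuple comparison rather than `distutils.version.LooseVersion` (the `distutils` module was removed in Python 3.12), and `canonical_version` is a hypothetical helper, not the Shakedown implementation.

```python
def canonical_version(version_string):
    """Strip a trailing '-dev' and return the version as a tuple of ints."""
    if version_string.endswith('-dev'):
        version_string = version_string[:-len('-dev')]
    return tuple(int(part) for part in version_string.split('.'))

# Tuples compare numerically, so 1.9 sorts before 1.10.
print(canonical_version('1.9-dev') < canonical_version('1.10'))
print(canonical_version('1.9-dev') == canonical_version('1.9'))
```

Comparing integer tuples avoids the string-comparison trap in which "1.10" sorts before "1.9".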
@pytest.mark.skipif('dcos_canonical_version() < LooseVersion("1.9")')
def test_1_9_specific_test():
    pass
Returns True if the DC/OS version is less than the provided version, otherwise returns False.
| parameter | description | type | default |
|---|---|---|---|
| version | version string, for example "1.9.0" | str | |
@pytest.mark.skipif('dcos_version_less_than("1.9")')
def test_1_9_specific_test():
    pass
Preconfigured annotation which requires DC/OS 1.7+.
None.
# if the DC/OS cluster version is 1.6, the test will be skipped
@dcos_1_7
def test_1_7_plus_feature():
    pass
Preconfigured annotation which requires DC/OS 1.8+.
None.
# if the DC/OS cluster version is 1.7, the test will be skipped
@dcos_1_8
def test_1_8_plus_feature():
    pass
Preconfigured annotation which requires DC/OS 1.9+.
None.
# if the DC/OS cluster version is 1.8, the test will be skipped
@dcos_1_9
def test_1_9_plus_feature():
    pass
Preconfigured annotation which requires DC/OS 1.10+.
None.
# if the DC/OS cluster version is 1.9, the test will be skipped
@dcos_1_10
def test_1_10_plus_feature():
    pass
Preconfigured annotation which requires DC/OS Enterprise in strict mode.
None.
# if the DC/OS enterprise cluster is not in strict mode it will be skipped
@strict
def test_strict_only_feature():
    pass
Preconfigured annotation which requires DC/OS Enterprise in permissive mode.
None.
# if the DC/OS enterprise cluster is not in permissive mode it will be skipped
@permissive
def test_permissive_only_feature():
    pass
Preconfigured annotation which requires DC/OS Enterprise in disabled mode.
None.
# if the DC/OS enterprise cluster is not in disabled mode it will be skipped
@disabled
def test_disabled_only_feature():
    pass
Returns True if the cluster resources are less than the specified number of cores, otherwise returns False. This is based on available resources.
| parameter | description | type | default |
|---|---|---|---|
| cpus | number of cpus | int |
# skips test if there is only 1 core left in the cluster.
@pytest.mark.skipif('required_cpus(2)')
def test_requires_2_cores():
    pass
Returns True if the cluster resources are less than the specified amount of memory, otherwise returns False. This is based on available resources.
| parameter | description | type | default |
|---|---|---|---|
| mem | amount of mem in M | int |
requires_2_cores = pytest.mark.skipif('required_cpus(2)')
@dcos_1_9
@requires_2_cores
@pytest.mark.skipif('required_mem(512)')
def test_requires_512m_memory():
    pass  # requires DC/OS 1.9, 2 cores and 512M
Returns the JSON of the bootstrap metadata for DC/OS Enterprise clusters. Returns None if DC/OS Open or the DC/OS version is < 1.9.
None.
metadata = bootstrap_metadata()
if metadata:
    print(metadata['security'])
Returns the JSON of the UI configuration metadata for DC/OS Enterprise clusters. Returns None if DC/OS Open or the DC/OS version is < 1.9.
None.
metadata = ui_config_metadata()
if metadata:
    print(metadata['uiConfiguration']['plugins']['mesos']['logging-strategy'])
Returns the JSON of the DC/OS version metadata for DC/OS Enterprise clusters. Returns None if not available.
None.
metadata = dcos_version_metadata()
if metadata:
    print(metadata['dcos-image-commit'])
Returns the DC/OS Enterprise version type, which is one of {strict, permissive, disabled}. Returns None if DC/OS Open or the DC/OS version is < 1.9.
None.
@pytest.mark.skipif("ee_version() in {'strict', 'disabled'}")
def test_skips_strict_or_disabled():
    pass
Returns the Mesos logging strategy if available, otherwise None.
None.
strategy = mesos_logging_strategy()
print(strategy)
Run a command on a remote host via SSH.
| parameter | description | type | default |
|---|---|---|---|
| host | the hostname or IP to run the command on | str | |
| command | the command to run | str | |
| username | the username used for SSH authentication | str | core |
| key_path | the path to the SSH keyfile used for authentication | str | None |
| noisy | Output to stdout if True | bool | True |
# I wonder what /etc/motd contains on the Mesos master?
exit_status, output = run_command(master_ip(), 'cat /etc/motd')
Run a command on the Mesos master via SSH.
| parameter | description | type | default |
|---|---|---|---|
| command | the command to run | str | |
| username | the username used for SSH authentication | str | core |
| key_path | the path to the SSH keyfile used for authentication | str | None |
| noisy | Output to stdout if True | bool | True |
# What kernel is our Mesos master running?
exit_status, output = run_command_on_master('uname -a')
Run a command on a Mesos agent via SSH, proxied via the Mesos master.
This method uses the same parameters as run_command()
Run a command using the dcos CLI.
| parameter | description | type | default |
|---|---|---|---|
| command | the command to run | str |
# What's the current version of the Jenkins package?
stdout, stderr, return_code = run_dcos_command('package search jenkins --json')
result_json = json.loads(stdout)
print(result_json['packages'][0]['currentVersion'])
Copy a file via SCP.
| parameter | description | type | default |
|---|---|---|---|
| host | the hostname or IP to copy the file to/from | str | |
| file_path | the local path to the file to be copied | str | |
| remote_path | the remote path to copy the file to | str | . |
| username | the username used for SSH authentication | str | core |
| key_path | the path to the SSH keyfile used for authentication | str | None |
| action | 'put' (default) or 'get' | str | put |
# Copy a datafile onto the Mesos master
copy_file(master_ip(), '/var/data/datafile.txt')
Copy a file to the Mesos master.
| parameter | description | type | default |
|---|---|---|---|
| file_path | the local path to the file to be copied | str | |
| remote_path | the remote path to copy the file to | str | . |
| username | the username used for SSH authentication | str | core |
| key_path | the path to the SSH keyfile used for authentication | str | None |
# Copy a datafile onto the Mesos master
copy_file_to_master('/var/data/datafile.txt')
Copy a file to a Mesos agent, proxied through the Mesos master.
This method uses the same parameters as copy_file()
Copy a file from the Mesos master.
| parameter | description | type | default |
|---|---|---|---|
| remote_path | the remote path of the file to copy | str | |
| file_path | the local path to copy the file to | str | . |
| username | the username used for SSH authentication | str | core |
| key_path | the path to the SSH keyfile used for authentication | str | None |
# Copy a datafile from the Mesos master
copy_file_from_master('/var/data/datafile.txt')
Copy a file from a Mesos agent, proxied through the Mesos master.
| parameter | description | type | default |
|---|---|---|---|
| host | the hostname or IP to copy the file from | str | |
| remote_path | the remote path of the file to copy | str | |
| file_path | the local path to copy the file to | str | . |
| username | the username used for SSH authentication | str | core |
| key_path | the path to the SSH keyfile used for authentication | str | None |
# Copy a datafile from an agent running Jenkins
service_ips = get_service_ips('marathon', 'jenkins')
for host in service_ips:
    assert copy_file_from_agent(host, '/home/jenkins/datafile.txt')
Retrieve a dictionary describing a named service.
| parameter | description | type | default |
|---|---|---|---|
| service_name | the name of the service | str | |
| inactive | include inactive services? | bool | False |
| completed | include completed services? | bool | False |
# Tell me about the 'jenkins' service
jenkins = get_service('jenkins')
Deletes the reserved resources, destroys the volumes, and deletes the ZooKeeper node for a given service.
| parameter | description | type | default |
|---|---|---|---|
| role | the role for the service | str | |
| zk_node | the zk node to delete | str |
delete_persistent_data('confluent-kafka-role', '/dcos-service-confluent-kafka')
Destroys the volume for the given role (on all slaves in the cluster).
It is important to uninstall the service prior to calling this function.
| parameter | description | type | default |
|---|---|---|---|
| role | the role associated with the service | str |
destroy_volumes('confluent-kafka-role')
Destroys the volume for the given role on a given agent. It is important to uninstall the service prior to calling this function.
| parameter | description | type | default |
|---|---|---|---|
| agent | an agent id in the cluster | str | |
| role | the role associated with the service | str |
destroy_volume('a8571994-47f8-4590-8922-47f10886165a-S1', 'confluent-kafka-role')
Unreserve resources for the given role (on all slaves in the cluster).
It is important to uninstall the service prior to calling this function.
| parameter | description | type | default |
|---|---|---|---|
| role | the role associated with the service | str |
unreserve_resources('confluent-kafka-role')
Unreserve resources for the given role on a given agent. It is important to uninstall the service prior to calling this function.
| parameter | description | type | default |
|---|---|---|---|
| agent | an agent id in the cluster | str | |
| role | the role associated with the service | str |
unreserve_resource('a8571994-47f8-4590-8922-47f10886165a-S1', 'confluent-kafka-role')
Get the framework ID of a named service.
| parameter | description | type | default |
|---|---|---|---|
| service_name | the name of the service | str | |
| inactive | include inactive services? | bool | False |
| completed | include completed services? | bool | False |
# What is the framework ID for the 'jenkins' service?
jenkins_framework_id = get_service_framework_id('jenkins')
Get a dictionary describing a named service task.
| parameter | description | type | default |
|---|---|---|---|
| service_name | the name of the service | str | |
| task_name | the name of the task | str | |
| inactive | include inactive services? | bool | False |
| completed | include completed services? | bool | False |
# Tell me about marathon's 'jenkins' task
jenkins_task = get_service_task('marathon', 'jenkins')
Get a list of task IDs associated with a named service.
| parameter | description | type | default |
|---|---|---|---|
| service_name | the name of the service | str | |
| inactive | include inactive services? | bool | False |
| completed | include completed services? | bool | False |
# What's marathon doing right now?
service_tasks = get_service_tasks('marathon')
Get a dictionary describing a named Marathon task.
| parameter | description | type | default |
|---|---|---|---|
| task_name | the name of the task | str | |
| inactive | include inactive services? | bool | False |
| completed | include completed services? | bool | False |
# Tell me about marathon's 'jenkins' task
jenkins_task = get_marathon_task('jenkins')
Get a list of Marathon tasks.
| parameter | description | type | default |
|---|---|---|---|
| inactive | include inactive services? | bool | False |
| completed | include completed services? | bool | False |
# What's marathon doing right now?
service_tasks = get_marathon_tasks()
Get a set of the IPs associated with a service.
| parameter | description | type | default |
|---|---|---|---|
| service_name | the name of the service | str | |
| task_name | the name of the task to limit results to | str | None |
| inactive | include inactive services? | bool | False |
| completed | include completed services? | bool | False |
# Get all IPs associated with the 'chronos' task running in the 'marathon' service
service_ips = get_service_ips('marathon', 'chronos')
print('service_ips: ' + str(service_ips))
Check whether a specified service is currently healthy.
| parameter | description | type | default |
|---|---|---|---|
| service_name | the name of the service | str |
# Is the 'jenkins' service healthy?
if service_healthy('jenkins'):
    print('Jenkins is healthy!')
Checks whether the service URL returns HTTP 200 within a timeout. Returns True if the endpoint becomes available, or False when the timeout expires.
| parameter | description | type | default |
|---|---|---|---|
| service_name | the name of the service | str | |
| timeout_sec | how long in seconds to wait before timing out | int | 120 |
# wait for the 'marathon-user' endpoint to become available
wait_for_service_endpoint("marathon-user")
Checks whether the service URL returns HTTP 500 within a timeout. If available it returns True; on expiration it returns the time to remove.
| parameter | description | type | default |
|---|---|---|---|
| service_name | the name of the service | str | |
| timeout_sec | how long in seconds to wait before timing out | int | 120 |
# wait for the 'marathon-user' endpoint to be removed
wait_for_service_endpoint_removal("marathon-user")
Waits for a function to return true or times out.
| parameter | description | type | default |
|---|---|---|---|
| predicate | the predicate function | fn | |
| timeout_seconds | how long in seconds to wait before timing out | int | 120 |
| sleep_seconds | time to sleep between multiple calls to predicate | int | 1 |
| ignore_exceptions | ignore exceptions thrown by predicate | bool | True |
| inverse_predicate | if True look for False from predicate | bool | False |
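The parameters above suggest a polling loop. A minimal sketch of such a loop follows; it assumes nothing about Shakedown's internals, and the `poll_until` name and the use of `TimeoutError` are illustrative choices.

```python
import time

def poll_until(predicate, timeout_seconds=120, sleep_seconds=1,
               ignore_exceptions=True, inverse_predicate=False):
    """Call predicate until it yields the wanted result or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            result = predicate()
        except Exception:
            if not ignore_exceptions:
                raise
            # An ignored exception counts as "not ready yet".
        else:
            # With inverse_predicate, a falsy result is the success case.
            if bool(result) != inverse_predicate:
                return result
        if time.monotonic() >= deadline:
            raise TimeoutError('predicate not satisfied in time')
        time.sleep(sleep_seconds)

# Usage: the predicate succeeds on its third call.
calls = []

def ready():
    calls.append(1)
    return len(calls) >= 3

poll_until(ready, timeout_seconds=5, sleep_seconds=0.01)
print(len(calls))
```

Using `time.monotonic()` for the deadline keeps the loop unaffected by wall-clock adjustments during the wait.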
# simple predicate
def deployment_predicate(client=None):
    ...
wait_for(deployment_predicate, timeout)
# predicate with a parameter
def service_available_predicate(service_name):
    ...
wait_for(lambda: service_available_predicate(service_name), timeout_seconds=timeout_sec)
Waits for a function to return true or times out. Returns the elapsed time of the wait.
| parameter | description | type | default |
|---|---|---|---|
| predicate | the predicate function | fn | |
| timeout_seconds | how long in seconds to wait before timing out | int | 120 |
| sleep_seconds | time to sleep between multiple calls to predicate | int | 1 |
| ignore_exceptions | ignore exceptions thrown by predicate | bool | True |
| inverse_predicate | if True look for False from predicate | bool | False |
# simple predicate
def deployment_predicate(client=None):
    ...
time_wait(deployment_predicate, timeout)
# predicate with a parameter
def service_available_predicate(service_name):
    ...
time_wait(lambda: service_available_predicate(service_name), timeout_seconds=timeout_sec)
Returns the time difference with a given precision.
| parameter | description | type | default |
|---|---|---|---|
| start | the start time | time | |
| end | end time, if not provided current time is used | time | None |
| precision | the number of decimal places to maintain | int | 3 |
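A minimal sketch of a time difference rounded to a given precision, assuming timestamps are floating-point seconds as returned by `time.time()`. The `time_difference` name is hypothetical.

```python
import time

def time_difference(start, end=None, precision=3):
    """Return end - start in seconds, rounded to `precision` decimal places."""
    if end is None:
        end = time.time()  # default to the current time
    return round(end - start, precision)

print(time_difference(10.0, 12.34567))
```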
# elapsed time since start, rounded to 3 decimal places
start = time.time()
elapsed = elapse_time(start)
Get information about a task.
This method uses the same parameters as get_tasks()
Get a list of tasks, optionally filtered by task ID.
| parameter | description | type | default |
|---|---|---|---|
| task_id | task ID | str | |
| completed | include completed tasks? | bool | True |
# What tasks have been run?
tasks = get_tasks()
for task in tasks:
    print("{} has state {}".format(task['id'], task['state']))
Get a list of active tasks, optionally filtered by task name.
| parameter | description | type | default |
|---|---|---|---|
| task_id | task ID | str | |
| completed | include completed tasks? | bool | False |
# What tasks are running?
tasks = get_active_tasks()
for task in tasks:
    print("{} has state {}".format(task['id'], task['state']))
Check whether a task has completed.
| parameter | description | type | default |
|---|---|---|---|
| task_id | task ID | str |
# Wait for task 'driver-20160517222552-0072' to complete
while not task_completed('driver-20160517222552-0072'):
    print('Task not complete; sleeping...')
    time.sleep(5)
Wait for a task to be reported running by Mesos. Returns the elapsed time of the wait.
| parameter | description | type | default |
|---|---|---|---|
| service | framework service name | str | |
| task | task name | str | |
| timeout_sec | timeout | int | 120 |
wait_for_task('marathon', 'marathon-user')
Wait for a task to be reported as having a specific property. Returns the elapsed time of the wait.
| parameter | description | type | default |
|---|---|---|---|
| service | framework service name | str | |
| task | task name | str | |
| prop | property name | str | |
| timeout_sec | timeout | int | 120 |
wait_for_task_property('marathon', 'chronos', 'resources')
Wait for a task to be reported as having a property with a specific value. Returns the elapsed time of the wait.
| parameter | description | type | default |
|---|---|---|---|
| service | framework service name | str | |
| task | task name | str | |
| prop | property name | str | |
| value | value of property | str | |
| timeout_sec | timeout | int | 120 |
wait_for_task_property_value('marathon', 'marathon-user', 'state', 'TASK_RUNNING')
Wait for a task's DNS name. Returns the elapsed time of the wait.
| parameter | description | type | default |
|---|---|---|---|
| name | dns name | str | |
| timeout_sec | timeout | int | 120 |
wait_for_dns('marathon-user.marathon.mesos')
Delete a named ZooKeeper node.
| parameter | description | type | default |
|---|---|---|---|
| node_name | the name of the node | str |
# Delete a 'universe/marathon-user' ZooKeeper node
delete_zk_node('universe/marathon-user')
Get data for a ZooKeeper node.
| parameter | description | type | default |
|---|---|---|---|
| node_name | the name of the node | str |
# Get data for a 'universe/marathon-user' ZooKeeper node
get_zk_node_data('universe/marathon-user')
Get child nodes for a ZooKeeper node.
| parameter | description | type | default |
|---|---|---|---|
| node_name | the name of the node | str |
# Get children for a 'universe/marathon-user' ZooKeeper node
get_zk_node_children('universe/marathon-user')
Waits for a Marathon deployment to complete or times out.
| parameter | description | type | default |
|---|---|---|---|
| timeout | max time to wait for deployment | int | 120 |
| app_id | wait for deployments on this app | str | None |
# assuming a client.add_app() or similar
deployment_wait()
Deletes all apps running on Marathon.
None.
delete_all_apps()
Deletes all apps running on Marathon and waits for deployment to finish.
None.
delete_all_apps_wait()
Returns True if the given app is healthy.
| parameter | description | type | default |
|---|---|---|---|
| app_id | marathon app ID | str | |
is_app_healthy(app_id)
Returns the distutils.version.LooseVersion version of marathon.
None.
@pytest.mark.skipif('marathon_version() < LooseVersion("1.4")')
def test_requires_marathon_1_4():
    pass
Returns True if the marathon version is less than the version specified, otherwise returns False.
| parameter | description | type | default |
|---|---|---|---|
| version | version string, for example "1.4" | str | |
@pytest.mark.skipif('marathon_version_less_than("1.4")')
def test_requires_marathon_1_4():
    pass
Preconfigured annotation which requires marathon 1.3+.
None.
# skips the test if marathon is 1.2, otherwise runs it
@marathon_1_3
def test_requires_marathon_1_3():
    pass
Preconfigured annotation which requires marathon 1.4+.
None.
# skips the test if marathon is 1.3, otherwise runs it
@marathon_1_4
def test_requires_marathon_1_4():
    pass
Preconfigured annotation which requires marathon 1.5+.
None.
# skips the test if marathon is 1.4, otherwise runs it
@marathon_1_5
def test_requires_marathon_1_5():
    pass
Separates the master from the cluster by disabling inbound and/or outbound traffic.
| parameter | description | type | default |
|---|---|---|---|
| incoming | disable incoming traffic? | bool | True |
| outgoing | disable outgoing traffic? | bool | True |
# Disable incoming traffic ONLY to the DC/OS master.
partition_master(True, False)
Reconnect a previously partitioned master to the network.
None.
# Reconnect the master.
reconnect_master()
Retrieve a list of all agent node IP addresses.
None
# What do I look like in IP space?
nodes = get_agents()
print("Node IP addresses: " + str(nodes))
Retrieve a list of all private agent node IP addresses.
None
# What do I look like in IP space?
private_nodes = get_private_agents()
print("Private IP addresses: " + str(private_nodes))
Retrieve a list of all public agent node IP addresses.
None
# What do I look like in IP space?
public_nodes = get_public_agents()
print("Public IP addresses: " + str(public_nodes))
Separates the agent from the cluster by adjusting iptables with the following:
sudo iptables -F INPUT
sudo iptables -I INPUT -p tcp --dport 22 -j ACCEPT
sudo iptables -I INPUT -p icmp -j ACCEPT
sudo iptables -I OUTPUT -p tcp --sport 5051 -j REJECT
sudo iptables -A INPUT -j REJECT
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
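The rule sequence above could be assembled into a single shell command for remote execution. This is only a sketch: `PARTITION_COMMANDS` and `build_partition_command` are hypothetical names, and joining with `&&` is an assumption about how the commands might be chained.

```python
# The iptables commands listed above, in order.
PARTITION_COMMANDS = [
    'sudo iptables -F INPUT',
    'sudo iptables -I INPUT -p tcp --dport 22 -j ACCEPT',
    'sudo iptables -I INPUT -p icmp -j ACCEPT',
    'sudo iptables -I OUTPUT -p tcp --sport 5051 -j REJECT',
    'sudo iptables -A INPUT -j REJECT',
]

def build_partition_command(commands=PARTITION_COMMANDS):
    """Chain the commands so that a failure stops the sequence."""
    return ' && '.join(commands)

print(build_partition_command())
```

Note that SSH (port 22) and ICMP are accepted before the final REJECT rule, so the host stays reachable for a later reconnect. The resulting string could be passed to `run_command()`, described earlier.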
# Partition all the public nodes
public_nodes = get_public_agents()
for public_node in public_nodes:
    partition_agent(public_node)
Reconnects a previously partitioned agent by reversing the iptables changes.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# Reconnect the public agents
for public_node in public_nodes:
    reconnect_agent(public_node)
Restarts an agent process at the host.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# Restart the public agents
for public_node in public_nodes:
    restart_agent(public_node)
Stops an agent process at the host.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# Stop the public agents
for public_node in public_nodes:
    stop_agent(public_node)
Start an agent process at the host.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# Start the public agents
for public_node in public_nodes:
    start_agent(public_node)
Deletes the agent logs at the host.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# Delete agent logs on the public agents
for public_node in public_nodes:
    delete_agent_log(public_node)
Kill the process(es) matching the pattern at the given host. This will potentially kill infrastructure processes.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str | |
| pattern | a regular expression matching the name of the process to kill | str | |
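Because `pattern` is a regular expression, it may match more processes than intended. The sketch below shows the matching step alone, using a hypothetical list of process names rather than a real host; how Shakedown applies the pattern on the remote side is not shown in this document.

```python
import re

def matching_processes(pattern, process_names):
    """Return the process names that the regular expression matches."""
    regex = re.compile(pattern)
    return [name for name in process_names if regex.search(name)]

# Hypothetical process list, for illustration only.
processes = ['java', 'javac', 'mesos-agent', 'sshd']
print(matching_processes('java', processes))
print(matching_processes('^java$', processes))
```

Anchoring the pattern with `^` and `$` narrows the match to the exact name in this sketch.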
# kill java on the public agents
for public_node in public_nodes:
    kill_process_on_host(public_node, "java")
Managed context which will disconnect an agent for the duration of the context, then restore the agent.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# disconnects agent
with disconnected_agent(host):
    service_delay()
# agent is reconnected
wait_for_service_url(PACKAGE_APP_ID)
Returns True if the required number of private agents is NOT present, otherwise returns False. It is intended for deciding whether a test should be skipped.
| parameter | description | type | default |
|---|---|---|---|
| count | required number of agents | int |
# if the DC/OS cluster has fewer than 2 private agents the test will be skipped;
# it will run with 2 or more agents.
@pytest.mark.skipif('required_private_agents(2)')
def test_fancy_multi_agent_check():
    pass
Returns True if the required number of public agents is NOT present, otherwise returns False. It is intended for deciding whether a test should be skipped.
| parameter | description | type | default |
|---|---|---|---|
| count | required number of agents | int |
# if the DC/OS cluster has fewer than 2 public agents the test will be skipped;
# it will run with 2 or more agents.
@pytest.mark.skipif('required_public_agents(2)')
def test_fancy_multi_agent_check():
    pass
Annotation decorator factory. It requires the import of required_private_agents in order to function.
| parameter | description | type | default |
|---|---|---|---|
| count | required number of private agents | int | 1 |
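A decorator factory of this kind can be sketched in plain Python. The version below is an illustration only: it raises `unittest.SkipTest` instead of using a pytest marker, and `count_private_agents` is a hypothetical stub standing in for a real cluster query.

```python
import functools
import unittest

def count_private_agents():
    """Hypothetical stub; a real check would query the cluster."""
    return 1

def required_agents_missing(count):
    """Return True when fewer than `count` agents are present."""
    return count_private_agents() < count

def needs_private_agents(count=1):
    """Decorator factory: skip the test unless `count` agents exist."""
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            if required_agents_missing(count):
                raise unittest.SkipTest('requires %d private agents' % count)
            return test_func(*args, **kwargs)
        return wrapper
    return decorator

@needs_private_agents(1)
def test_runs():
    return 'ran'

print(test_runs())
```

The outer function captures `count`, and the inner decorator wraps the test, which is why the annotation is written with parentheses and an argument.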
# if the DC/OS cluster has fewer than 1 private agent the test will be skipped
@private_agents(1)
def test_fancy_multi_agent_check():
    pass
Annotation decorator factory. It requires the import of required_public_agents in order to function.
| parameter | description | type | default |
|---|---|---|---|
| count | required number of public agents | int | 1 |
# if the DC/OS cluster has fewer than 1 public agent the test will be skipped
@public_agents(1)
def test_fancy_multi_agent_check():
    pass
Managed context which will disconnect the master for the duration of the context, then restore the master.
None
# disconnects master
with disconnected_master():
    service_delay()
# master is reconnected
wait_for_service_url(PACKAGE_APP_ID)
Checks whether the Mesos URL returns HTTP 200 within a timeout. Returns True if it becomes available, or False when the timeout expires.
None
# restart the master node
restart_master_node()
# wait until the master is reachable again
wait_for_mesos_endpoint()
Provides a list of all masters in the cluster.
None
for master in get_all_masters():
    pass  # do master-like things
Provides a list of all the IP addresses for the masters.
None
for ip in get_all_master_ips():
    print(ip)
Managed context which will save the firewall rules, then restore them at the end of the context for the host.
It calls save_iptables before the context and restore_iptables at the end of the context.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
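The save-then-restore behaviour described above follows a common context manager pattern. Here is a minimal sketch using `contextlib.contextmanager`; `save_rules` and `restore_rules` are hypothetical stand-ins for `save_iptables` and `restore_iptables`, and the `try`/`finally` is a design choice of this sketch rather than a statement about Shakedown's implementation.

```python
from contextlib import contextmanager

events = []  # records calls so that the order is visible

def save_rules(hostname):
    """Hypothetical stand-in for save_iptables()."""
    events.append('save ' + hostname)

def restore_rules(hostname):
    """Hypothetical stand-in for restore_iptables()."""
    events.append('restore ' + hostname)

@contextmanager
def saved_rules(hostname):
    """Save rules on entry and restore them on exit, even on error."""
    save_rules(hostname)
    try:
        yield
    finally:
        restore_rules(hostname)

with saved_rules('10.0.0.1'):
    events.append('modify rules')

print(events)
```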
# firewall rules are saved on entry
with iptable_rules(shakedown.master_ip()):
    block_port(host, port)
    time.sleep(7)
# firewall rules are restored
wait_for_service_url(PACKAGE_APP_ID)
Restores saved iptables rules. It works with save_iptables.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# restore the previously saved rules
restore_iptables(host)
Saves the current iptables to a file on the host.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# save the current rules
save_iptables(host)
Flushes the iptables rules for the host using sudo iptables -F INPUT. Consider using save_iptables prior to use.
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# flush the INPUT rules
flush_all_rules(host)
Removes iptables rules to allow full access. Consider using save_iptables prior to use.
sudo iptables --policy INPUT ACCEPT && sudo iptables --policy OUTPUT ACCEPT && sudo iptables --policy FORWARD ACCEPT
| parameter | description | type | default |
|---|---|---|---|
| hostname | the hostname or IP of the node | str |
# allow all traffic to and from the host
allow_all_traffic(host)