CASH is a utility for administrators of large computer clusters to quickly run shell commands on all or a subset of the cluster nodes. CASH arranges the nodes in a cascading, tree-like topology and is therefore much faster than tools that simply iterate over the nodes or try to access many nodes in parallel.
CASH is intended to be run from the administrator's machine, but may also be run from one of the cluster nodes. In the first case, all communication between the cluster and the admin machine is channelled through a gateway host.
Please see below for the execution/communication model.
CASH has the following requirements:
- python > 3.6 on each node
- password-less SSH access to and between all nodes
Install CASH on the admin machine with pip install cascading-shell, then use the cash command line tool from there. Next, configure your cluster(s). Nodes and node groups are configured in ~/.cash.topo.json like this:
{
  "nodes": {
    "group1": "clus1node001,clus1node002,clus1node003",
    "all": {
      "site1": {
        "cluster1": {
          "rack1": "clus1node[001-020]",
          "rack2": "clus1node[021-040]",
          "rack3": "clus1node[041-060]"
        },
        "cluster2": {
          "rack1": "clus2node[001-020]",
          "rack2": "clus2node[021-040]",
          "rack3": "clus2node[041-060]"
        }
      },
      "site2": {
        "cluster3": {
          "rack1": "clus3node[001-020]",
          "rack2": "clus3node[021-040]",
          "rack3": "clus3node[041-060]"
        },
        "cluster4": {
          "rack1": "clus4node[001-020]",
          "rack2": "clus4node[021-040]",
          "rack3": "clus4node[041-060]"
        },
        "cluster5": "clus5node[001-020]"
      }
    }
  }
}
The config file has the following rules:
- Right now, everything lives under the "nodes" object.
- The file format is standard JSON, where each key is a group name and each value is a comma-separated list of nodes.
- Nodes with sequential numbers can be shortened using square brackets, e.g., node[001-003] resolves to node001,node002,node003. Be careful with leading zeros here! You may also use a comma inside the brackets, such as: node[001-003,005] -> node001,node002,node003,node005. You can also use multiple bracket instances: clus[1-3]node[001-003] -> clus1node001,clus1node002,clus1node003,clus2node001,clus2node002,clus2node003,clus3node001,clus3node002,clus3node003 and so on.
- Groups can be nested. The topology of the node tree is specified in the mandatory "all" group. It is wise to reflect network latency/bandwidth in the tree; for instance, as in the above example, you may divide your HPC into groups of site, cluster, rack if applicable.
- Aside from "all", you can specify as many groups as you wish and nest them to your liking.
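The bracket rules above can be illustrated with a short Python sketch. This is not CASH's own parser; the function name expand_pattern is hypothetical, and the sketch assumes that ranges are written as start-end with the zero padding taken from the start value.

```python
import itertools
import re


def expand_pattern(pattern):
    """Expand one pattern such as 'clus[1-3]node[001-003,005]' into host names.

    Hypothetical helper for illustration only, not part of CASH.
    """
    # re.split with a capturing group alternates literal text and bracket
    # bodies, e.g. ['clus', '1-3', 'node', '001-003,005', ''].
    parts = re.split(r"\[([^\]]*)\]", pattern)
    choices = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            choices.append([part])  # literal text between brackets
            continue
        values = []
        for item in part.split(","):
            if "-" in item:
                start, end = item.split("-")
                width = len(start)  # assumption: padding follows the start value
                values.extend(
                    str(number).zfill(width)
                    for number in range(int(start), int(end) + 1)
                )
            else:
                values.append(item)
        choices.append(values)
    # The leftmost bracket varies slowest, matching the order shown above.
    return ["".join(combo) for combo in itertools.product(*choices)]


print(expand_pattern("node[001-003,005]"))
```

The leading-zero warning becomes visible here: node[1-3] and node[001-003] expand to different host names, because the width of the start value decides the padding.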
CASH communicates with each node in a cascading fashion, where CASH itself on each node acts as a proxy for its immediate children and forwards all messages from the children to its parent and vice versa. Let's try to understand this with an example. Imagine the following topology configuration:
{
  "nodes": {
    "all": {
      "site1": {
        "cluster1": {
          "rack1": "clus1node[1-3]",
          "rack2": "clus1node[4-6]"
        },
        "cluster2": {
          "rack1": "clus2node[1-3]",
          "rack2": "clus2node[4-6]"
        }
      },
      "site2": {
        "cluster3": {
          "rack1": "clus3node[1-3]",
          "rack2": "clus3node[4-6]"
        },
        "cluster4": {
          "rack1": "clus4node[1-3]",
          "rack2": "clus4node[4-6]"
        }
      }
    }
  }
}
We have a total of four clusters at two geographical sites; each cluster has two racks with three nodes each. We now
want to execute a command on all nodes using CASH. First, CASH spawns an instance of itself on the gateway host, which
can be specified via the DEFAULT_JUMP_HOST variable or via the command line parameter --jumphost. From the gateway,
a connection to the first host of site1 and the first host of site2 is established, i.e., clus1node1 and
clus3node1. From each of those two nodes, CASH hops to the first unused node of each cluster (e.g., clus1node2 for
cluster1, as clus1node1 was already used, and clus2node1 for cluster2), from there to the first
node of each rack, and then to the remaining nodes.
For example, clus4node5 is reached in the following way:
ADMIN_MACHINE -> gateway -> clus3node1 (site) -> clus4node1 (cluster) -> clus4node4 (rack) -> clus4node5 (node). This
tiered or cascading execution model of course makes sense only for a larger number of nodes than in this example. You
can tell CASH to use a flat instead of cascading connection model with the --flatten parameter.
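The hop sequence described above can be reproduced with a small Python sketch. It is an interpretation of the prose, not CASH's actual code: it assumes that each group is headed by its first node that has not yet been used higher up in the tree, and all function names (flatten, build_subtree, build_tree, path_to, rack) are hypothetical.

```python
def flatten(group):
    """Return all nodes below a group, in configuration order."""
    if isinstance(group, list):
        return list(group)
    return [node for child in group.values() for node in flatten(child)]


def build_subtree(group, used):
    """Head the group with its first unused node, then recurse (assumed rule)."""
    free = [node for node in flatten(group) if node not in used]
    if not free:
        return {}
    head = free[0]
    used.add(head)
    children = {}
    if isinstance(group, list):
        # Leaf group (a rack): the remaining nodes hang directly off the head.
        for node in free[1:]:
            used.add(node)
            children[node] = {}
    else:
        for child in group.values():
            children.update(build_subtree(child, used))
    return {head: children}


def build_tree(topology, root="gateway"):
    """Attach the head of every top-level group (site) to the gateway."""
    used = set()
    children = {}
    for group in topology.values():
        children.update(build_subtree(group, used))
    return {root: children}


def path_to(tree, target, prefix=()):
    """Return the list of hops from the root to target, or None."""
    for node, children in tree.items():
        path = prefix + (node,)
        if node == target:
            return list(path)
        found = path_to(children, target, path)
        if found:
            return found
    return None


def rack(cluster, first, last):
    """Host names of one rack, already expanded from the bracket syntax."""
    return [f"clus{cluster}node{i}" for i in range(first, last + 1)]


# The contents of the "all" group from the example configuration.
TOPOLOGY = {
    "site1": {
        "cluster1": {"rack1": rack(1, 1, 3), "rack2": rack(1, 4, 6)},
        "cluster2": {"rack1": rack(2, 1, 3), "rack2": rack(2, 4, 6)},
    },
    "site2": {
        "cluster3": {"rack1": rack(3, 1, 3), "rack2": rack(3, 4, 6)},
        "cluster4": {"rack1": rack(4, 1, 3), "rack2": rack(4, 4, 6)},
    },
}

tree = build_tree(TOPOLOGY)
print(" -> ".join(path_to(tree, "clus4node5")))
# prints: gateway -> clus3node1 -> clus4node1 -> clus4node4 -> clus4node5
```

Under this assumed rule, a rack whose first node already serves as a cluster or site head is headed by its next free node, so no host appears twice in the tree.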
The number of parallel connections on each node is limited by the --fansize parameter (env DEFAULT_FANSIZE = 50).
When more than FANSIZE nodes are direct children of one node, they are split into groups of FANSIZE and an additional
layer is formed.
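A minimal Python sketch of this grouping step follows. How CASH picks the nodes of the additional layer is not stated here, so the sketch assumes that the first node of each group of FANSIZE becomes the parent of the rest of its group; the name group_by_fansize is hypothetical.

```python
def group_by_fansize(children, fansize=50):
    """Map each direct child to the nodes placed below it.

    Hypothetical helper: if there are more than `fansize` children, they are
    chunked and the first node of every chunk is assumed to head that chunk.
    """
    if fansize < 1:
        raise ValueError("fansize must be at least 1")
    if len(children) <= fansize:
        # Few enough children: no additional layer is needed.
        return {child: [] for child in children}
    chunks = [children[i:i + fansize] for i in range(0, len(children), fansize)]
    return {chunk[0]: chunk[1:] for chunk in chunks}


nodes = [f"node{i}" for i in range(1, 8)]
print(group_by_fansize(nodes, fansize=3))
# prints: {'node1': ['node2', 'node3'], 'node4': ['node5', 'node6'], 'node7': []}
```

The sketch handles a single extra layer only; if the number of groups itself exceeded FANSIZE, the same step would have to be applied again to the group heads.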
Every node that is part of the tree receives and forwards messages from/to its parent and its children, and also executes the desired shell command locally.
Here is a copy of cash --help:
usage: cash [-h] [-n NODES] [--jumphost JUMPHOST] [--ssh-timeout SSH_TIMEOUT]
            [-s FANSIZE] [--flatten] [-p] [--json | --shell | --quiet]
            {run,plan} ...

positional arguments:
  {run,plan}            Please use one of the following sub commands
    run                 Run command
    plan                Print tree as json to stdout (view with, e.g.,
                        firefox)

optional arguments:
  -h, --help            show this help message and exit
  -n NODES, --nodes NODES
                        Node or node groups.
  --jumphost JUMPHOST   Gateway host to cluster.
  --ssh-timeout SSH_TIMEOUT
                        Define a timeout for SSH sessions. 0 = no timeout
  -s FANSIZE, --fansize FANSIZE
                        Maximum number of parallel SSH sessions.
  --flatten             Disable tree mode.
  -p, --progress        Show progress of received answers
  --json                JSON output format
  --shell               Shell friendly output format
  --quiet               No output

- Node groups can be specified with @group_name in the --nodes parameter.
- You can exclude hosts by using -n "@group,-node01".
- You can use the square bracket syntax here, too: -n "node[1-9]".
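The selection rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not CASH's parser: the function name resolve_selection is hypothetical, groups are passed in as already expanded host lists, exclusions are assumed to apply regardless of their position in the string, and the square bracket syntax is left out for brevity.

```python
def resolve_selection(selection, groups):
    """Resolve a --nodes style string into a list of host names.

    Hypothetical helper. `groups` maps a group name to its expanded hosts.
    """
    selected = []
    excluded = set()
    for token in selection.split(","):
        token = token.strip()
        if not token:
            continue
        if token.startswith("-"):
            excluded.add(token[1:])             # "-node01" removes a host
        elif token.startswith("@"):
            selected.extend(groups[token[1:]])  # "@group" adds a whole group
        else:
            selected.append(token)              # plain host name
    result = []
    for node in selected:
        # Keep the first occurrence of each host and drop excluded ones.
        if node not in excluded and node not in result:
            result.append(node)
    return result


groups = {"group": ["node01", "node02", "node03"]}
print(resolve_selection("@group,-node01", groups))
# prints: ['node02', 'node03']
```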
You can set the defaults of the CLI parameters via the following environment variables (shown with their built-in values):
DEFAULT_SSH_TIMEOUT = 30
DEFAULT_FANSIZE = 50
DEFAULT_NODES_STRING = "@all"
DEFAULT_OUT_FORMAT = "text"
DEFAULT_JUMP_HOST = "jumphost"
DEFAULT_FLATTEN = False
DEFAULT_RUN_SHELL = True
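As a rough illustration of how such defaults can be read, here is a Python sketch that falls back to the built-in value when a variable is unset. The helper env_default and its parsing rules (for example, which strings count as true) are assumptions and may differ from what CASH does internally.

```python
import os


def env_default(name, fallback, environ=os.environ):
    """Return the value of `name` converted to the type of `fallback`.

    Hypothetical helper; the accepted boolean spellings are an assumption.
    """
    raw = environ.get(name)
    if raw is None:
        return fallback
    if isinstance(fallback, bool):
        # Check bool before int, because bool is a subclass of int.
        return raw.strip().lower() in ("1", "true", "yes")
    return type(fallback)(raw)


ssh_timeout = env_default("DEFAULT_SSH_TIMEOUT", 30)
fansize = env_default("DEFAULT_FANSIZE", 50)
jump_host = env_default("DEFAULT_JUMP_HOST", "jumphost")
flatten_tree = env_default("DEFAULT_FLATTEN", False)
```

With this pattern, a call such as env_default("DEFAULT_FANSIZE", 50) yields 50 unless the variable is exported in the shell, in which case the exported value is converted to an integer.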