relative_major_compact.py: compact up to X GB.
vnodes_token_generator.py: generate evenly distributed initial tokens for a vnodes Cassandra cluster.
Force a compaction to run and compact up to X GB of data.
Useful with SizeTieredCompactionStrategy when you don't have enough free
space to run a major compaction but still want to compact as much data as possible.
The --dry-run option lets you see which SSTables would be compacted.
The name is wordplay on the musical concept of relative major.
jmxterm must be installed where relative_major_compact.py is run.
Download:
wget http://downloads.sourceforge.net/cyclops-group/jmxterm-1.0-alpha-4-uber.jar
$ ./relative_major_compact.py -h
usage: relative_major_compact.py [-h] [--verbose] [--dry-run] [--java JAVA]
[--jmxterm JMXTERM] [--host HOST:PORT]
table_path target_size
positional arguments:
table_path Path to sstables to compact: e.g
/var/lib/cassandra/data/ks/table/
target_size Size in bytes of the sum of sstables to compact. M and G
could be used: e.g. 1073741824 or 1G
optional arguments:
-h, --help show this help message and exit
--verbose, -v Verbose output. Print each sstable name that will
participate in the compaction.
--dry-run, -d Simulation. Useful with --verbose.
--java JAVA Path to Java. By default the java command is assumed to
be on the path.
--jmxterm JMXTERM Path to JmxTerm. By default looks for jmxterm.jar in the
current directory.
--host HOST:PORT JMX IP and port. Default: 127.0.0.1:7199
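The interesting part is how the SSTables are picked. Below is a minimal sketch of that selection step; the smallest-Data.db-first ordering is my assumption (it maximizes how many SSTables fit in the size budget, which matches the script's stated goal). The actual compaction is then triggered over JMX through jmxterm, which --dry-run skips.

    # Sketch of the selection step only (assumed ordering: smallest files first).
    # The real script then submits the selected files to Cassandra over JMX.
    import glob
    import os

    def select_sstables(table_path, target_size):
        """Return Data.db files whose cumulative size stays within target_size."""
        data_files = sorted(glob.glob(os.path.join(table_path, '*-Data.db')),
                            key=os.path.getsize)          # smallest first
        selected, total = [], 0
        for path in data_files:
            size = os.path.getsize(path)
            if total + size > target_size:
                break
            selected.append(path)
            total += size
        return selected, total

    if __name__ == '__main__':
        files, total = select_sstables('/var/lib/cassandra/data/ks/table/', 1 << 30)
        print('%d SSTables selected, %d bytes in total' % (len(files), total))
        for path in files:
            print(path)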
Vnodes Murmur3 tokens generator: generate evenly distributed initial tokens for a vnodes Cassandra cluster.
initial_token can be set in cassandra.yaml with vnodes by using comma-separated values.
Note: be aware of the consequences of using a low number of vnodes. On the upside, some operations such as repairs will be faster. Also keep in mind that vnodes can't be moved the way a single token can; think about it if you plan to scale out.
$ ./vnodes_token_generator.py -h
usage: vnodes_token_generator.py [-h] [--offset OFFSET] [-i INDENT]
[-j | -y | -t] [-n NUM | -s SERVERS]
vnodes
positional arguments:
vnodes Number of vnodes per server
optional arguments:
-h, --help show this help message and exit
--offset OFFSET Value to add to each token, to avoid potential
conflicts (e.g. 1). Default is 0 (no offset).
-i INDENT, --indent INDENT
JSON indentation spaces (e.g. 4)
-j, --json JSON output
-y, --yaml initial_token for cassandra.yaml
-t, --text Space separated values. First column: IP address, then
one column per token
-n NUM, --num NUM Number of Cassandra servers
-s SERVERS, --servers SERVERS
Cassandra servers file. One IP/hostname per line.
First, create a file which contains one host per line:
$ cat hosts
192.168.1.1
192.168.2.1
192.168.3.1
Then run vnodes_token_generator.py with the number of vnodes you want.
Here, just 4 to keep the example readable.
$ ./vnodes_token_generator.py --servers hosts 4
192.168.2.1 -7686143364045646507 -3074457345618258604 1537228672809129299 6148914691236517202
192.168.3.1 -6148914691236517206 -1537228672809129303 3074457345618258600 7686143364045646503
192.168.1.1 -9223372036854775808 -4611686018427387905 -2 4611686018427387901
$ ./vnodes_token_generator.py --json --indent 2 --servers hosts 4
{
"192.168.1.1": "-9223372036854775808,-4611686018427387905,-2,4611686018427387901",
"192.168.2.1": "-7686143364045646507,-3074457345618258604,1537228672809129299,6148914691236517202",
"192.168.3.1": "-6148914691236517206,-1537228672809129303,3074457345618258600,7686143364045646503"
}
$ ./vnodes_token_generator.py --yaml --servers hosts 4
192.168.2.1 initial_token: -7686143364045646507,-3074457345618258604,1537228672809129299,6148914691236517202
192.168.3.1 initial_token: -6148914691236517206,-1537228672809129303,3074457345618258600,7686143364045646503
192.168.1.1 initial_token: -9223372036854775808,-4611686018427387905,-2,4611686018427387901
Alternatively, you can specify only the number of servers:
$ ./vnodes_token_generator.py -n 3 4
0 -9223372036854775808 -4611686018427387905 -2 4611686018427387901
1 -7686143364045646507 -3074457345618258604 1537228672809129299 6148914691236517202
2 -6148914691236517206 -1537228672809129303 3074457345618258600 7686143364045646503
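The math behind these tokens is straightforward: spread servers * vnodes tokens evenly over the Murmur3 range [-2**63, 2**63) and deal them to the servers round-robin, in the order they appear in the input. Here is a short sketch that reproduces the sample output above; I am assuming it mirrors what vnodes_token_generator.py does internally:

    # Evenly spaced Murmur3 tokens, dealt round-robin to the servers.
    def generate_tokens(servers, vnodes, offset=0):
        total = len(servers) * vnodes
        step = 2 ** 64 // total                 # gap between two consecutive tokens
        tokens = {ip: [] for ip in servers}
        for k in range(total):
            tokens[servers[k % len(servers)]].append(-2 ** 63 + k * step + offset)
        return tokens

    if __name__ == '__main__':
        hosts = ['192.168.1.1', '192.168.2.1', '192.168.3.1']
        for ip, toks in sorted(generate_tokens(hosts, 4).items()):
            print(ip, ' '.join(str(t) for t in toks))

Because tokens are dealt in input order, interleaving the racks in the hosts file (as done in the next example) is what spreads each rack evenly around the ring.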
Here is a common use case:
- Replication strategy: NetworkTopologyStrategy
- Replication factor: 3
- 3 Cassandra racks (i.e. the same number as the replication factor)
We will use 18 nodes -- 6 nodes per rack -- in this example.
You want each rack to own 100% of the data, but you also want replicas to be evenly distributed within each rack.
To do so, interleave the racks in a text file before running vnodes_token_generator.py.
Let's say you have 3 files, each one contains IPs of one rack:
$ cat hosts_rackA.txt
192.168.1.1
192.168.1.2
192.168.1.3
192.168.1.4
192.168.1.5
192.168.1.6
$ cat hosts_rackB.txt
192.168.2.1
192.168.2.2
192.168.2.3
192.168.2.4
192.168.2.5
192.168.2.6
$ cat hosts_rackC.txt
192.168.3.1
192.168.3.2
192.168.3.3
192.168.3.4
192.168.3.5
192.168.3.6
Create a new file where racks are interleaved:
paste -d "\n" hosts_rackA.txt hosts_rackB.txt hosts_rackC.txt > hosts_interleaved_racks.txt
Or for the lazy:
paste -d "\n" hosts_rack* > hosts_interleaved_racks.txt
Look at how each rack appears in the file; rack1 is highlighted:
$ cat hosts_interleaved_racks.txt
192.168.1.1 # <--- rack1
192.168.2.1
192.168.3.1
192.168.1.2 # <--- rack1
192.168.2.2
192.168.3.2
192.168.1.3 # <--- rack1
192.168.2.3
192.168.3.3
192.168.1.4 # <--- rack1
192.168.2.4
192.168.3.4
192.168.1.5 # <--- rack1
192.168.2.5
192.168.3.5
192.168.1.6 # <--- rack1
192.168.2.6
192.168.3.6
(You will find these files in the /res directory of this repo if you want to play with vnodes_token_generator.py.)
Now let's say the Cassandra tokens are stored in a Chef data bag:
we can call vnodes_token_generator.py with --json and copy/paste the output into the data bag. Here, 8 tokens per C* node:
$ ./vnodes_token_generator.py --json --indent 4 --servers res/hosts_interleaved_racks.txt 8
{
"192.168.1.1": "-9223372036854775808,-6917529027641081858,-4611686018427387908,-2305843009213693958,-8,2305843009213693942,4611686018427387892,6917529027641081842",
"192.168.1.2": "-8839064868652493483,-6533221859438799533,-4227378850225105583,-1921535841011411633,384307168202282317,2690150177415976267,4995993186629670217,7301836195843364167",
"192.168.1.3": "-8454757700450211158,-6148914691236517208,-3843071682022823258,-1537228672809129308,768614336404564642,3074457345618258592,5380300354831952542,7686143364045646492",
"192.168.1.4": "-8070450532247928833,-5764607523034234883,-3458764513820540933,-1152921504606846983,1152921504606846967,3458764513820540917,5764607523034234867,8070450532247928817",
"192.168.1.5": "-7686143364045646508,-5380300354831952558,-3074457345618258608,-768614336404564658,1537228672809129292,3843071682022823242,6148914691236517192,8454757700450211142",
"192.168.1.6": "-7301836195843364183,-4995993186629670233,-2690150177415976283,-384307168202282333,1921535841011411617,4227378850225105567,6533221859438799517,8839064868652493467",
"192.168.2.1": "-9095269647454015033,-6789426638240321083,-4483583629026627133,-2177740619812933183,128102389400760767,2433945398614454717,4739788407828148667,7045631417041842617",
"192.168.2.2": "-8710962479251732708,-6405119470038038758,-4099276460824344808,-1793433451610650858,512409557603043092,2818252566816737042,5124095576030430992,7429938585244124942",
"192.168.2.3": "-8326655311049450383,-6020812301835756433,-3714969292622062483,-1409126283408368533,896716725805325417,3202559735019019367,5508402744232713317,7814245753446407267",
"192.168.2.4": "-7942348142847168058,-5636505133633474108,-3330662124419780158,-1024819115206086208,1281023894007607742,3586866903221301692,5892709912434995642,8198552921648689592",
"192.168.2.5": "-7558040974644885733,-5252197965431191783,-2946354956217497833,-640511947003803883,1665331062209890067,3971174071423584017,6277017080637277967,8582860089850971917",
"192.168.2.6": "-7173733806442603408,-4867890797228909458,-2562047788015215508,-256204778801521558,2049638230412172392,4355481239625866342,6661324248839560292,8967167258053254242",
"192.168.3.1": "-8967167258053254258,-6661324248839560308,-4355481239625866358,-2049638230412172408,256204778801521542,2562047788015215492,4867890797228909442,7173733806442603392",
"192.168.3.2": "-8582860089850971933,-6277017080637277983,-3971174071423584033,-1665331062209890083,640511947003803867,2946354956217497817,5252197965431191767,7558040974644885717",
"192.168.3.3": "-8198552921648689608,-5892709912434995658,-3586866903221301708,-1281023894007607758,1024819115206086192,3330662124419780142,5636505133633474092,7942348142847168042",
"192.168.3.4": "-7814245753446407283,-5508402744232713333,-3202559735019019383,-896716725805325433,1409126283408368517,3714969292622062467,6020812301835756417,8326655311049450367",
"192.168.3.5": "-7429938585244124958,-5124095576030431008,-2818252566816737058,-512409557603043108,1793433451610650842,4099276460824344792,6405119470038038742,8710962479251732692",
"192.168.3.6": "-7045631417041842633,-4739788407828148683,-2433945398614454733,-128102389400760783,2177740619812933167,4483583629026627117,6789426638240321067,9095269647454015017"
}
If you want to check the rack alternation, sort the output by the first token column (i.e. column #2 in the text output), then look at the first column, i.e. the racks:
$ ./vnodes_token_generator.py --servers res/hosts_interleaved_racks.txt 8 | sort -nk2
192.168.1.1 -9223372036854775808 -6917529027641081858 -4611686018427387908 -2305843009213693958 -8 2305843009213693942 4611686018427387892 6917529027641081842
192.168.2.1 -9095269647454015033 -6789426638240321083 -4483583629026627133 -2177740619812933183 128102389400760767 2433945398614454717 4739788407828148667 7045631417041842617
192.168.3.1 -8967167258053254258 -6661324248839560308 -4355481239625866358 -2049638230412172408 256204778801521542 2562047788015215492 4867890797228909442 7173733806442603392
192.168.1.2 -8839064868652493483 -6533221859438799533 -4227378850225105583 -1921535841011411633 384307168202282317 2690150177415976267 4995993186629670217 7301836195843364167
192.168.2.2 -8710962479251732708 -6405119470038038758 -4099276460824344808 -1793433451610650858 512409557603043092 2818252566816737042 5124095576030430992 7429938585244124942
192.168.3.2 -8582860089850971933 -6277017080637277983 -3971174071423584033 -1665331062209890083 640511947003803867 2946354956217497817 5252197965431191767 7558040974644885717
192.168.1.3 -8454757700450211158 -6148914691236517208 -3843071682022823258 -1537228672809129308 768614336404564642 3074457345618258592 5380300354831952542 7686143364045646492
192.168.2.3 -8326655311049450383 -6020812301835756433 -3714969292622062483 -1409126283408368533 896716725805325417 3202559735019019367 5508402744232713317 7814245753446407267
192.168.3.3 -8198552921648689608 -5892709912434995658 -3586866903221301708 -1281023894007607758 1024819115206086192 3330662124419780142 5636505133633474092 7942348142847168042
192.168.1.4 -8070450532247928833 -5764607523034234883 -3458764513820540933 -1152921504606846983 1152921504606846967 3458764513820540917 5764607523034234867 8070450532247928817
192.168.2.4 -7942348142847168058 -5636505133633474108 -3330662124419780158 -1024819115206086208 1281023894007607742 3586866903221301692 5892709912434995642 8198552921648689592
192.168.3.4 -7814245753446407283 -5508402744232713333 -3202559735019019383 -896716725805325433 1409126283408368517 3714969292622062467 6020812301835756417 8326655311049450367
192.168.1.5 -7686143364045646508 -5380300354831952558 -3074457345618258608 -768614336404564658 1537228672809129292 3843071682022823242 6148914691236517192 8454757700450211142
192.168.2.5 -7558040974644885733 -5252197965431191783 -2946354956217497833 -640511947003803883 1665331062209890067 3971174071423584017 6277017080637277967 8582860089850971917
192.168.3.5 -7429938585244124958 -5124095576030431008 -2818252566816737058 -512409557603043108 1793433451610650842 4099276460824344792 6405119470038038742 8710962479251732692
192.168.1.6 -7301836195843364183 -4995993186629670233 -2690150177415976283 -384307168202282333 1921535841011411617 4227378850225105567 6533221859438799517 8839064868652493467
192.168.2.6 -7173733806442603408 -4867890797228909458 -2562047788015215508 -256204778801521558 2049638230412172392 4355481239625866342 6661324248839560292 8967167258053254242
192.168.3.6 -7045631417041842633 -4739788407828148683 -2433945398614454733 -128102389400760783 2177740619812933167 4483583629026627117 6789426638240321067 9095269647454015017
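The same check can be automated: the sketch below re-runs the generator, sorts by first token, and asserts that two consecutive ranges never land in the same rack. It is a hypothetical helper and assumes, as in this example, that the third octet of each IP identifies the rack:

    # Verify rack alternation on the ring (by first token per server, as above).
    import subprocess

    cmd = './vnodes_token_generator.py --servers res/hosts_interleaved_racks.txt 8'
    lines = subprocess.check_output(cmd.split()).decode().splitlines()
    rows = sorted((line.split() for line in lines), key=lambda r: int(r[1]))

    racks = [row[0].split('.')[2] for row in rows]      # rack id taken from the IP
    assert all(racks[i] != racks[(i + 1) % len(racks)] for i in range(len(racks))), \
        'two consecutive token ranges fall in the same rack'
    print('rack order around the ring:', ' '.join(racks))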
Cassandra does not allow one node to have the same token as another, so you may need to generate an offset token map.
The --offset option does exactly that. Here is the same output as above, offset by 1:
$ ./vnodes_token_generator.py --json --indent 4 --servers res/hosts_interleaved_racks.txt --offset 1 8
{
"192.168.1.1": "-9223372036854775807,-6917529027641081857,-4611686018427387907,-2305843009213693957,-7,2305843009213693943,4611686018427387893,6917529027641081843",
"192.168.1.2": "-8839064868652493482,-6533221859438799532,-4227378850225105582,-1921535841011411632,384307168202282318,2690150177415976268,4995993186629670218,7301836195843364168",
"192.168.1.3": "-8454757700450211157,-6148914691236517207,-3843071682022823257,-1537228672809129307,768614336404564643,3074457345618258593,5380300354831952543,7686143364045646493",
"192.168.1.4": "-8070450532247928832,-5764607523034234882,-3458764513820540932,-1152921504606846982,1152921504606846968,3458764513820540918,5764607523034234868,8070450532247928818",
"192.168.1.5": "-7686143364045646507,-5380300354831952557,-3074457345618258607,-768614336404564657,1537228672809129293,3843071682022823243,6148914691236517193,8454757700450211143",
"192.168.1.6": "-7301836195843364182,-4995993186629670232,-2690150177415976282,-384307168202282332,1921535841011411618,4227378850225105568,6533221859438799518,8839064868652493468",
"192.168.2.1": "-9095269647454015032,-6789426638240321082,-4483583629026627132,-2177740619812933182,128102389400760768,2433945398614454718,4739788407828148668,7045631417041842618",
"192.168.2.2": "-8710962479251732707,-6405119470038038757,-4099276460824344807,-1793433451610650857,512409557603043093,2818252566816737043,5124095576030430993,7429938585244124943",
"192.168.2.3": "-8326655311049450382,-6020812301835756432,-3714969292622062482,-1409126283408368532,896716725805325418,3202559735019019368,5508402744232713318,7814245753446407268",
"192.168.2.4": "-7942348142847168057,-5636505133633474107,-3330662124419780157,-1024819115206086207,1281023894007607743,3586866903221301693,5892709912434995643,8198552921648689593",
"192.168.2.5": "-7558040974644885732,-5252197965431191782,-2946354956217497832,-640511947003803882,1665331062209890068,3971174071423584018,6277017080637277968,8582860089850971918",
"192.168.2.6": "-7173733806442603407,-4867890797228909457,-2562047788015215507,-256204778801521557,2049638230412172393,4355481239625866343,6661324248839560293,8967167258053254243",
"192.168.3.1": "-8967167258053254257,-6661324248839560307,-4355481239625866357,-2049638230412172407,256204778801521543,2562047788015215493,4867890797228909443,7173733806442603393",
"192.168.3.2": "-8582860089850971932,-6277017080637277982,-3971174071423584032,-1665331062209890082,640511947003803868,2946354956217497818,5252197965431191768,7558040974644885718",
"192.168.3.3": "-8198552921648689607,-5892709912434995657,-3586866903221301707,-1281023894007607757,1024819115206086193,3330662124419780143,5636505133633474093,7942348142847168043",
"192.168.3.4": "-7814245753446407282,-5508402744232713332,-3202559735019019382,-896716725805325432,1409126283408368518,3714969292622062468,6020812301835756418,8326655311049450368",
"192.168.3.5": "-7429938585244124957,-5124095576030431007,-2818252566816737057,-512409557603043107,1793433451610650843,4099276460824344793,6405119470038038743,8710962479251732693",
"192.168.3.6": "-7045631417041842632,-4739788407828148682,-2433945398614454732,-128102389400760782,2177740619812933168,4483583629026627118,6789426638240321068,9095269647454015018"
}
Create a keyspace with the replication strategy you want to use and set the replication factor for a given DC. For the "real world example" above, that means:
cqlsh -e "CREATE KEYSPACE nts_rf3 WITH replication = {'class': 'NetworkTopologyStrategy', 'test-eu': 3}"
Be sure to set a replication factor greater than 1: with RF=1 you would only check the partitioner ranges, not the replica ownership.
Then run nodetool status on the newly created keyspace and pay attention to the Owns (effective) column; the value should be the same on each line:
$ nodetool status nts_rf3
Datacenter: test-eu
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.3.64 61,6 KB 8 16,7% 0ba59b8e-34fc-4774-a52b-2c5ede4296f2 rack3
UN 192.168.2.99 72,84 KB 8 16,7% 57c0a12c-dba7-440d-b984-980c52578970 rack2
UN 192.168.1.69 51,49 KB 8 16,7% 4ad834ec-4e0c-48f9-a309-19a7c9aeccb2 rack1
UN 192.168.3.133 89,66 KB 8 16,7% 7c858432-62a7-4874-98c3-84b46299beb2 rack3
UN 192.168.2.230 89,89 KB 8 16,7% 5262cd0d-4c30-4c53-b3d4-4b4f8243745b rack2
UN 192.168.3.199 83,54 KB 8 16,7% d140a761-d211-4526-97fe-3145634dacf6 rack3
UN 192.168.2.200 66,72 KB 8 16,7% b81be237-f1d8-4f12-9601-718cb3732b15 rack2
UN 192.168.1.233 90,29 KB 8 16,7% eba7dc22-8a16-4547-b42c-ce0a882ca750 rack1
UN 192.168.1.138 89,96 KB 8 16,7% 7957136d-8d93-422f-97d7-63a3e1665a7d rack1
UN 192.168.3.106 80,48 KB 8 16,7% 184a6fdf-664a-478c-9a1e-e15e27d4a1e3 rack3
UN 192.168.3.11 80,44 KB 8 16,7% d1be4b39-2129-4b2b-8452-b06cf7bd7872 rack3
UN 192.168.3.237 120,79 KB 8 16,7% 391e00b2-e2e2-4898-811f-4277e97e8f7c rack3
UN 192.168.2.109 78,12 KB 8 16,7% e312d477-660a-48e5-a76d-7d646a401acc rack2
UN 192.168.1.244 164,89 KB 8 16,7% 47f50509-08d0-437a-a868-58eaa0a0a548 rack1
UN 192.168.2.180 158,74 KB 8 16,7% 7173a4ca-c312-4b8a-8007-a8591d82c2d3 rack2
UN 192.168.1.25 61,36 KB 8 16,7% 0392697b-e7df-4017-90ff-53977a063041 rack1
UN 192.168.2.90 56,6 KB 8 16,7% 27113d8c-f8a3-4d40-9011-aeffe1391fb0 rack2
UN 192.168.1.91 73,07 KB 8 16,7% 01483483-f63e-49af-aba8-9987078260f3 rack1
Example of a bad (unbalanced) token map:
Datacenter: test-eu
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.3.64 61,6 KB 8 5,6% 0ba59b8e-34fc-4774-a52b-2c5ede4296f2 rack3
UN 192.168.2.99 72,84 KB 8 5,6% 57c0a12c-dba7-440d-b984-980c52578970 rack2
UN 192.168.1.69 51,49 KB 8 5,6% 4ad834ec-4e0c-48f9-a309-19a7c9aeccb2 rack1
UN 192.168.3.133 89,66 KB 8 5,6% 7c858432-62a7-4874-98c3-84b46299beb2 rack3
UN 192.168.2.230 89,89 KB 8 5,6% 5262cd0d-4c30-4c53-b3d4-4b4f8243745b rack2
UN 192.168.3.199 83,54 KB 8 5,6% d140a761-d211-4526-97fe-3145634dacf6 rack3
UN 192.168.2.200 66,72 KB 8 5,6% b81be237-f1d8-4f12-9601-718cb3732b15 rack2
UN 192.168.1.233 90,29 KB 8 72,2% eba7dc22-8a16-4547-b42c-ce0a882ca750 rack1
UN 192.168.1.138 89,96 KB 8 5,6% 7957136d-8d93-422f-97d7-63a3e1665a7d rack1
UN 192.168.3.106 80,48 KB 8 5,6% 184a6fdf-664a-478c-9a1e-e15e27d4a1e3 rack3
UN 192.168.3.11 80,44 KB 8 5,6% d1be4b39-2129-4b2b-8452-b06cf7bd7872 rack3
UN 192.168.3.237 120,79 KB 8 5,6% 391e00b2-e2e2-4898-811f-4277e97e8f7c rack3
UN 192.168.2.109 78,12 KB 8 72,2% e312d477-660a-48e5-a76d-7d646a401acc rack2
UN 192.168.1.244 164,89 KB 8 72,2% 47f50509-08d0-437a-a868-58eaa0a0a548 rack1
UN 192.168.2.180 158,74 KB 8 5,6% 7173a4ca-c312-4b8a-8007-a8591d82c2d3 rack2
UN 192.168.1.25 61,36 KB 8 5,6% 0392697b-e7df-4017-90ff-53977a063041 rack1
UN 192.168.2.90 56,6 KB 8 5,6% 27113d8c-f8a3-4d40-9011-aeffe1391fb0 rack2
UN 192.168.1.91 73,07 KB 8 5,6% 01483483-f63e-49af-aba8-9987078260f3 rack1
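To spot an unbalanced map like this one without eyeballing the output, a small parser over the Owns (effective) column can help. This is a hypothetical helper, assuming the locale-formatted output shown above (decimal commas) is piped in on stdin:

    # Usage: nodetool status nts_rf3 | python check_owns.py
    import re
    import sys

    owns = []
    for line in sys.stdin:
        m = re.match(r'[UD][NLJM]\s+(\S+)\s+.*?([\d.,]+)%', line)
        if m:
            owns.append((m.group(1), float(m.group(2).replace(',', '.'))))

    values = [v for _, v in owns]
    print('Owns (effective): min %.1f%%, max %.1f%% over %d nodes'
          % (min(values), max(values), len(values)))
    if max(values) - min(values) > 1.0:                 # arbitrary tolerance
        print('WARNING: the token map looks unbalanced')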