rhardouin/cassandra-scripts

cassandra-scripts

  • relative_major_compact.py: compact up to X GB.
  • vnodes_token_generator.py: generate evenly distributed initial tokens for a vnodes Cassandra cluster.

relative_major_compact.py

Force a compaction to run and compact up to X GB of data. This is useful with SizeTieredCompactionStrategy when you don't have enough free space to run a major compaction but want to compact as much data as possible. The --dry-run option shows which SSTables would be compacted.
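
The selection logic of the script is not described in this README. As a rough sketch of the idea only, assuming a smallest-first policy (an assumption, not necessarily what relative_major_compact.py does), picking SSTables up to a size budget could look like this. The names select_up_to and list_data_files are hypothetical.

```python
from pathlib import Path


def select_up_to(sstables, target_size):
    """Pick (name, size) pairs, smallest first, while the sum stays <= target_size.

    Smallest-first is an assumption made for this sketch.
    """
    chosen, total = [], 0
    for name, size in sorted(sstables, key=lambda item: (item[1], item[0])):
        if total + size > target_size:
            break
        chosen.append(name)
        total += size
    return chosen, total


def list_data_files(table_path):
    """Return (path, size in bytes) for every *-Data.db file in table_path."""
    return [(str(p), p.stat().st_size) for p in Path(table_path).glob("*-Data.db")]
```

For example, select_up_to(list_data_files("/var/lib/cassandra/data/ks/table/"), 1073741824) would return the candidate files and their cumulative size without touching anything, which is roughly what a dry run reports.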

The name is a play on the musical concept of the relative major.

Dependencies

jmxterm must be installed where relative_major_compact.py is run.

Download:

wget http://downloads.sourceforge.net/cyclops-group/jmxterm-1.0-alpha-4-uber.jar

Usage

$ ./relative_major_compact.py -h
usage: relative_major_compact.py [-h] [--verbose] [--dry-run] [--java JAVA]
                                 [--jmxterm JMXTERM] [--host HOST:PORT]
                                 table_path target_size

positional arguments:
  table_path         Path to sstables to compact: e.g
                     /var/lib/cassandra/data/ks/table/
  target_size        Size in bytes of the sum of sstables to compact. M and G
                     could be used: e.g. 1073741824 or 1G

optional arguments:
  -h, --help         show this help message and exit
  --verbose, -v      Verbose output. Print each sstable name that will
                     participate in the compaction.
  --dry-run, -d      Simulation. Useful with --verbose.
  --java JAVA        Path to Java. By default the java command is assumed to
                     be on the path.
  --jmxterm JMXTERM  Path to JmxTerm. By default looks for jmxterm.jar in the
                     current directory.
  --host HOST:PORT   JMX IP and port. Default: 127.0.0.1:7199
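
The target_size argument accepts either a plain byte count or a value with an M or G suffix. A minimal sketch of such parsing, assuming binary multiples (the help text pairs 1073741824 with 1G, which suggests 1G = 1024^3 bytes). The function name parse_size is hypothetical and is not taken from the script.

```python
import re

# Binary multiples are an assumption based on the "1073741824 or 1G" example.
_UNITS = {"": 1, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert '1073741824', '512M' or '1G' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([MG]?)", text.strip())
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]
```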

vnodes_token_generator.py

Vnodes Murmur3 token generator: generates evenly distributed initial tokens for a vnodes Cassandra cluster. With vnodes, initial_token can be set in cassandra.yaml as a comma-separated list of values.

Note: Be aware of the consequences of using a low number of vnodes. On the other hand, some operations, such as repairs, will be faster. Also, remember that vnodes can't be moved like a single token; keep this in mind if you plan to scale out.

Usage

$ ./vnodes_token_generator.py -h
usage: vnodes_token_generator.py [-h] [--offset OFFSET] [-i INDENT]
                                 [-j | -y | -t] [-n NUM | -s SERVERS]
                                 vnodes

positional arguments:
  vnodes                Number of vnodes per server

optional arguments:
  -h, --help            show this help message and exit
  --offset OFFSET       Value to add to each token, to avoid potential
                        conflicts (e.g. 1). Default is 0 (no offset).
  -i INDENT, --indent INDENT
                        JSON indentation spaces (e.g. 4)
  -j, --json            JSON output
  -y, --yaml            initial_token for cassandra.yaml
  -t, --text            Space separated values. First column: IP address, then
                        one column per token
  -n NUM, --num NUM     Number of Cassandra servers
  -s SERVERS, --servers SERVERS
                        Cassandra servers file. One IP/hostname per line.

Basic examples

First, create a file which contains one host per line:

$ cat hosts 
192.168.1.1
192.168.2.1
192.168.3.1

Then run vnodes_token_generator.py with the number of vnodes you want. Here, just 4 to keep the example readable.

Text output

$ ./vnodes_token_generator.py --servers hosts 4
192.168.2.1 -7686143364045646507 -3074457345618258604 1537228672809129299 6148914691236517202
192.168.3.1 -6148914691236517206 -1537228672809129303 3074457345618258600 7686143364045646503
192.168.1.1 -9223372036854775808 -4611686018427387905 -2 4611686018427387901

JSON output

$ ./vnodes_token_generator.py --json --indent 2 --servers hosts 4
{
  "192.168.1.1": "-9223372036854775808,-4611686018427387905,-2,4611686018427387901", 
  "192.168.2.1": "-7686143364045646507,-3074457345618258604,1537228672809129299,6148914691236517202", 
  "192.168.3.1": "-6148914691236517206,-1537228672809129303,3074457345618258600,7686143364045646503"
}

cassandra.yaml initial_token output

$ ./vnodes_token_generator.py --yaml --servers hosts 4
192.168.2.1 initial_token: -7686143364045646507,-3074457345618258604,1537228672809129299,6148914691236517202
192.168.3.1 initial_token: -6148914691236517206,-1537228672809129303,3074457345618258600,7686143364045646503
192.168.1.1 initial_token: -9223372036854775808,-4611686018427387905,-2,4611686018427387901

No servers file

Only the number of servers is specified:

$ ./vnodes_token_generator.py -n 3 4
0 -9223372036854775808 -4611686018427387905 -2 4611686018427387901
1 -7686143364045646507 -3074457345618258604 1537228672809129299 6148914691236517202
2 -6148914691236517206 -1537228672809129303 3074457345618258600 7686143364045646503
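
The outputs above are consistent with a simple scheme: split the Murmur3 range (-2^63 to 2^63 - 1) into servers x vnodes equal steps and hand the tokens out round-robin. The sketch below reproduces the example numbers under that reading. It is not the script's source code, and the name generate_tokens is hypothetical.

```python
def generate_tokens(num_servers, vnodes):
    """Return one token list per server, evenly spaced and assigned round-robin."""
    total = num_servers * vnodes
    step = 2 ** 64 // total  # width of one token range (integer division)
    ring = [-2 ** 63 + i * step for i in range(total)]
    # Server s gets tokens s, s + num_servers, s + 2 * num_servers, ...
    return [ring[s::num_servers] for s in range(num_servers)]


for index, tokens in enumerate(generate_tokens(3, 4)):
    print(index, *tokens)
```

Because consecutive tokens on the ring go to consecutive servers, the order of the servers file decides which nodes end up next to each other.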

Real world example

Here is a common use case:

  • Replication strategy: NetworkTopologyStrategy
  • Replication factor: 3
  • 3 Cassandra racks (i.e. the same as replication factor)

We will choose 18 nodes -- 6 nodes per rack -- in this example.

You want each rack to own 100% of the data, and you also want replicas to be evenly distributed across the racks. To achieve this, you have to interleave the racks in a text file before running vnodes_token_generator.py.

Let's say you have 3 files, each containing the IPs of one rack:

$ cat hosts_rackA.txt 
192.168.1.1
192.168.1.2
192.168.1.3
192.168.1.4
192.168.1.5
192.168.1.6

$ cat hosts_rackB.txt 
192.168.2.1
192.168.2.2
192.168.2.3
192.168.2.4
192.168.2.5
192.168.2.6

$ cat hosts_rackC.txt 
192.168.3.1
192.168.3.2
192.168.3.3
192.168.3.4
192.168.3.5
192.168.3.6

Create a new file where racks are interleaved:

paste -d "\n" hosts_rackA.txt hosts_rackB.txt hosts_rackC.txt > hosts_interleaved_racks.txt

Or for the lazy:

paste -d "\n" hosts_rack* > hosts_interleaved_racks.txt

Look at how each rack appears in the file (rack1 is highlighted):

$ cat hosts_interleaved_racks.txt 
192.168.1.1  # <--- rack1
192.168.2.1
192.168.3.1
192.168.1.2  # <--- rack1
192.168.2.2
192.168.3.2
192.168.1.3  # <--- rack1
192.168.2.3
192.168.3.3
192.168.1.4  # <--- rack1
192.168.2.4
192.168.3.4
192.168.1.5  # <--- rack1
192.168.2.5
192.168.3.5
192.168.1.6  # <--- rack1
192.168.2.6
192.168.3.6
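
If paste is not available, the same interleaving can be done in a few lines of Python. This is only a sketch of the equivalent operation. The function name interleave is hypothetical, and unlike paste it refuses racks of different sizes instead of emitting blank lines.

```python
from itertools import chain


def interleave(*racks):
    """Interleave equally sized host lists: A1, B1, C1, A2, B2, C2, ..."""
    if len({len(rack) for rack in racks}) > 1:
        raise ValueError("all racks must contain the same number of hosts")
    return list(chain.from_iterable(zip(*racks)))


rack_a = ["192.168.1.1", "192.168.1.2"]
rack_b = ["192.168.2.1", "192.168.2.2"]
rack_c = ["192.168.3.1", "192.168.3.2"]
print("\n".join(interleave(rack_a, rack_b, rack_c)))
```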

(You will find these files in the /res directory of this repo if you want to play with vnodes_token_generator.py.)

Now let's say the Cassandra tokens are stored in a Chef data bag. We can call vnodes_token_generator.py with --json and copy/paste the output into the data bag. Here, with 8 tokens per Cassandra node:

$ ./vnodes_token_generator.py --json --indent 4 --servers res/hosts_interleaved_racks.txt 8
{
    "192.168.1.1": "-9223372036854775808,-6917529027641081858,-4611686018427387908,-2305843009213693958,-8,2305843009213693942,4611686018427387892,6917529027641081842", 
    "192.168.1.2": "-8839064868652493483,-6533221859438799533,-4227378850225105583,-1921535841011411633,384307168202282317,2690150177415976267,4995993186629670217,7301836195843364167", 
    "192.168.1.3": "-8454757700450211158,-6148914691236517208,-3843071682022823258,-1537228672809129308,768614336404564642,3074457345618258592,5380300354831952542,7686143364045646492", 
    "192.168.1.4": "-8070450532247928833,-5764607523034234883,-3458764513820540933,-1152921504606846983,1152921504606846967,3458764513820540917,5764607523034234867,8070450532247928817", 
    "192.168.1.5": "-7686143364045646508,-5380300354831952558,-3074457345618258608,-768614336404564658,1537228672809129292,3843071682022823242,6148914691236517192,8454757700450211142", 
    "192.168.1.6": "-7301836195843364183,-4995993186629670233,-2690150177415976283,-384307168202282333,1921535841011411617,4227378850225105567,6533221859438799517,8839064868652493467", 
    "192.168.2.1": "-9095269647454015033,-6789426638240321083,-4483583629026627133,-2177740619812933183,128102389400760767,2433945398614454717,4739788407828148667,7045631417041842617", 
    "192.168.2.2": "-8710962479251732708,-6405119470038038758,-4099276460824344808,-1793433451610650858,512409557603043092,2818252566816737042,5124095576030430992,7429938585244124942", 
    "192.168.2.3": "-8326655311049450383,-6020812301835756433,-3714969292622062483,-1409126283408368533,896716725805325417,3202559735019019367,5508402744232713317,7814245753446407267", 
    "192.168.2.4": "-7942348142847168058,-5636505133633474108,-3330662124419780158,-1024819115206086208,1281023894007607742,3586866903221301692,5892709912434995642,8198552921648689592", 
    "192.168.2.5": "-7558040974644885733,-5252197965431191783,-2946354956217497833,-640511947003803883,1665331062209890067,3971174071423584017,6277017080637277967,8582860089850971917", 
    "192.168.2.6": "-7173733806442603408,-4867890797228909458,-2562047788015215508,-256204778801521558,2049638230412172392,4355481239625866342,6661324248839560292,8967167258053254242", 
    "192.168.3.1": "-8967167258053254258,-6661324248839560308,-4355481239625866358,-2049638230412172408,256204778801521542,2562047788015215492,4867890797228909442,7173733806442603392", 
    "192.168.3.2": "-8582860089850971933,-6277017080637277983,-3971174071423584033,-1665331062209890083,640511947003803867,2946354956217497817,5252197965431191767,7558040974644885717", 
    "192.168.3.3": "-8198552921648689608,-5892709912434995658,-3586866903221301708,-1281023894007607758,1024819115206086192,3330662124419780142,5636505133633474092,7942348142847168042", 
    "192.168.3.4": "-7814245753446407283,-5508402744232713333,-3202559735019019383,-896716725805325433,1409126283408368517,3714969292622062467,6020812301835756417,8326655311049450367", 
    "192.168.3.5": "-7429938585244124958,-5124095576030431008,-2818252566816737058,-512409557603043108,1793433451610650842,4099276460824344792,6405119470038038742,8710962479251732692", 
    "192.168.3.6": "-7045631417041842633,-4739788407828148683,-2433945398614454733,-128102389400760783,2177740619812933167,4483583629026627117,6789426638240321067,9095269647454015017"
}

If you want to check that the racks alternate, sort by the first token column (i.e. column #2 in the text output), then look at the first column, whose IPs identify the racks:

$ ./vnodes_token_generator.py --servers res/hosts_interleaved_racks.txt 8 | sort -nk2
192.168.1.1 -9223372036854775808 -6917529027641081858 -4611686018427387908 -2305843009213693958 -8 2305843009213693942 4611686018427387892 6917529027641081842
192.168.2.1 -9095269647454015033 -6789426638240321083 -4483583629026627133 -2177740619812933183 128102389400760767 2433945398614454717 4739788407828148667 7045631417041842617
192.168.3.1 -8967167258053254258 -6661324248839560308 -4355481239625866358 -2049638230412172408 256204778801521542 2562047788015215492 4867890797228909442 7173733806442603392
192.168.1.2 -8839064868652493483 -6533221859438799533 -4227378850225105583 -1921535841011411633 384307168202282317 2690150177415976267 4995993186629670217 7301836195843364167
192.168.2.2 -8710962479251732708 -6405119470038038758 -4099276460824344808 -1793433451610650858 512409557603043092 2818252566816737042 5124095576030430992 7429938585244124942
192.168.3.2 -8582860089850971933 -6277017080637277983 -3971174071423584033 -1665331062209890083 640511947003803867 2946354956217497817 5252197965431191767 7558040974644885717
192.168.1.3 -8454757700450211158 -6148914691236517208 -3843071682022823258 -1537228672809129308 768614336404564642 3074457345618258592 5380300354831952542 7686143364045646492
192.168.2.3 -8326655311049450383 -6020812301835756433 -3714969292622062483 -1409126283408368533 896716725805325417 3202559735019019367 5508402744232713317 7814245753446407267
192.168.3.3 -8198552921648689608 -5892709912434995658 -3586866903221301708 -1281023894007607758 1024819115206086192 3330662124419780142 5636505133633474092 7942348142847168042
192.168.1.4 -8070450532247928833 -5764607523034234883 -3458764513820540933 -1152921504606846983 1152921504606846967 3458764513820540917 5764607523034234867 8070450532247928817
192.168.2.4 -7942348142847168058 -5636505133633474108 -3330662124419780158 -1024819115206086208 1281023894007607742 3586866903221301692 5892709912434995642 8198552921648689592
192.168.3.4 -7814245753446407283 -5508402744232713333 -3202559735019019383 -896716725805325433 1409126283408368517 3714969292622062467 6020812301835756417 8326655311049450367
192.168.1.5 -7686143364045646508 -5380300354831952558 -3074457345618258608 -768614336404564658 1537228672809129292 3843071682022823242 6148914691236517192 8454757700450211142
192.168.2.5 -7558040974644885733 -5252197965431191783 -2946354956217497833 -640511947003803883 1665331062209890067 3971174071423584017 6277017080637277967 8582860089850971917
192.168.3.5 -7429938585244124958 -5124095576030431008 -2818252566816737058 -512409557603043108 1793433451610650842 4099276460824344792 6405119470038038742 8710962479251732692
192.168.1.6 -7301836195843364183 -4995993186629670233 -2690150177415976283 -384307168202282333 1921535841011411617 4227378850225105567 6533221859438799517 8839064868652493467
192.168.2.6 -7173733806442603408 -4867890797228909458 -2562047788015215508 -256204778801521558 2049638230412172392 4355481239625866342 6661324248839560292 8967167258053254242
192.168.3.6 -7045631417041842633 -4739788407828148683 -2433945398614454733 -128102389400760783 2177740619812933167 4483583629026627117 6789426638240321067 9095269647454015017
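
The visual check above only looks at the first token of each node. As a sketch, the same idea can be applied to the whole ring: sort every (token, host) pair and verify that any run of RF consecutive tokens covers RF different racks. The function name racks_alternate is hypothetical, and deriving the rack from the third octet of the IP is an assumption that only holds for the example addresses used here.

```python
def racks_alternate(token_map, rack_of, replication_factor):
    """True if every window of replication_factor consecutive tokens spans distinct racks."""
    ring = sorted((token, host) for host, tokens in token_map.items() for token in tokens)
    racks = [rack_of(host) for _, host in ring]
    count = len(racks)
    for start in range(count):
        # The ring wraps around, hence the modulo.
        window = {racks[(start + k) % count] for k in range(replication_factor)}
        if len(window) < replication_factor:
            return False
    return True


def rack_from_ip(ip):
    """Assumption for this example: the third octet identifies the rack."""
    return ip.split(".")[2]


token_map = {
    "192.168.1.1": [-9223372036854775808, -4611686018427387905, -2, 4611686018427387901],
    "192.168.2.1": [-7686143364045646507, -3074457345618258604, 1537228672809129299, 6148914691236517202],
    "192.168.3.1": [-6148914691236517206, -1537228672809129303, 3074457345618258600, 7686143364045646503],
}
print(racks_alternate(token_map, rack_from_ip, 3))
```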

Cassandra does not allow a node to have the same token as another one, so you may have to generate an offset token map. The --offset option does that. Here is the same output as above, offset by 1:

$ ./vnodes_token_generator.py --json --indent 4 --servers res/hosts_interleaved_racks.txt --offset 1 8
{
    "192.168.1.1": "-9223372036854775807,-6917529027641081857,-4611686018427387907,-2305843009213693957,-7,2305843009213693943,4611686018427387893,6917529027641081843", 
    "192.168.1.2": "-8839064868652493482,-6533221859438799532,-4227378850225105582,-1921535841011411632,384307168202282318,2690150177415976268,4995993186629670218,7301836195843364168", 
    "192.168.1.3": "-8454757700450211157,-6148914691236517207,-3843071682022823257,-1537228672809129307,768614336404564643,3074457345618258593,5380300354831952543,7686143364045646493", 
    "192.168.1.4": "-8070450532247928832,-5764607523034234882,-3458764513820540932,-1152921504606846982,1152921504606846968,3458764513820540918,5764607523034234868,8070450532247928818", 
    "192.168.1.5": "-7686143364045646507,-5380300354831952557,-3074457345618258607,-768614336404564657,1537228672809129293,3843071682022823243,6148914691236517193,8454757700450211143", 
    "192.168.1.6": "-7301836195843364182,-4995993186629670232,-2690150177415976282,-384307168202282332,1921535841011411618,4227378850225105568,6533221859438799518,8839064868652493468", 
    "192.168.2.1": "-9095269647454015032,-6789426638240321082,-4483583629026627132,-2177740619812933182,128102389400760768,2433945398614454718,4739788407828148668,7045631417041842618", 
    "192.168.2.2": "-8710962479251732707,-6405119470038038757,-4099276460824344807,-1793433451610650857,512409557603043093,2818252566816737043,5124095576030430993,7429938585244124943", 
    "192.168.2.3": "-8326655311049450382,-6020812301835756432,-3714969292622062482,-1409126283408368532,896716725805325418,3202559735019019368,5508402744232713318,7814245753446407268", 
    "192.168.2.4": "-7942348142847168057,-5636505133633474107,-3330662124419780157,-1024819115206086207,1281023894007607743,3586866903221301693,5892709912434995643,8198552921648689593", 
    "192.168.2.5": "-7558040974644885732,-5252197965431191782,-2946354956217497832,-640511947003803882,1665331062209890068,3971174071423584018,6277017080637277968,8582860089850971918", 
    "192.168.2.6": "-7173733806442603407,-4867890797228909457,-2562047788015215507,-256204778801521557,2049638230412172393,4355481239625866343,6661324248839560293,8967167258053254243", 
    "192.168.3.1": "-8967167258053254257,-6661324248839560307,-4355481239625866357,-2049638230412172407,256204778801521543,2562047788015215493,4867890797228909443,7173733806442603393", 
    "192.168.3.2": "-8582860089850971932,-6277017080637277982,-3971174071423584032,-1665331062209890082,640511947003803868,2946354956217497818,5252197965431191768,7558040974644885718", 
    "192.168.3.3": "-8198552921648689607,-5892709912434995657,-3586866903221301707,-1281023894007607757,1024819115206086193,3330662124419780143,5636505133633474093,7942348142847168043", 
    "192.168.3.4": "-7814245753446407282,-5508402744232713332,-3202559735019019382,-896716725805325432,1409126283408368518,3714969292622062468,6020812301835756418,8326655311049450368", 
    "192.168.3.5": "-7429938585244124957,-5124095576030431007,-2818252566816737057,-512409557603043107,1793433451610650843,4099276460824344793,6405119470038038743,8710962479251732693", 
    "192.168.3.6": "-7045631417041842632,-4739788407828148682,-2433945398614454732,-128102389400760782,2177740619812933168,4483583629026627118,6789426638240321068,9095269647454015018"
}
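
Comparing the two outputs, every token is simply shifted by the offset value. A minimal sketch of that shift follows. The function name apply_offset is hypothetical, and the range check against the Murmur3 bounds is an addition of this sketch; whether the script performs such a check is not stated here.

```python
MIN_TOKEN = -2 ** 63      # lowest Murmur3 token
MAX_TOKEN = 2 ** 63 - 1   # highest Murmur3 token


def apply_offset(tokens, offset):
    """Add offset to each token and refuse results outside the Murmur3 range."""
    shifted = [token + offset for token in tokens]
    if any(token < MIN_TOKEN or token > MAX_TOKEN for token in shifted):
        raise ValueError("offset pushes a token outside the Murmur3 range")
    return shifted


print(apply_offset([-9223372036854775808, -6917529027641081858, -8], 1))
```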

FAQ

How do I check whether the output of vnodes_token_generator.py is good for me?

Create a keyspace with the replication strategy you want to use and set the replication factor in a given DC. For the "real world example" above, this means:

cqlsh -e "CREATE KEYSPACE nts_rf3 WITH replication = {'class': 'NetworkTopologyStrategy', 'test-eu': 3}"

Be sure to set a replication factor greater than 1. With RF=1 you will check the partitioner range but not the replica ownership.

Then run nodetool status on the newly created keyspace and pay attention to the Owns (effective) column; the value should be the same on every line:

$ nodetool status nts_rf3

Datacenter: test-eu
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.3.64     61,6 KB    8       16,7%             0ba59b8e-34fc-4774-a52b-2c5ede4296f2  rack3
UN  192.168.2.99     72,84 KB   8       16,7%             57c0a12c-dba7-440d-b984-980c52578970  rack2
UN  192.168.1.69     51,49 KB   8       16,7%             4ad834ec-4e0c-48f9-a309-19a7c9aeccb2  rack1
UN  192.168.3.133    89,66 KB   8       16,7%             7c858432-62a7-4874-98c3-84b46299beb2  rack3
UN  192.168.2.230    89,89 KB   8       16,7%             5262cd0d-4c30-4c53-b3d4-4b4f8243745b  rack2
UN  192.168.3.199    83,54 KB   8       16,7%             d140a761-d211-4526-97fe-3145634dacf6  rack3
UN  192.168.2.200    66,72 KB   8       16,7%             b81be237-f1d8-4f12-9601-718cb3732b15  rack2
UN  192.168.1.233    90,29 KB   8       16,7%             eba7dc22-8a16-4547-b42c-ce0a882ca750  rack1
UN  192.168.1.138    89,96 KB   8       16,7%             7957136d-8d93-422f-97d7-63a3e1665a7d  rack1
UN  192.168.3.106    80,48 KB   8       16,7%             184a6fdf-664a-478c-9a1e-e15e27d4a1e3  rack3
UN  192.168.3.11     80,44 KB   8       16,7%             d1be4b39-2129-4b2b-8452-b06cf7bd7872  rack3
UN  192.168.3.237    120,79 KB  8       16,7%             391e00b2-e2e2-4898-811f-4277e97e8f7c  rack3
UN  192.168.2.109    78,12 KB   8       16,7%             e312d477-660a-48e5-a76d-7d646a401acc  rack2
UN  192.168.1.244    164,89 KB  8       16,7%             47f50509-08d0-437a-a868-58eaa0a0a548  rack1
UN  192.168.2.180    158,74 KB  8       16,7%             7173a4ca-c312-4b8a-8007-a8591d82c2d3  rack2
UN  192.168.1.25     61,36 KB   8       16,7%             0392697b-e7df-4017-90ff-53977a063041  rack1
UN  192.168.2.90     56,6 KB    8       16,7%             27113d8c-f8a3-4d40-9011-aeffe1391fb0  rack2
UN  192.168.1.91     73,07 KB   8       16,7%             01483483-f63e-49af-aba8-9987078260f3  rack1

Example of a bad (unbalanced) token map:

Datacenter: test-eu
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.3.64     61,6 KB    8       5,6%              0ba59b8e-34fc-4774-a52b-2c5ede4296f2  rack3
UN  192.168.2.99     72,84 KB   8       5,6%              57c0a12c-dba7-440d-b984-980c52578970  rack2
UN  192.168.1.69     51,49 KB   8       5,6%              4ad834ec-4e0c-48f9-a309-19a7c9aeccb2  rack1
UN  192.168.3.133    89,66 KB   8       5,6%              7c858432-62a7-4874-98c3-84b46299beb2  rack3
UN  192.168.2.230    89,89 KB   8       5,6%              5262cd0d-4c30-4c53-b3d4-4b4f8243745b  rack2
UN  192.168.3.199    83,54 KB   8       5,6%              d140a761-d211-4526-97fe-3145634dacf6  rack3
UN  192.168.2.200    66,72 KB   8       5,6%              b81be237-f1d8-4f12-9601-718cb3732b15  rack2
UN  192.168.1.233    90,29 KB   8       72,2%             eba7dc22-8a16-4547-b42c-ce0a882ca750  rack1
UN  192.168.1.138    89,96 KB   8       5,6%              7957136d-8d93-422f-97d7-63a3e1665a7d  rack1
UN  192.168.3.106    80,48 KB   8       5,6%              184a6fdf-664a-478c-9a1e-e15e27d4a1e3  rack3
UN  192.168.3.11     80,44 KB   8       5,6%              d1be4b39-2129-4b2b-8452-b06cf7bd7872  rack3
UN  192.168.3.237    120,79 KB  8       5,6%              391e00b2-e2e2-4898-811f-4277e97e8f7c  rack3
UN  192.168.2.109    78,12 KB   8       72,2%             e312d477-660a-48e5-a76d-7d646a401acc  rack2
UN  192.168.1.244    164,89 KB  8       72,2%             47f50509-08d0-437a-a868-58eaa0a0a548  rack1
UN  192.168.2.180    158,74 KB  8       5,6%              7173a4ca-c312-4b8a-8007-a8591d82c2d3  rack2
UN  192.168.1.25     61,36 KB   8       5,6%              0392697b-e7df-4017-90ff-53977a063041  rack1
UN  192.168.2.90     56,6 KB    8       5,6%              27113d8c-f8a3-4d40-9011-aeffe1391fb0  rack2
UN  192.168.1.91     73,07 KB   8       5,6%              01483483-f63e-49af-aba8-9987078260f3  rack1
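
With many nodes, comparing the Owns (effective) column by eye gets tedious. The sketch below parses text shaped like the two outputs above and reports whether all values are identical. It assumes node lines start with a two-letter status such as UN and that the ownership is the only field ending in %. The function names are hypothetical, and the comma handling is there because the sample output uses a comma as the decimal separator.

```python
import re

NODE_LINE = re.compile(r"^[UD][NLJM]\s")  # status/state pair, e.g. UN


def owns_values(status_output):
    """Extract the Owns (effective) percentages from nodetool status text."""
    values = []
    for line in status_output.splitlines():
        if not NODE_LINE.match(line):
            continue
        for field in line.split():
            if field.endswith("%"):
                values.append(float(field.rstrip("%").replace(",", ".")))
                break
    return values


def is_balanced(status_output):
    """True if at least one node was found and all nodes own the same share."""
    values = owns_values(status_output)
    return bool(values) and len(set(values)) == 1
```

It could be fed from the command line, for instance by piping the output of nodetool status nts_rf3 into a small wrapper that reads standard input.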
