How much data is bootstrapping supposed to send?

Dave Galbraith
I had a one-node Cassandra 2.1.3 cluster, where the output of nodetool status looked like this:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns    Host ID                               Rack
UN  172.31.20.10  12.94 MB   256     ?       f803cae9-3f12-40c9-b681-caf4829b6bc6  rack1


Then I added another host to the cluster, and according to the logs it did some bootstrapping:

INFO  [main] 2015-04-23 06:25:41,955 StorageService.java:1008 - JOINING: schema complete, ready to bootstrap
INFO  [main] 2015-04-23 06:25:41,955 StorageService.java:1008 - JOINING: calculation complete, ready to bootstrap
INFO  [main] 2015-04-23 06:25:41,956 StorageService.java:1008 - JOINING: getting bootstrap token
INFO  [main] 2015-04-23 06:26:11,999 StorageService.java:1008 - JOINING: Starting to bootstrap...
INFO  [main] 2015-04-23 06:26:12,159 StreamResultFuture.java:86 - [Stream #a2d70110-e981-11e4-90fe-03a9e0dac111] Executing streaming plan for Bootstrap
INFO  [main] 2015-04-23 06:26:13,225 StorageService.java:1037 - Bootstrap completed! for the tokens [-6649489682159922872,


But when I ran nodetool status after the new node had joined the cluster, it looked like this:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns    Host ID                               Rack
UN  172.31.21.108  3.78 MB    256     ?       0fc3f5ac-c414-4340-b072-7d9959a28209  rack1
UN  172.31.20.10   15.45 MB   256     ?       f803cae9-3f12-40c9-b681-caf4829b6bc6  rack1


I was expecting the load to drop to about 6.5 MB on my original node while the new node picked up about 6.5 MB, so they'd be balanced. Instead, the disk usage on my original node somehow increased by 2.5 MB while the new node only picked up 3.78 MB. Why didn't I get a balanced load? Why did the load on my original node go up when I added another node? I didn't write any points during the bootstrap. All my keyspaces that have a lot of data have replication factor 1, so I think and hope it wasn't just replicating data on the new node. Thanks!

Re: How much data is bootstrapping supposed to send?

Robert Coli-3
On Wed, Apr 22, 2015 at 11:57 PM, Dave Galbraith <[hidden email]> wrote:
> So I was expecting the load to drop to about 6.5 MB on my original node while the new node would pick up about 6.5 MB, so they'd be balanced, but instead the disk usage on my original node somehow increased by 2.5 MB while the new node only picked up 3.78 MB. Why didn't I get a balanced load? Why did the load on my original node go up when I added another node? I didn't write any points during the bootstrap. All my keyspaces that have a lot of data have replication factor 1, so I think and hope it wasn't just replicating data on the new node. Thanks!

To remove data from the source node which no longer belongs there, run "nodetool cleanup" on it.
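
As a rough sketch of that step (assuming nodetool is on the PATH of the original node, 172.31.20.10 in the status output above, and connects to the local node by default), it might look like this. The PATH check is only there so the snippet fails gracefully elsewhere:

```shell
#!/bin/sh
# Sketch only: run this on the original (source) node, not on the new one.
# Assumption: nodetool is on the PATH and targets the local Cassandra node.
if command -v nodetool >/dev/null 2>&1; then
    # Remove keys that no longer belong to this node after the new node joined.
    nodetool cleanup
    # Re-check the Load column once cleanup has finished.
    nodetool status
else
    echo "nodetool not found on PATH; run this on a Cassandra node" >&2
fi
```

Bootstrapping streams data to the new node but leaves the source node's copy in place, so the Load reported for the source node is not expected to shrink until cleanup has run.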

=Rob