Best Practice to add a node in a Cluster

Best Practice to add a node in a Cluster

Neha Trivedi
Hi
We have a 2-node cluster with RF=2. We are planning to add a new node.

Should we change RF to 3 in the schema?
Or just add the new node with the same RF=2?

Are there any other best practices that we need to take care of?

Thanks
regards
Neha

Re: Best Practice to add a node in a Cluster

Eric Stevens
It depends on why you're adding a new node. If you're running out of disk space or I/O capacity in your 2-node cluster, then changing RF to 3 will not improve either condition: you'd still be writing all data to all three nodes.

However, if you're looking to improve reliability, a 2-node RF=2 cluster cannot have either node offline without losing quorum, while a 3-node RF=3 cluster can have one node offline and still achieve quorum. RF=3 is a common replication factor because of this characteristic.
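The quorum arithmetic behind this can be sketched in a few lines of Python. The function names are illustrative only; the formula is the usual floor(RF / 2) + 1.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond for a QUORUM read or write: floor(RF / 2) + 1."""
    return rf // 2 + 1


def tolerable_failures(rf: int) -> int:
    """Replicas that can be down while QUORUM is still reachable."""
    return rf - quorum(rf)


for rf in (2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, can lose {tolerable_failures(rf)} replica(s)")
# RF=2: quorum=2, can lose 0 replica(s)
# RF=3: quorum=2, can lose 1 replica(s)
```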

Make sure your new node is not in its own seeds list, or it will not bootstrap (it will come online immediately and start serving requests without first streaming data from the existing nodes).
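As a rough pre-flight check, something like the following could be run against the seeds value before starting the new node. This is only a sketch: it assumes the seeds are available as a comma-separated string (the form used in cassandra.yaml), and the function name and IP addresses are made up for illustration.

```python
def is_own_seed(seeds: str, node_address: str) -> bool:
    """True if node_address appears in a comma-separated seeds string."""
    return node_address in {s.strip() for s in seeds.split(",") if s.strip()}


# Hypothetical addresses: the two existing nodes are the seeds.
seeds = "10.0.0.1, 10.0.0.2"
print(is_own_seed(seeds, "10.0.0.3"))  # False: the new node is not a seed, so it will bootstrap
print(is_own_seed(seeds, "10.0.0.1"))  # True: an existing seed
```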

(In reply to Neha Trivedi's message of Mon, Apr 27, 2015 at 8:46 AM.)

RE: Best Practice to add a node in a Cluster

Matthew Johnson
In reply to this post by Neha Trivedi

Hi Neha,

I guess it depends on why you are adding a new node: do you need more storage capacity, do you want better resilience, or are you trying to increase performance?

If you add a new node with the same amount of storage as the previous two, but you increase the RF, you will use up all of the storage you have added by replicating the existing data onto the new node. If you keep it at RF=2, once you have done all the bootstrapping and cleanup, your usage on the existing two nodes should decrease by roughly a third (each node goes from holding all of the data to holding about two thirds of it).
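The storage arithmetic above can be sketched as follows. This is a simplified model that assumes data is spread evenly across nodes; the function name is illustrative.

```python
from fractions import Fraction


def per_node_share(nodes: int, rf: int) -> Fraction:
    """Fraction of the full dataset each node stores, assuming even distribution."""
    return Fraction(min(rf, nodes), nodes)


before = per_node_share(nodes=2, rf=2)      # 1: each node holds everything
same_rf = per_node_share(nodes=3, rf=2)     # 2/3
higher_rf = per_node_share(nodes=3, rf=3)   # 1

print(f"RF=2 on 3 nodes: usage drops by {float(1 - same_rf / before):.0%}")
print(f"RF=3 on 3 nodes: usage drops by {float(1 - higher_rf / before):.0%}")
# RF=2 on 3 nodes: usage drops by 33%
# RF=3 on 3 nodes: usage drops by 0%
```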

 

However, if it is resilience you are after (being able to take down nodes without losing availability) then increasing the RF will give you this, at the expense of using more storage.

Hope that helps.

Cheers,
Matt

Re: Best Practice to add a node in a Cluster

Neha Trivedi
In reply to this post by Eric Stevens
Thanks Eric and Matt :) !!

Yes, the purpose is to improve reliability.
Right now we query from our driver using degradePolicy for reliability.

To change the keyspace to RF=3, the procedure is as follows:

1. Add a new node to the cluster (the new node is not in the seed list).

2. ALTER KEYSPACE system_auth WITH REPLICATION =
  {'class' : 'NetworkTopologyStrategy', 'dc1' : 3};

  1. On each affected node, run nodetool repair.
  2. Wait until repair completes on a node, then move to the next node.
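The procedure above could be written down as an ordered plan before anything is executed. The sketch below only builds strings and runs nothing; the helper name and host names are hypothetical, and the keyspace and datacenter values are the ones quoted in the steps.

```python
def rf_change_plan(keyspace: str, datacenter: str, rf: int, hosts: list[str]) -> list[str]:
    """Return the ordered steps: one CQL statement, then one repair per node."""
    alter = (
        f"ALTER KEYSPACE {keyspace} WITH REPLICATION = "
        f"{{'class' : 'NetworkTopologyStrategy', '{datacenter}' : {rf}}};"
    )
    # One repair command per node, to be run one after another, never in parallel.
    repairs = [f"nodetool -h {host} repair {keyspace}" for host in hosts]
    return [alter, *repairs]


# Hypothetical host names for the three nodes.
plan = rf_change_plan("system_auth", "dc1", 3, ["node1", "node2", "node3"])
for step in plan:
    print(step)
```

The same plan would need to be produced for each keyspace whose replication factor is being changed.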

Is there anything else to take care of?

Thanks
Regards
neha


(In reply to Eric Stevens's message of Mon, Apr 27, 2015 at 9:45 PM.)

Re: Best Practice to add a node in a Cluster

arun sirimalla
Hi Neha,


After you add the node to the cluster, run nodetool cleanup on all nodes.
Next, run repair on each node to replicate the data. Make sure you run repair on one node at a time, because repair is an expensive process (it uses a lot of CPU).
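One way to enforce the "one node at a time" rule is a small driver that starts each command only after the previous one has returned. This is a sketch under assumptions: the host names are hypothetical, the default runner relies on each nodetool invocation blocking until its work is finished, and the demonstration uses a stand-in runner that only prints.

```python
import subprocess
from typing import Callable, Sequence


def run_one_at_a_time(
    commands: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = lambda cmd: subprocess.run(cmd).returncode,
) -> int:
    """Run commands strictly in order; stop at the first non-zero exit code.

    Returns the number of commands that completed successfully.
    """
    done = 0
    for cmd in commands:
        if runner(cmd) != 0:
            break
        done += 1
    return done


# Hypothetical host names; cleanup on every node first, then repair node by node.
hosts = ["node1", "node2", "node3"]
commands = [["nodetool", "-h", h, "cleanup"] for h in hosts]
commands += [["nodetool", "-h", h, "repair"] for h in hosts]


def dry_run(cmd: Sequence[str]) -> int:
    """Stand-in runner that prints the command instead of executing it."""
    print(" ".join(cmd))
    return 0


completed = run_one_at_a_time(commands, runner=dry_run)
```

Stopping at the first failure keeps a failed repair on one node from being followed blindly by repairs on the others.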




(In reply to Neha Trivedi's message of Mon, Apr 27, 2015 at 8:36 PM.)

--
Arun 
Senior Hadoop/Cassandra Engineer
Cloudwick

Champion of Big Data (Cloudera)

2014 Data Impact Award Winner (Cloudera)

Re: Best Practice to add a node in a Cluster

Neha Trivedi
Thanks, Arun!

(In reply to arun sirimalla's message of Tue, Apr 28, 2015 at 9:44 AM.)

Re: Best Practice to add a node in a Cluster

Eric Stevens
I would double-check in a test cluster (or use a tool like CCM to set up a local throwaway cluster and confirm), but for this *specific* use case (going from RF == node count to RF == node count at a higher number) you should be able to take a simpler path. Set RF=3 before you add your new node, then add the new node. It will bootstrap all data from the other two nodes, and then your job is done.

You shouldn't have to run repair (which you normally have to do after increasing RF to make sure all nodes have their data; here the nodes already have all their data), and you shouldn't have to run cleanup (which you normally have to do after increasing node count to instruct the old nodes to forget data for which they are no longer responsible). The data responsibility hasn't changed for any node: all nodes are still responsible for all data.
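The reasoning can be illustrated with a deliberately simplified model: track the share of data each existing node is responsible for, and flag cleanup when that share shrinks and repair when it grows. This is an assumption-laden sketch (even data distribution, hypothetical function names), not a description of what Cassandra does internally, and it does not replace the test-cluster check suggested above.

```python
from fractions import Fraction


def share(nodes: int, rf: int) -> Fraction:
    """Fraction of all data each node is responsible for (even distribution assumed)."""
    return Fraction(min(rf, nodes), nodes)


def follow_up_work(steps: list[tuple[int, int]]) -> list[str]:
    """Given successive (nodes, rf) states, report what the existing nodes need."""
    work = []
    for (n0, rf0), (n1, rf1) in zip(steps, steps[1:]):
        before, after = share(n0, rf0), share(n1, rf1)
        if after < before:
            work.append("cleanup")  # old nodes hold data they no longer own
        elif after > before:
            work.append("repair")   # old nodes own data they do not yet hold
    return work


node_first = [(2, 2), (3, 2), (3, 3)]  # add the node, then raise RF
rf_first = [(2, 2), (2, 3), (3, 3)]    # raise RF, then add the node

print(follow_up_work(node_first))  # ['cleanup', 'repair']
print(follow_up_work(rf_first))    # []
```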

(In reply to Neha Trivedi's message of Mon, Apr 27, 2015 at 9:19 PM.)

Re: Best Practice to add a node in a Cluster

Neha Trivedi
Interesting, Eric!
I'm not sure whether this would be allowed: altering the keyspace to RF=3 and then adding a node.

(In reply to Eric Stevens's message of Tue, Apr 28, 2015 at 8:54 PM.)