Adding new node to Cassandra cluster is too slow

Pranay Agarwal
Hi, 

I have a 14-node Cassandra cluster; each node has around 50 GB of data. I added 3 new nodes to the cluster and I can see their status as UJ. They have been in that state for almost a day now, and their data size seems to be unchanged as well. There is almost no CPU or disk usage on them either.

Adding new nodes can't be this slow, or there is no benefit to scaling up or down in real time as request load changes.


-Pranay

Re: Adding new node to Cassandra cluster is too slow

Robert Coli-3
On Thu, Mar 19, 2015 at 5:32 PM, Pranay Agarwal <[hidden email]> wrote:
I have a 14-node Cassandra cluster; each node has around 50 GB of data. I added 3 new nodes to the cluster and I can see their status as UJ. They have been in that state for almost a day now, and their data size seems to be unchanged as well. There is almost no CPU or disk usage on them either.
 
Adding multiple nodes to a cluster simultaneously is not supported until 2.1.1 [1]. Usually one or more of the bootstraps fails and hangs forever. This seems to be what has happened to you.

To resolve:

1) stop each of the bootstrapping nodes
2) wipe their data directories completely
3) verify that they do not show up in gossip on the other nodes
4) bootstrap them again, one at a time
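
Whether the one-at-a-time restriction applies depends on the release in use. A minimal Python sketch of that check, taking the 2.1.1 threshold from the statement above as given rather than independently verified; the function name is hypothetical, and the input is assumed to be a plain dotted version string such as 2.1.0.

```python
def supports_simultaneous_bootstrap(release_version: str) -> bool:
    """Return True if the version is at or above 2.1.1.

    The 2.1.1 threshold comes from the reply above and is an assumption here.
    Suffixes such as "-SNAPSHOT" are not handled by this sketch.
    """
    parts = tuple(int(p) for p in release_version.split("."))
    return parts >= (2, 1, 1)


print(supports_simultaneous_bootstrap("2.1.0"))  # False
print(supports_simultaneous_bootstrap("2.1.3"))  # True
```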

=Rob



Re: Adding new node to Cassandra cluster is too slow

Pranay Agarwal
Thanks Rob, you are right. I am using ReleaseVersion: 2.1.0

What do you mean by point 3? Also, does "one at a time" mean waiting until nodetool status shows the new node change from UJ to UN?


Re: Adding new node to Cassandra cluster is too slow

Pranay Agarwal
Also, the new nodes (3 of them, in UJ state) are showing some data (~10 GB). Is there any chance of data loss if I stop Cassandra on them?


Re: Adding new node to Cassandra cluster is too slow

Rahul Neelakantan
You won't lose data unless you have run nodetool cleanup on the existing nodes.

Rahul


Re: Adding new node to Cassandra cluster is too slow

Pranay Agarwal
Thanks Rahul, you are right. Until a node has completely joined the ring, there is no data dependency on it.



Re: Adding new node to Cassandra cluster is too slow

Pranay Agarwal
Also, how long does it typically take for a node to join? I have 1 TB of data in total in a 15-node Cassandra cluster.


Re: Adding new node to Cassandra cluster is too slow

Robert Coli-3
On Thu, Mar 19, 2015 at 6:02 PM, Pranay Agarwal <[hidden email]> wrote:
What do you mean by point 3? Also, does "one at a time" mean waiting until nodetool status shows the new node change from UJ to UN?

Point 3 means: look at nodetool status, nodetool ring, nodetool info, etc. on the other cluster nodes and make sure the node you just stopped isn't in their list of, for example, UJ hosts.
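
As a rough illustration of that check, the Python sketch below parses text in the general shape of `nodetool status` output and reports which stopped nodes are still listed. The function names, the sample addresses, and the exact column layout are assumptions made for the example; real output may differ between versions.

```python
def nodes_by_state(status_output: str) -> dict:
    """Map each node address to its two-letter code (e.g. "UN", "UJ")."""
    states = {}
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows start with a two-letter code:
        # Up/Down followed by Normal/Leaving/Joining/Moving.
        if (len(fields) >= 2 and len(fields[0]) == 2
                and fields[0][0] in "UD" and fields[0][1] in "NLJM"):
            states[fields[1]] = fields[0]
    return states


def still_listed(status_output: str, stopped_addresses) -> list:
    """Return the stopped addresses that this node's view still contains."""
    states = nodes_by_state(status_output)
    return [addr for addr in stopped_addresses if addr in states]


# Illustrative output only; addresses, loads, and host IDs are made up.
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns   Host ID    Rack
UN  10.0.0.1   50.1 GB   256     7.1%   aaaa-1111  rack1
UJ  10.0.0.15  10.2 GB   256     ?      bbbb-2222  rack1
"""

print(still_listed(SAMPLE, ["10.0.0.15", "10.0.0.16"]))  # ['10.0.0.15']
```

An empty list from every existing node would suggest the stopped nodes are gone from the cluster's view and it is safe to bootstrap again.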

"One at a time" means that, if you can afford it, I would wait for each node to reach UN before starting the next. If not, wait a few minutes between joins.
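
The waiting step could be sketched in Python roughly as follows. This is only an outline under stated assumptions: `get_state` and `start_node` are hypothetical caller-supplied hooks (for example, one wrapping a `nodetool status` call and one starting the Cassandra service), and the polling interval and timeout are arbitrary defaults.

```python
import time


def wait_until_normal(get_state, address, poll_seconds=60,
                      timeout_seconds=6 * 3600, sleep=time.sleep):
    """Poll until `address` reports "UN"; True on success, False on timeout.

    `get_state` is a hypothetical callable returning the node's current
    two-letter code, or None if the node is not listed.
    """
    waited = 0
    while waited <= timeout_seconds:
        if get_state(address) == "UN":
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    return False


def bootstrap_one_at_a_time(addresses, start_node, get_state, **wait_kwargs):
    """Start each node only after the previous one has reached UN."""
    joined = []
    for address in addresses:
        start_node(address)  # hypothetical hook that starts Cassandra on the node
        if not wait_until_normal(get_state, address, **wait_kwargs):
            break  # stop rather than start another join behind a stuck one
        joined.append(address)
    return joined
```

Passing the hooks in as arguments keeps the loop independent of how nodes are actually started or queried in a given environment.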

Also, 2.1.0 is super broken, read this and consider using 1.2.x.


=Rob


Re: Adding new node to Cassandra cluster is too slow

Pranay Agarwal
Thanks a lot Rob. 

I guess I now have to decide whether it's better to upgrade to 2.1.6+ or downgrade to a stable release, and find a safe way to do that.



Re: Adding new node to Cassandra cluster is too slow

Robert Coli-3
On Fri, Mar 20, 2015 at 3:57 PM, Pranay Agarwal <[hidden email]> wrote:
I guess I now have to decide whether it's better to upgrade to 2.1.6+ or downgrade to a stable release, and find a safe way to do that.

You can't downgrade across major versions; you'd have to read out everything from the "new" cluster and write it to a different "old" cluster.

If I were you, I would likely just prioritize upgrading to 2.1.3, and then immediately 2.1.4 when it comes out, etc.

=Rob
 

Re: Adding new node to Cassandra cluster is too slow

Pranay Agarwal
Thanks, will do that. 

Also, given that the new nodes receive some data (~10 GB) and then get stuck, the configuration and process I am using must be correct, and there is no immediate fix besides upgrading the version?


Re: Adding new node to Cassandra cluster is too slow

Robert Coli-3
On Fri, Mar 20, 2015 at 4:08 PM, Pranay Agarwal <[hidden email]> wrote:
Also, given that the new nodes receive some data (~10 GB) and then get stuck, the configuration and process I am using must be correct, and there is no immediate fix besides upgrading the version?

That is correct; it sounds like the config is correct and then streaming fails.

=Rob
 

Re: Adding new node to Cassandra cluster is too slow

Pranay Agarwal
Thanks Rob. 

Anyway, ideally a new node joining with its ~50 GB share of data should be done in a couple of minutes, or an hour tops, right?
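
A back-of-the-envelope figure can be worked out from the data size and a streaming rate. The Python sketch below assumes one steady aggregate rate in megabits per second; the 200 Mbit/s value is only an example (Cassandra's outbound streaming throttle, `stream_throughput_outbound_megabits_per_sec` in cassandra.yaml, is one thing that bounds the real rate). It ignores multiple senders, compaction, and network limits, so actual join times may differ considerably.

```python
def estimated_stream_seconds(data_gb: float,
                             throughput_megabits_per_sec: float) -> float:
    """Time to move `data_gb` gigabytes at a steady rate (1 GB = 8000 megabits)."""
    if throughput_megabits_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return data_gb * 8000 / throughput_megabits_per_sec


seconds = estimated_stream_seconds(50, 200)  # example rate, an assumption
print(f"{seconds / 60:.0f} minutes")  # 33 minutes
```

Under these assumptions ~50 GB is on the order of half an hour, so a join that shows no progress for a day looks more like a stalled stream than a slow one.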
