Adding nodes to existing cluster

Adding nodes to existing cluster

or.sher1
Hi all,
In the near future I'll need to add more than 10 nodes to a 2.0.9
cluster (using vnodes).
I read this documentation on the DataStax website:
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html

At one point it says:
"If you are using racks, you can safely bootstrap two nodes at a time
when both nodes are on the same rack."

And at another it says:
"Start Cassandra on each new node. Allow two minutes between node
initializations. You can monitor the startup and data streaming
process using nodetool netstats."

We're not using a rack configuration, and from reading this
documentation I'm not really sure whether it's safe for us to bootstrap
all the nodes together (with two minutes between each one).
I really hate the thought of doing it one by one; I assume it will take
more than 6 hours per node.

What do you say?
--
Or Sher
Re: Adding nodes to existing cluster

Carlos Rolo
Start one node at a time. Wait 2 minutes before starting each node.


How much data and how many nodes do you have already? Depending on that, streaming the data can put stress on the resources you have.
I would recommend starting one and monitoring it; if things are OK, add another one, and so on.
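The "start one, monitor, then add another" approach could be scripted roughly as below. This is only a sketch: `start_node`, `has_joined` and `cluster_healthy` are hypothetical callables you would supply yourself (for instance, `has_joined` might check that the node shows as UN in `nodetool status`, and `cluster_healthy` might look at your own latency or I/O metrics). Nothing here comes from Cassandra itself.

```python
import time
from typing import Callable, Iterable, List


def bootstrap_sequentially(
    new_nodes: Iterable[str],
    start_node: Callable[[str], None],
    has_joined: Callable[[str], bool],
    cluster_healthy: Callable[[], bool],
    poll_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start nodes one by one and return the nodes that finished joining.

    All callables are hypothetical hooks supplied by the operator.
    """
    joined: List[str] = []
    for node in new_nodes:
        # Stop before starting another node if the cluster looks stressed;
        # the remaining nodes are left for a later attempt.
        if not cluster_healthy():
            break
        start_node(node)
        # Wait until this node has finished bootstrapping before moving on.
        while not has_joined(node):
            sleep(poll_seconds)
        joined.append(node)
    return joined
```

The function returns early rather than raising, so the caller can see how far the rollout got and resume with the remaining nodes once things have calmed down.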

Regards,

Carlos Juzarte Rolo
Cassandra Consultant
 
Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649

Re: Adding nodes to existing cluster

or.sher1
Thanks for the response.
Sure, we'll monitor as we add nodes.
We currently have 6 nodes in each DC (we have 2 DCs).
Each node holds ~800 GB.

Do you know how rack configurations are relevant here?
Do you see any reason to bootstrap them one by one if we're not using
rack awareness?
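As a rough back-of-the-envelope check on those figures, the sketch below estimates how much data each node in one DC would own after new nodes join. It assumes vnodes spread ownership evenly and that the total data volume stays constant during the expansion; both are simplifying assumptions, and the node counts tried in the loop are purely illustrative.

```python
def per_node_load_gb(existing_nodes: int, load_per_node_gb: float, added_nodes: int) -> float:
    """Approximate per-node load after adding nodes, assuming even ownership."""
    total_gb = existing_nodes * load_per_node_gb
    return total_gb / (existing_nodes + added_nodes)


# One DC from the thread: 6 nodes holding ~800 GB each.
for added in (1, 2, 5):
    estimate = round(per_node_load_gb(6, 800, added))
    print(f"{added} new node(s): ~{estimate} GB per node")
```

Under these assumptions the first new node in a DC would have to receive close to 700 GB, which gives a sense of why a single bootstrap can take hours and why the streaming load on the existing nodes matters.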


Re: Adding nodes to existing cluster

Carlos Rolo
Regardless of the snitch, data needs to travel to the new nodes (plus all the keyspace information that goes via gossip). So I wouldn't bootstrap them all at once, if only because of the network traffic generated.

Don't forget to run nodetool cleanup on the old nodes once all the new nodes are in place, to reclaim disk space.
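The cleanup step could be run against the old nodes one at a time, along the lines of the sketch below. It assumes `nodetool` is on the PATH of the machine running the script and that each node's JMX port is reachable from it; the host addresses are placeholders, not values from this thread.

```python
import subprocess
from typing import Callable, List, Sequence


def cleanup_command(host: str) -> List[str]:
    """Build the nodetool invocation that runs cleanup on one node."""
    return ["nodetool", "-h", host, "cleanup"]


def run_cleanup(hosts: Sequence[str], runner: Callable = subprocess.run) -> None:
    """Run cleanup on each old node in turn, stopping at the first failure."""
    for host in hosts:
        # check=True makes subprocess.run raise if nodetool exits non-zero.
        runner(cleanup_command(host), check=True)


if __name__ == "__main__":
    # Placeholder addresses for the pre-existing nodes.
    run_cleanup(["10.0.0.1", "10.0.0.2"])
```

Running the nodes sequentially rather than in parallel is a deliberate choice here, since cleanup rewrites SSTables and adds I/O load on each node it touches.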

Re: Adding nodes to existing cluster

Colin Clark-2
unsubscribe


RE: Adding nodes to existing cluster

Matthew Johnson

Hi Colin,

To remove your address from the list, send a message to:

   <[hidden email]>

Cheers,
Matt

From: Colin Clark [mailto:[hidden email]]
Sent: 20 April 2015 14:10
To: [hidden email]
Subject: Re: Adding nodes to existing cluster

unsubscribe

Re: Adding nodes to existing cluster

or.sher1
In reply to this post by Carlos Rolo
OK.
Thanks.
I'll monitor resource usage (network, memory, CPU, I/O) as I go
and try to bootstrap them in chunks that don't seem to have a bad
impact.
Will do regarding the cleanup.

Thanks!

Re: Adding nodes to existing cluster

Sebastian Estevez
The documentation is referring to Consistent Range Movements.

There is a change in 2.1 that won't allow you to bootstrap multiple nodes at the same time unless you explicitly turn off consistent range movements. Check out the jira:


All the best,



Sebastián Estévez

Solutions Architect | 954 905 8615 | [hidden email]
