
Multi DC setup

Multi DC setup

Cassa L
I am trying to understand multi-DC setup for Cassandra. As I understand it, in this setup the replicas exist in the same cluster ring, but the nodes are physically distributed across DCs. Is this correct?
I have two different cluster rings in two DCs and want to replicate data bidirectionally. Both have the same keyspace. They take data traffic from different sources, but we want to make sure the data exists in both rings. How could we achieve this?

Thanks,
L.

Re: Multi DC setup

Milind Parikh

Why have two rings? Cassandra manages the replication for you. One ring with physical nodes in two DCs might be a better option. Of course, depending on the inter-DC failure characteristics, you might need to endure split-brain for a while.

/***********************
sent from my android...please pardon occasional typos as I respond @ the speed of thought
************************/

Re: Multi DC setup

Cassa L
We already have two separate rings. The idea of bidirectional sync is that if one ring is down, we can still send the traffic to the other ring. When the original cluster comes back, it will pick up the data from the available cluster. I'm not sure whether it makes sense to keep separate rings or to combine the two rings into one.



Re: Multi DC setup

Peter Schuller
> We already have two separate rings. Idea of bidirectional sync is, if one
> ring is down, we can still send the traffic to other ring. When original
> cluster comes back, it will pick up the data from available cluster. I'm not
> sure if it makes sense to have separate rings or combine these two rings
> into one.

Cassandra doesn't have support for synchronizing data between two
different rings. The multi-DC support in Cassandra amounts to having a
single ring containing all nodes from all data centers. Cassandra is
told (by configuring the snitch, for example through a properties file)
which nodes are in which data center. Using the
NetworkTopologyStrategy, you then distribute replicas across the
DCs as you see fit.
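
As a rough illustration of the two pieces of configuration described above, here is a minimal Python sketch that renders snitch entries in the `ip=DC:RACK` form used by the PropertyFileSnitch topology file, plus a per-DC replica-count map of the kind NetworkTopologyStrategy is given. The IP addresses, the DC and rack names, and the helper names `render_topology` and `replication_options` are all hypothetical and not taken from this thread.

```python
# Hypothetical node addresses and DC/rack names, for illustration only.
NODES = {
    "10.0.1.1": ("DC1", "RAC1"),
    "10.0.1.2": ("DC1", "RAC1"),
    "10.0.2.1": ("DC2", "RAC1"),
    "10.0.2.2": ("DC2", "RAC1"),
}


def render_topology(nodes, default=("DC1", "RAC1")):
    """Render one `ip=DC:RACK` line per node, plus a `default=` fallback line."""
    lines = [f"{ip}={dc}:{rack}" for ip, (dc, rack) in sorted(nodes.items())]
    lines.append(f"default={default[0]}:{default[1]}")
    return "\n".join(lines)


def replication_options(nodes, copies_per_dc=1):
    """Build a {data center: replica count} map covering every DC in `nodes`."""
    dcs = sorted({dc for dc, _rack in nodes.values()})
    return {dc: copies_per_dc for dc in dcs}


if __name__ == "__main__":
    print(render_topology(NODES))
    print(replication_options(NODES))
```

The DC names in the replica-count map have to match the names the snitch reports; how the map is attached to a keyspace depends on the Cassandra version and client in use.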

Cassandra will then prefer local nodes for read and write operations,
and you can use e.g. the LOCAL_QUORUM consistency level to get
quorum-like consistency within a DC.
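
To make the difference between a cluster-wide quorum and a DC-local one concrete, here is a small Python sketch of the arithmetic, assuming the usual definition of a quorum as more than half of the replicas involved. The function names and the replica counts in the example are illustrative assumptions, not Cassandra code.

```python
def quorum(replicas):
    """Smallest number of replicas that is a strict majority."""
    return replicas // 2 + 1


def required_acks(level, rf_by_dc, local_dc):
    """Replica acknowledgements needed before a request succeeds.

    rf_by_dc maps data center name -> replica count for the keyspace.
    local_dc is the data center of the coordinator handling the request.
    """
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # Majority of all replicas, counted across every data center.
        return quorum(sum(rf_by_dc.values()))
    if level == "LOCAL_QUORUM":
        # Majority of the replicas in the coordinator's data center only.
        return quorum(rf_by_dc[local_dc])
    raise ValueError(f"unsupported consistency level: {level}")


if __name__ == "__main__":
    rf = {"DC1": 3, "DC2": 3}  # hypothetical replica counts
    for level in ("ONE", "QUORUM", "LOCAL_QUORUM"):
        print(level, required_acks(level, rf, "DC1"))
```

With three replicas in each of two DCs, QUORUM needs 4 acknowledgements and so has to wait on the remote DC, while LOCAL_QUORUM needs only 2, both from the local DC.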

Google/check wiki/read docs about NetworkTopologyStrategy and
PropertyFileSnitch. I don't have a good link to multi-dc off hand
(anyone got a good link to suggest that goes through this?).

--
/ Peter Schuller (@scode on twitter)

Re: Multi DC setup

Brandon Williams
On Tue, Oct 11, 2011 at 2:36 AM, Peter Schuller
<[hidden email]> wrote:
> Google/check wiki/read docs about NetworkTopologyStrategy and
> PropertyFileSnitch. I don't have a good link to multi-dc off hand
> (anyone got a good link to suggest that goes through this?).

http://www.datastax.com/docs/0.8/cluster_architecture/replication is
pretty good imo.

-Brandon

Re: Multi DC setup

Eric tamme
In reply to this post by Peter Schuller

>> We already have two separate rings. Idea of bidirectional sync is, if one
>> ring is down, we can still send the traffic to other ring. When original
>> cluster comes back, it will pick up the data from available cluster. I'm not
>> sure if it makes sense to have separate rings or combine these two rings
>> into one.
I am not sure you fully understand how Cassandra is supposed to work -
you do not need two rings to have two complete sets of data that you can
"hot cutover" between.

> Cassandra doesn't have support for synchronizing data between two
> different rings. The multi-dc support in Cassandra amounts to having a
> single ring containing all nodes from all data centers. Cassandra is
> told (by configuring the snitch, such as through a property files)
> which nodes are in which data center. Using the
> NetworkTopologyStrategy, you then make sure to distribute replicas in
> DC:s as you see fit.
Using NTS you can configure a single ring into multiple "logical
rings".  This is effectively what the property file snitch does in
conjunction with NTS.

I gave a presentation on the NTS internals, and replicating data across
geographically distributed data centers. You can find the slides here
http://files.meetup.com/1794037/NTS_presentation.pdf

Also, Edward Capriolo's book "Cassandra High Performance Cookbook" has
some recipes for using NTS.

I currently have 4 nodes in two data centers, and I use NTS with the
property file snitch to write one copy of the data to each DC (one
replica node per DC), so that in the event of a total DC failure we can
still get to the data.  If you set the write consistency level to ONE,
the first write is "local" and the write to the other replica is
asynchronous, so you get fast writes with distribution.
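
The "one copy per DC" placement described above can be sketched as a walk around a single ring. This is a deliberately simplified Python model, not the actual NetworkTopologyStrategy implementation: it ignores racks entirely, and the tokens and node names are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical 4-node ring spanning two data centers, sorted by token.
RING = [
    (0, "a1", "DC1"),
    (25, "b1", "DC2"),
    (50, "a2", "DC1"),
    (75, "b2", "DC2"),
]


def replicas_for(token, ring, rf_by_dc):
    """Walk the ring clockwise from the token's owner and pick nodes
    until every data center has its requested number of replicas."""
    tokens = [t for t, _node, _dc in ring]
    # The owner is the first node whose token is >= the key's token,
    # wrapping around to the start of the ring if there is none.
    start = bisect_left(tokens, token) % len(ring)
    remaining = dict(rf_by_dc)
    chosen = []
    for step in range(len(ring)):
        _t, node, dc = ring[(start + step) % len(ring)]
        if remaining.get(dc, 0) > 0:
            chosen.append(node)
            remaining[dc] -= 1
    return chosen


if __name__ == "__main__":
    print(replicas_for(10, RING, {"DC1": 1, "DC2": 1}))
```

For a key with token 10, this picks `b1` in DC2 and then `a2` in DC1, so each data center ends up holding a full copy even though all four nodes belong to one ring.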

-Eric

