I am trying to understand the multi-DC setup for Cassandra. As I understand it, in this setup the replicas exist in the same cluster ring, but the nodes are physically distributed across DCs. Is this correct?
I have two different cluster rings in two DCs and want to replicate data bidirectionally. Both have the same keyspace. They take data traffic from different sources, but we want to make sure the data exists in both rings. How can we achieve this? Thanks, L. |
Why have two rings? Cassandra manages the replication for you. One ring with physical nodes in two DCs might be a better option. Of course, depending on the inter-DC failure characteristics, you might need to endure split-brain for a while.
On Oct 10, 2011 10:09 PM, "Cassa L" <[hidden email]> wrote: |
We already have two separate rings. The idea of bidirectional sync is that if one ring is down, we can still send traffic to the other ring. When the original cluster comes back, it will pick up the data from the available cluster. I'm not sure whether it makes sense to keep separate rings or to combine these two rings into one.
On Mon, Oct 10, 2011 at 10:17 PM, Milind Parikh <[hidden email]> wrote:
|
> We already have two separate rings. Idea of bidirectional sync is, if one
> ring is down, we can still send the traffic to other ring. When original
> cluster comes back, it will pick up the data from available cluster. I'm not
> sure if it makes sense to have separate rings or combine these two rings
> into one.

Cassandra doesn't support synchronizing data between two different rings. The multi-DC support in Cassandra amounts to having a single ring containing all nodes from all data centers. Cassandra is told which nodes are in which data center by configuring the snitch, for example through a properties file. Using NetworkTopologyStrategy, you then distribute replicas across DCs as you see fit. Cassandra will then prefer local nodes for read and write operations, and you can use, e.g., the LOCAL_QUORUM consistency level to get quorum-like consistency within a DC.

Google, check the wiki, or read the docs about NetworkTopologyStrategy and PropertyFileSnitch. I don't have a good link on multi-DC offhand (does anyone have a good link to suggest that goes through this?).

--
/ Peter Schuller (@scode on twitter) |
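Editorial illustration of the configuration Peter describes above: a minimal Python sketch, not taken from the thread. It renders topology lines in the `ip=DC:RACK` form that PropertyFileSnitch reads, builds a keyspace definition using NetworkTopologyStrategy, and computes how many local replicas LOCAL_QUORUM needs. The node addresses, the DC and rack names, and the keyspace name `demo` are all hypothetical, and the keyspace statement uses the CQL map syntax of later Cassandra releases, which differs from what the 0.8-era tools in this thread accepted.

```python
# Hypothetical node addresses and DC/rack names, for illustration only.
NODES = {
    "10.0.1.1": ("DC1", "RAC1"),
    "10.0.1.2": ("DC1", "RAC1"),
    "10.0.2.1": ("DC2", "RAC1"),
    "10.0.2.2": ("DC2", "RAC1"),
}


def topology_properties(nodes, default=("DC1", "RAC1")):
    """Render lines in the `ip=DC:RACK` form read by PropertyFileSnitch."""
    lines = [f"{ip}={dc}:{rack}" for ip, (dc, rack) in sorted(nodes.items())]
    # Fallback entry for nodes that are not listed explicitly.
    lines.append(f"default={default[0]}:{default[1]}")
    return "\n".join(lines)


def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement (later-release CQL syntax).

    The DC names must match the names the snitch reports.
    """
    options = ", ".join(
        f"'{dc}': {rf}" for dc, rf in sorted(replicas_per_dc.items())
    )
    return (
        f"CREATE KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


def local_quorum(replicas_in_dc):
    """Replicas in the local DC that must respond for LOCAL_QUORUM."""
    return replicas_in_dc // 2 + 1


if __name__ == "__main__":
    print(topology_properties(NODES))
    print(create_keyspace_cql("demo", {"DC1": 1, "DC2": 1}))
    print(local_quorum(3))
```

With three replicas in a DC, LOCAL_QUORUM waits for two of them; with one replica per DC, as in a two-copy setup, it waits for that single local replica.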
On Tue, Oct 11, 2011 at 2:36 AM, Peter Schuller <[hidden email]> wrote:
> Google/check wiki/read docs about NetworkTopologyStrategy and
> PropertyFileSnitch. I don't have a good link to multi-dc off hand
> (anyone got a good link to suggest that goes through this?).

http://www.datastax.com/docs/0.8/cluster_architecture/replication is pretty good imo.

-Brandon |
In reply to this post by Peter Schuller
>> We already have two separate rings. Idea of bidirectional sync is, if one
>> ring is down, we can still send the traffic to other ring. When original
>> cluster comes back, it will pick up the data from available cluster. I'm not
>> sure if it makes sense to have separate rings or combine these two rings
>> into one.

I am not sure you fully understand how Cassandra is supposed to work. You do not need two rings to have two complete sets of data that you can "hot cutover" between.

> Cassandra doesn't have support for synchronizing data between two
> different rings. The multi-dc support in Cassandra amounts to having a
> single ring containing all nodes from all data centers. Cassandra is
> told (by configuring the snitch, such as through a property files)
> which nodes are in which data center. Using the
> NetworkTopologyStrategy, you then make sure to distribute replicas in
> DC:s as you see fit.

Using NTS you can configure a single ring into multiple "logical rings". This is effectively what the property file snitch does in conjunction with NTS.

I gave a presentation on the NTS internals and on replicating data across geographically distributed data centers. You can find the slides here: http://files.meetup.com/1794037/NTS_presentation.pdf

Edward Capriolio's book "Cassandra High Performance Cookbook" also has some recipes for using NTS.

I currently have 4 nodes in two data centers, and I use NTS with the property file snitch to write 1 copy of the data to each DC (one node per DC), so that in the event of a total DC failure we can still get to the data. If you set the write consistency level to ONE, the first write is "local" and the write to the other replica is asynchronous, so you get fast writes with distribution.

-Eric |
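Editorial illustration of the "one copy per DC" layout Eric describes above: a toy Python sketch, not Cassandra's actual implementation. It walks a single ring clockwise from a key's token and picks nodes until each data center has its configured number of replicas, which is the basic idea behind NetworkTopologyStrategy placement. The ring size, tokens, and node names are hypothetical, and rack awareness is ignored.

```python
from bisect import bisect_left
import hashlib

# Hypothetical 4-node ring: (token, node name, data center).
# Tokens alternate between the two DCs around the ring.
RING = sorted([
    (0, "dc1-a", "DC1"),
    (25, "dc2-a", "DC2"),
    (50, "dc1-b", "DC1"),
    (75, "dc2-b", "DC2"),
])


def token_for(key, ring_size=100):
    """Map a key to a token on the toy ring (simplified hashing)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % ring_size


def replicas_for(token, ring, replicas_per_dc):
    """Walk the ring clockwise and take nodes until every DC is satisfied."""
    tokens = [t for t, _, _ in ring]
    # First node whose token is >= the key's token, wrapping past the end.
    start = bisect_left(tokens, token) % len(ring)
    remaining = dict(replicas_per_dc)
    chosen = []
    for i in range(len(ring)):
        _, node, dc = ring[(start + i) % len(ring)]
        if remaining.get(dc, 0) > 0:
            chosen.append(node)
            remaining[dc] -= 1
    return chosen


if __name__ == "__main__":
    for key in ("user:1", "user:2", "user:3"):
        print(key, replicas_for(token_for(key), RING, {"DC1": 1, "DC2": 1}))
```

Every key ends up on exactly one node in each data center, so either DC holds a complete copy of the data even though all four nodes belong to one ring.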