Question about cassandra (replication)


Question about cassandra (replication)

Harold Lim

Hi All,

I posted a similar message on the Google Groups page. Hopefully, I'll get more feedback here.


I just started reading about Dynamo and Cassandra, and I am thinking
about possibly using Cassandra for my system.

The Dynamo paper mentions a preference list for each key. Is this
preference list configurable?

How does Cassandra choose which nodes are in the preference list?
Also, is the number of replicas for each key/column configurable? For
example, can I set the replication factor per key/value?

I read that Cassandra has optimistic replication. What exactly does
that mean? Under the hood, how does Cassandra maintain/detect the
number of replicas? Does it aggressively re-replicate an item when it
detects that the item's replica count has dropped below the
specified replication factor?

Is the replication strategy (when to replicate, aggressiveness, etc.)
configurable too?






Thanks,
Harold


     

Re: Question about cassandra (replication)

Jonathan Ellis-3
Rather than post the same question verbatim, it would be more useful
if you explained what you still don't understand after Alexander and
Sandeep's explanations on the google group.

(http://groups.google.com/group/cassandra-user/browse_thread/thread/4330e415e959e9d9)

On Thu, Jun 25, 2009 at 9:11 AM, Harold Lim<[hidden email]> wrote:

> [snip]

Re: Question about cassandra (replication)

Harold Lim

Hi,

Is the replication factor configurable? For example, can I configure the replication factor per column family (e.g., 5 for column family a and 3 for column family b)?

Also, I am interested in the replication details. Sandeep wrote:
"When there's a failure and the #of replicas for a given key goes down,
Cassandra does not aggressively create a new copy for the data. The
assumption is that the failed node will be replaced soon enough, and work
can continue with the other 2 replicas."

When and how does Cassandra re-replicate when the replica count for a particular piece of data falls below the replication factor? How does it monitor the replica count for each piece of data?


-Harold






--- On Thu, 6/25/09, Jonathan Ellis <[hidden email]> wrote:

> From: Jonathan Ellis <[hidden email]>
> Subject: Re: Question about cassandra (replication)
> To: [hidden email]
> Date: Thursday, June 25, 2009, 10:17 AM
> [snip]


     

Re: Question about cassandra (replication)

Jonathan Ellis-3
On Thu, Jun 25, 2009 at 10:10 AM, Harold Lim<[hidden email]> wrote:
>
> Hi,
>
> Is the replication factor configurable? For example, can I configure the replication factor per column family (e.g., 5 for column family a and 3 for column family b)?

It is currently configurable only globally. It may make sense to
configure it on a table/namespace basis. IMO it does not make sense on a
column-family (CF) basis.
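
To make the "one global setting" point concrete, here is a minimal sketch of how a single cluster-wide replication factor N can determine each key's replica set in the Dynamo model: hash the key onto a ring and take the first N distinct nodes walking clockwise. This is an illustration only, not Cassandra's code. The names `REPLICATION_FACTOR`, `token`, and `preference_list`, the node names, and the use of MD5 are all assumptions made for the example, and it ignores virtual nodes and rack awareness.

```python
import bisect
import hashlib
from typing import List

# Hypothetical cluster-wide setting: the same N applies to every key.
REPLICATION_FACTOR = 3


def token(value: str) -> int:
    """Map a node name or key to a ring position (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def preference_list(key: str, nodes: List[str], n: int = REPLICATION_FACTOR) -> List[str]:
    """Return the first n distinct nodes found walking clockwise from the key."""
    ring = sorted((token(node), node) for node in nodes)
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect.bisect_left(tokens, token(key)) % len(ring)
    count = min(n, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


if __name__ == "__main__":
    cluster = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    print(preference_list("user:42", cluster))
```

Because N is a single constant here, every key gets the same number of replicas; a per-table setting would amount to looking up N from the table's configuration before calling the function.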

> Also, I am interested in the replication details. Sandeep wrote:
> "When there's a failure and the #of replicas for a given key goes down,
> Cassandra does not aggressively create a new copy for the data. The
> assumption is that the failed node will be replaced soon enough, and work
> can continue with the other 2 replicas."
>
> When and how does Cassandra re-replicate when the replica count for a particular piece of data falls below the replication factor? How does it monitor the replica count for each piece of data?

Currently it re-replicates (repairs) lazily.  This is called "read
repair" and we follow essentially the model given in the Dynamo paper.
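
As a rough illustration of the idea (not Cassandra's implementation), the sketch below repairs on the read path: the reader collects each replica's version of a key, picks the newest, and writes it back to any replica that is missing it or holds an older one. The `Versioned`, `Replica`, and `read_with_repair` names are hypothetical, and the plain timestamp comparison is a simplification swapped in for the vector-clock reconciliation described in the Dynamo paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # larger timestamp wins in this sketch


class Replica:
    """Toy in-memory replica that stands in for a real storage node."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, Versioned] = {}

    def get(self, key: str) -> Optional[Versioned]:
        return self.data.get(key)

    def put(self, key: str, item: Versioned) -> None:
        current = self.data.get(key)
        if current is None or item.timestamp > current.timestamp:
            self.data[key] = item


def read_with_repair(key: str, replicas: List[Replica]) -> Optional[Versioned]:
    """Read from every replica, return the newest version, repair stale ones."""
    responses = [(replica, replica.get(key)) for replica in replicas]
    found = [item for _, item in responses if item is not None]
    if not found:
        return None
    newest = max(found, key=lambda item: item.timestamp)
    for replica, item in responses:
        if item is None or item.timestamp < newest.timestamp:
            replica.put(key, newest)  # the lazy repair step
    return newest


if __name__ == "__main__":
    a, b, c = Replica("a"), Replica("b"), Replica("c")
    a.put("k", Versioned("v2", 2))
    b.put("k", Versioned("v1", 1))
    # c starts empty, like a replacement for a failed node.
    print(read_with_repair("k", [a, b, c]))
    print(c.get("k"))
```

Note that nothing here tracks a per-key replica count: a replica only catches up on a key when that key is next read, which is what makes the repair lazy.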

Non-lazy repair is being worked on at
https://issues.apache.org/jira/browse/CASSANDRA-193

-Jonathan