Hi,

I was wondering what the viability of running cassandra on ec2 was.
I believe that it currently runs on some pretty hefty hardware at
facebook, so I'm wondering what the minimum hardware config is
(in other words can I run it on a cluster of 2core 4GB machines)?
Also, running on Amazon means no multicast, network partitions and
machines just disappearing. How does cassandra deal with these
constraints/failures?

Thanks for information,

-Anthony

--
------------------------------------------------------------------------
Anthony Molinaro

IMO the biggest downside to running on EC2 is that IO is terrible. I
haven't done benchmarks, but anecdotally disk performance in
particular seems like an order of magnitude slower than you'd get on
non-virtual disks. So that is worth investigating before assuming
that the price/performance on EC2 is what you think it is.

Other than that, Cassandra is designed to emphasize availability so it
should work fine in the situations you describe. Hinted handoff in
particular will get writes to the right nodes quickly when machines
come back online. (However, Cassandra is not yet good at dealing with
machines becoming permanently dead.)

Of course if _all_ of some keys' replicas are temporarily partitioned
off from you, you won't be able to read that data until they are
visible again.

-Jonathan

On Sat, Jun 13, 2009 at 11:20 AM, Anthony Molinaro wrote:
> Hi,
>
> I was wondering what the viability of running cassandra on ec2 was.
> I believe that it currently runs on some pretty hefty hardware at
> facebook, so I'm wondering what the minimum hardware config is
> (in other words can I run it on a cluster of 2core 4GB machines)?
> Also, running on Amazon means no multicast, network partitions and
> machines just disappearing. How does cassandra deal with these
> constraints/failures?
>
> Thanks for information,
>
> -Anthony
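
As a rough illustration of the hinted handoff behaviour described above, here is a toy Python model. It is not Cassandra's implementation: the `Node` and `ToyCluster` classes, the checksum-based replica placement, and the choice of the first live node as the hint holder are all assumptions made for this sketch. It only shows the idea that a write aimed at an unreachable replica is parked elsewhere as a hint and replayed when that replica returns.

```python
from collections import defaultdict


class Node:
    """A toy storage node that can be marked up or down."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}                   # key -> value stored on this node
        self.hints = defaultdict(list)   # down node name -> [(key, value), ...]


class ToyCluster:
    """Hypothetical cluster model; names and placement are illustrative only."""

    def __init__(self, names, replication_factor=2):
        self.nodes = [Node(n) for n in names]
        self.rf = replication_factor

    def replicas_for(self, key):
        # Assumed placement: start at a checksum-derived index, take rf nodes.
        start = sum(key.encode()) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(self.rf)]

    def write(self, key, value):
        live = [n for n in self.nodes if n.up]
        if not live:
            raise RuntimeError("no live nodes to accept the write")
        for replica in self.replicas_for(key):
            if replica.up:
                replica.data[key] = value
            else:
                # Replica unreachable: park the write as a hint on a live node.
                live[0].hints[replica.name].append((key, value))

    def read(self, key):
        # Only replicas that are currently reachable can serve the read.
        for replica in self.replicas_for(key):
            if replica.up and key in replica.data:
                return replica.data[key]
        return None

    def recover(self, name):
        # Bring the node back and replay every hint held for it.
        target = next(n for n in self.nodes if n.name == name)
        target.up = True
        for node in self.nodes:
            for key, value in node.hints.pop(name, []):
                target.data[key] = value
```

In this model a key stays readable as long as at least one of its replicas is up, and `read` returns `None` when every replica for the key is down, which mirrors the caveat about all replicas being partitioned off.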

And any problems with small memory boxes? I see some chatter on the
cassandra development list about OOM errors. Are they more prevalent
on smaller footprint boxes?

Thanks again,

-Anthony

On Sat, Jun 13, 2009 at 11:33:21AM -0500, Jonathan Ellis wrote:
> IMO the biggest downside to running on EC2 is that IO is terrible. I
> haven't done benchmarks, but anecdotally disk performance in
> particular seems like an order of magnitude slower than you'd get on
> non-virtual disks. So that is worth investigating before assuming
> that the price/performance on EC2 is what you think it is.
>
> Other than that, Cassandra is designed to emphasize availability so it
> should work fine in the situations you describe. Hinted handoff in
> particular will get writes to the right nodes quickly when machines
> come back online. (However, Cassandra is not yet good at dealing with
> machines becoming permanently dead.)
>
> Of course if _all_ of some keys' replicas are temporarily partitioned
> off from you, you won't be able to read that data until they are
> visible again.
>
> -Jonathan

https://issues.apache.org/jira/browse/CASSANDRA-208 is probably the
issue you are referring to. It is fixed in trunk.

Our goal is to run most workloads fine with 1GB of heap out of the
box, which should be fine even on a small EC2 instance iirc.

See http://wiki.apache.org/cassandra/MemtableThresholds for tuning
memory use.

-Jonathan

On Sat, Jun 13, 2009 at 3:10 PM, Anthony Molinaro wrote:
> And any problems with small memory boxes? I see some chatter on the
> cassandra development list about OOM errors. Are they more prevalent
> on smaller footprint boxes?
>
> Thanks again,
>
> -Anthony