|
does anybody have experience with these projects?
https://github.com/btoddb/cassandra-queue
https://github.com/btoddb/cassandra-queue-spring
thanks, deno |
|
I tried using Cassandra as a queue once. It's an attractive idea, having the data replicated and durably stored. However, the problems we hit came from trying to use a secondary index on whether or not something had been processed. Since that indexed field had only two possible values (a cardinality of 2), we ended up with a column family with two really wide rows, which Cassandra can handle but really does not perform well with.
That coupled with Hector timeout issues became a real problem for us. My advice now is don't use Cassandra as a queue, but there may be much smarter ways to do it.
Dave
On Wed, Jun 27, 2012 at 11:34 PM, Deno Vichas <[hidden email]> wrote: does anybody have experience with these projects? |
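To illustrate the wide-row problem Dave describes, here is a minimal sketch in plain Python. It does not talk to Cassandra; it only mimics an index that keeps one row per distinct indexed value. The workload size and the `processed` flag layout are assumptions made up for the example.

```python
from collections import defaultdict

def build_index(items):
    """Group item keys by their 'processed' flag, mimicking an
    index that stores one row per distinct indexed value."""
    index = defaultdict(list)
    for key, processed in items:
        index[processed].append(key)
    return index

# Hypothetical workload: 100,000 queue entries, every tenth one unprocessed.
items = [(f"msg-{i}", i % 10 != 0) for i in range(100_000)]
index = build_index(items)

row_count = len(index)  # one index row per distinct flag value
widths = {flag: len(keys) for flag, keys in index.items()}
print(row_count, widths)
```

However many entries the queue holds, the index never has more than two rows, so each row just keeps getting wider.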
|
On 6/28/2012 9:37 AM, David Leimbach wrote:
> That coupled with Hector timeout issues became a real problem for us.
could you share some details on this? we're using hector and we see random timeout warnings in the logs and are not sure how to address them. thanks, deno |
|
Using Cassandra as a queue is generally thought of as a bad idea, owing to the high delete workload. Levelled compaction handles it better but it is still not the best approach.
Depending on your needs consider running http://incubator.apache.org/kafka/
As for the Hector timeouts: first determine whether they are server side or client side timeouts. Then determine what the query was.
Cheers
On 29/06/2012, at 7:02 AM, Deno Vichas wrote:
|
|
is anybody using kafka? what other options are there? currently i need to do around 50,000 (is that a lot?) a minute.
On 7/1/2012 11:39 AM, aaron morton wrote:
> Using Cassandra as a queue is generally thought of as a bad idea, owing to the high delete workload. Levelled compaction handles it better but it is still not the best approach. |
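For comparing the quoted rate against published benchmarks, it can help to convert it to other time units. A quick sketch of the arithmetic, using only the 50,000 per minute figure from the message above and assuming the rate is steady:

```python
PER_MINUTE = 50_000  # figure quoted in the message above

per_second = PER_MINUTE / 60      # average rate if the load is even
per_day = PER_MINUTE * 60 * 24    # total if the rate holds all day

print(f"{per_second:.0f} per second, {per_day:,} per day")
```

Real traffic is rarely this even, so peak rates could be noticeably higher than the average.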
|
lots of folks use Apache Kafka; check out https://cwiki.apache.org/confluence/display/KAFKA/Powered+By for just a few of them
you can read about the performance for yourself http://incubator.apache.org/kafka/performance.html
@ http://www.medialets.com we use Kafka upstream of Cassandra acting like a queue so our workers can do their business logic prior to storing their results in Cassandra, Hadoop & MySQL
this decouples our backend analytics from our forward facing system, keeping our forward facing system (ad serving to mobile devices) as fast as possible and our backend results near realtime (seconds from data coming in)
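A rough sketch of the decoupling pattern described above, with Python's in-memory `queue.Queue` standing in for Kafka and a plain list standing in for the downstream stores. It does not use any Kafka client API, and the event fields and the "business logic" are hypothetical.

```python
import queue
import threading

def front_end(q, events):
    """Forward facing side: enqueue each event and move on."""
    for event in events:
        q.put(event)
    q.put(None)  # sentinel: no more events

def worker(q, store):
    """Backend side: apply business logic, then persist the result."""
    while True:
        event = q.get()
        if event is None:
            break
        # Placeholder business logic: double the click count.
        store.append({"device": event["device"], "clicks": event["clicks"] * 2})

events = [{"device": f"d{i}", "clicks": i} for i in range(5)]
q = queue.Queue()
store = []  # stand-in for Cassandra / Hadoop / MySQL

t = threading.Thread(target=worker, args=(q, store))
t.start()
front_end(q, events)
t.join()
print(store)
```

The front end only ever touches the queue, so slow processing or storage on the worker side does not hold up the forward facing path.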
here are some papers and presentations https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
On Mon, Jul 2, 2012 at 10:09 PM, Deno Vichas <[hidden email]> wrote:
/* Joe Stein http://www.linkedin.com/in/charmalloc Twitter: @allthingshadoop */ |
