
cassandra queue


cassandra queue

denov
does anybody have experience with these projects?

https://github.com/btoddb/cassandra-queue
https://github.com/btoddb/cassandra-queue-spring

thanks,
deno

Re: cassandra queue

David Leimbach
Tried using Cassandra as a queue once.  It's an attractive idea, having the data replicated and durably stored.  However, the problems we hit came from trying to use a secondary index on whether or not something had been processed.  Since that index field was basically a boolean (a cardinality of 2), we ended up with a column family with two really wide rows, which Cassandra can handle but really does not perform very well with.
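
To illustrate the wide-row point above, here is a minimal sketch in plain Python (not Cassandra or Hector code). It assumes the index can be modelled as a map from the indexed value to the list of row keys holding that value; the message keys and the 90/10 split are made up for the example.

```python
from collections import defaultdict

def build_index(messages):
    """Group row keys by the indexed value, roughly as a secondary index would."""
    index = defaultdict(list)
    for key, processed in messages:
        index[processed].append(key)
    return index

# 100,000 hypothetical messages, the first 90,000 already processed.
messages = [(f"msg-{i}", i < 90_000) for i in range(100_000)]
index = build_index(messages)

print(len(index))         # number of index rows
print(len(index[True]))   # width of the "processed" row
print(len(index[False]))  # width of the "unprocessed" row
```

However many messages there are, the index only ever has two rows, so each row keeps growing with the total message count.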

That coupled with Hector timeout issues became a real problem for us.

My thoughts now are don't use Cassandra as a queue, but there may be much smarter ways to do it.

Dave

On Wed, Jun 27, 2012 at 11:34 PM, Deno Vichas <[hidden email]> wrote:
> does anybody have experience with these projects?
>
> https://github.com/btoddb/cassandra-queue
> https://github.com/btoddb/cassandra-queue-spring
>
> thanks,
> deno


Re: hector timeouts

denov
On 6/28/2012 9:37 AM, David Leimbach wrote:
>
> That coupled with Hector timeout issues became a real problem for us.

could you share some details on this?  we're using hector and we see
random timeout warns in the logs and not sure how to address them.


thanks,
deno

Re: hector timeouts

aaron morton
Using Cassandra as a queue is generally thought of as a bad idea, owing to the high delete workload. Levelled compaction handles it better but it is still not the best approach.
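
A toy model of why the delete workload hurts, in plain Python. It assumes a delete only marks an entry with a tombstone and that a read from the head of the queue has to step over those markers until compaction drops them. `ToyQueueRow` and its methods are hypothetical names for this sketch, not Cassandra's storage engine or any client API.

```python
class ToyQueueRow:
    """Toy model of one queue row: columns kept in enqueue order."""

    def __init__(self):
        self.columns = []   # each entry is [message id, is_tombstone]
        self.next_id = 0

    def enqueue(self, n=1):
        for _ in range(n):
            self.columns.append([self.next_id, False])
            self.next_id += 1

    def dequeue(self):
        """Return (message id, tombstones scanned) and mark the message deleted."""
        scanned = 0
        for col in self.columns:
            if col[1]:
                scanned += 1    # a dead entry still has to be read past
                continue
            col[1] = True       # the delete writes a marker, it frees nothing
            return col[0], scanned
        return None, scanned

    def compact(self):
        """Drop tombstones, standing in for compaction."""
        self.columns = [c for c in self.columns if not c[1]]


row = ToyQueueRow()
row.enqueue(1000)
costs = [row.dequeue()[1] for _ in range(1000)]  # scan cost grows with every delete

row.enqueue(1)
msg, scanned = row.dequeue()     # has to step over all earlier tombstones

row.compact()
row.enqueue(1)
msg2, scanned2 = row.dequeue()   # cheap again once the tombstones are gone

print(costs[0], costs[-1], scanned, scanned2)
```

In this model the cost of each dequeue grows with the number of deletes since the last compaction, which is the pattern a queue produces all the time.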

Depending on your needs consider running http://incubator.apache.org/kafka/ 

> could you share some details on this?  we're using hector and we see random timeout warns in the logs and not sure how to address them.
First determine if they are server side or client side timeouts. Then determine what the query was. 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 29/06/2012, at 7:02 AM, Deno Vichas wrote:

> On 6/28/2012 9:37 AM, David Leimbach wrote:
>> That coupled with Hector timeout issues became a real problem for us.
>
> could you share some details on this?  we're using hector and we see random timeout warns in the logs and not sure how to address them.
>
> thanks,
> deno


Re: hector timeouts

denov
is anybody using kafka?  what other options are there?  currently i need to do around 50,000 (is that a lot?) a minute.
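
For scale, a quick conversion of that figure. This assumes the 50,000 means queue operations per minute (the post does not say what is being counted) and a steady rate with no bursts.

```python
per_minute = 50_000

per_second = per_minute / 60        # average rate if the load is evenly spread
per_day = per_minute * 60 * 24      # total volume over 24 hours

print(f"{per_second:.0f} per second, {per_day:,} per day")
```

That works out to roughly 833 per second on average, or 72 million per day.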


On 7/1/2012 11:39 AM, aaron morton wrote:
> Using Cassandra as a queue is generally thought of as a bad idea, owing to the high delete workload. Levelled compaction handles it better but it is still not the best approach.
>
> Depending on your needs consider running http://incubator.apache.org/kafka/
>
>> could you share some details on this?  we're using hector and we see random timeout warns in the logs and not sure how to address them.
> First determine if they are server side or client side timeouts. Then determine what the query was.



Re: hector timeouts

Joe Stein
lots of folks use Apache Kafka; check out https://cwiki.apache.org/confluence/display/KAFKA/Powered+By to see just a few of them

you can read about the performance for yourself http://incubator.apache.org/kafka/performance.html 

@ http://www.medialets.com we use Kafka upstream of Cassandra acting like a queue so our workers can do their business logic prior to storing their results in Cassandra, Hadoop & MySQL

this decouples our backend analytics from our forward facing system, keeping our forward facing system (ad serving to mobile devices) as fast as possible and our backend results near realtime (seconds from data coming in)
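
A rough sketch of that decoupling in plain Python. The standard-library `queue.Queue` stands in for the Kafka topic and a dict stands in for the downstream stores; `serve_request`, `worker` and the doubling "business logic" are placeholders for illustration, not the Kafka API or anything from the setup described above.

```python
import queue
import threading

events = queue.Queue()          # stand-in for the Kafka topic
store = {}                      # stand-in for Cassandra / Hadoop / MySQL
store_lock = threading.Lock()
STOP = object()                 # sentinel telling a worker to exit

def serve_request(event_id):
    """Front end: hand the event off and return immediately."""
    events.put(event_id)

def worker():
    """Back end: pull events, run business logic, then store the result."""
    while True:
        event_id = events.get()
        if event_id is STOP:
            break
        result = event_id * 2   # placeholder business logic
        with store_lock:
            store[event_id] = result

workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()

for i in range(1000):
    serve_request(i)            # the front end never waits on the back end

for _ in workers:
    events.put(STOP)            # one sentinel per worker, queued after all events
for t in workers:
    t.join()

print(len(store))
```

The front end only pays for the enqueue, so slow processing or storage delays the backend results, not the request path.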


On Mon, Jul 2, 2012 at 10:09 PM, Deno Vichas <[hidden email]> wrote:
> is anybody using kafka?  what other options are there?  currently i need to do around 50,000 (is that a lot?) a minute.



--

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop
*/