Cassandra users survey

Jonathan Ellis-3

Cassandra users survey

Hi all,

I'd love to get a better feel for who is using Cassandra and what kind
of applications it is seeing.  If you are using Cassandra, could you
share what you're using it for and what stage you are at with it
(evaluation / testing / production)? Also, what alternatives you
evaluated/are evaluating would be useful.  Finally, feel free to throw
in "I'd love to use Cassandra if only it did X" wishes. :)

I can start: Rackspace is using Cassandra for stats collection
(testing, almost production) and as a backend for the Mail & Apps
division (early testing).  We evaluated HBase, Hypertable, dynomite,
and Voldemort as well.

Thanks,

-Jonathan

(If you're in stealth mode or don't want to say anything in public,
feel free to reply to me privately and I will keep it off the record.)
Joe Bowman

Re: Cassandra users survey


Hi Jonathan,

I'd say I am at the evaluation stage. The only reason I am looking at NoSQL-type applications instead of using MySQL is the vain hope that my application will one day scale to the point where MySQL won't be the best option. Cassandra appears to be the best fit for my requirement that everything must scale horizontally.

The information I will store will be user accounts, user configuration options, and other small data sets to start. Eventually, the largest implementation will be comment functionality on URLs for a search engine interface I am building.

The only things I'd like to see are the atomic operations discussed a while back and an easier interface for Python. Honestly, the latter can be built on top of Thrift (lazyboy is one attempt), so I could just write it myself, but you did ask.

Erich Nachbar

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
Hi,

I'm using Cassandra 0.4.2 at my current client to persist URL graphs
for Spam detection.
The crawling and page classification is done in Hadoop/Bixo/Cascading,
which persists URL classification results into Cassandra.
The incoming production traffic uses Cassandra for real-time
spam score lookups to determine the spamminess of a URL.

It started out as a prototype and is currently in production with 4
Cassandra nodes (for the last >3 weeks).
Sometimes Cassandra is a little rough around the edges, but in general it works.

Wishes:
- data rebalancing
- proper MapReduce support (ideally supporting the same API HBase
uses, so one could use the same eco-system)
- node decommissioning

Simon Smith-3

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
The company I'm with is still small and in the early stages, but we're
planning on using Cassandra for user profile information (in
development right now), and possibly other uses later on.  We
evaluated CouchDB and Voldemort, and both of those were great as well
- for CouchDB, I really liked Futon but had some stability issues and
didn't like the manual replication.  Voldemort may be great, but I
couldn't figure out the API (which probably says more about me than
Voldemort).

One of the reasons we chose Cassandra is because we feel like it is
being used in other situations which required scaling.  I'm looking
forward to v0.5 because of load-balancing and for better support for
the situation where a node is lost permanently.

I'm very pleased with the high level of support for Cassandra, both on
this mailing list and on IRC.

Simon

B. Todd Burruss

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
I am evaluating "NoSQL" alternatives to your typical hard to scale
RDBMS, specifically Key/Value stores.  I'm not looking for query
capabilities.  I want very very very high availability with very very
large amounts of data.

I have narrowed my list down to Cassandra, Voldemort, Riak, and CouchDB.
Voldemort doesn't seem far enough along to properly evaluate, so it is on
the back burner.  Couch is used in a lot of places, but without the
"lounge" it doesn't scale, nor does it have any sort of HA story (and the
lounge is difficult at best to get installed and working).  I should
mention Oracle is in use today.

That leaves Riak and Cassandra.  I like Cassandra because of the Rack
and DC awareness hooks.  This is a nice feature for those wanting 5 9's
of availability.

I haven't gotten to performance testing yet.  Just trying to verify that
the products do what they are supposed to, and understand the nuances
with each one.

What I'd like to see in Cassandra:

- flexible conflict resolution mechanism.  Not just "last write wins".
Give the client the ability to "merge" conflicting values.
- A nice web interface to cluster statistics and management.  Something
an operations team could lean on to examine the entire cluster.
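
For illustration only, here is a minimal Python sketch of the difference between "last write wins" and a client-supplied merge. The names `Versioned`, `last_write_wins`, `merge_union`, and `resolve` are hypothetical and are not part of Cassandra's API; values are modeled as sets so that a union is a sensible merge.

```python
from typing import Callable, FrozenSet, NamedTuple


class Versioned(NamedTuple):
    """A value plus the timestamp of the write that produced it (hypothetical type)."""
    value: FrozenSet[str]
    timestamp: int


Resolver = Callable[[Versioned, Versioned], Versioned]


def last_write_wins(a: Versioned, b: Versioned) -> Versioned:
    # Keep whichever write carries the newer timestamp; the older one is discarded.
    return a if a.timestamp >= b.timestamp else b


def merge_union(a: Versioned, b: Versioned) -> Versioned:
    # A client-defined merge: keep the contents of both conflicting writes.
    return Versioned(a.value | b.value, max(a.timestamp, b.timestamp))


def resolve(a: Versioned, b: Versioned, resolver: Resolver = last_write_wins) -> Versioned:
    # The resolver is pluggable, which is the flexibility asked for above.
    return resolver(a, b)


cart_a = Versioned(frozenset({"book"}), timestamp=1)
cart_b = Versioned(frozenset({"lamp"}), timestamp=2)

lww = resolve(cart_a, cart_b)                  # only the newer write survives
merged = resolve(cart_a, cart_b, merge_union)  # both writes are preserved
```

With last write wins the "book" entry is silently lost; with the union resolver both items survive, at the cost of the client having to define what a merge means for its data.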

thx!





Ramzi Rabah

Re: Cassandra users survey

We are currently evaluating Cassandra, and using it for a small
feature in production. We are only using the basic insert/get/remove
from the API, with a standard column family. So far, I like a lot of
what Cassandra offers, though I had some tough times with it.

* Version 0.4.2 seems very broken. Besides CASSANDRA-507, which is not
fixed in the v4 version, it seems that when you do a significant number
of deletes and then try to restart the server, compaction fails most
of the time in our environment.
* Version 0.5 seems to be better in terms of stability from what I
observed so far. Some things that would definitely be very helpful for
us going forward:
- Easier way to replace a node that dies.
- Disk is not infinite, so a way to specify, when you insert an entry
into Cassandra, how long it should remain available before it is
deleted by Cassandra.
- Better monitoring tools. It's very hard to tell how heavily loaded a
node and the whole system is right now.
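
The expiration wish above could look roughly like this from a client's point of view. This is an in-memory Python sketch, not Cassandra code; `ExpiringStore`, `insert`, `ttl_seconds`, and the injectable `clock` are assumptions made for the example.

```python
import time


class ExpiringStore:
    """Hypothetical key/value store where each entry carries its own lifetime."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        self._data = {}

    def insert(self, key, value, ttl_seconds):
        # Record the absolute time after which the entry must no longer be served.
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: reclaim the space lazily on read.
            del self._data[key]
            return None
        return value


# Usage with a fake clock that the caller advances by hand.
now = [0.0]
store = ExpiringStore(clock=lambda: now[0])
store.insert("session:42", "alice", ttl_seconds=30)
before_expiry = store.get("session:42")
now[0] = 31.0
after_expiry = store.get("session:42")
```

A real implementation would also need to purge expired entries that are never read again, for example during compaction; the sketch only shows the per-insert lifetime.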
Ryan King

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
At Twitter we're working on using Cassandra to replace our current
storage for all tweets. We have a cluster in production that's being
populated outside the user-critical path (i.e., the Cassandra
writing is async).

Additionally, we're testing and evaluating for basically everything
else in our stack.

We evaluated a lot of things: a custom mysql impl, voldemort, hbase,
mongodb, memcachedb, hypertable, and others.

-ryan

Tim Underwood

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
My company runs a niche comparison shopping site where we take in all sorts of raw product data from various sources (retailers, manufacturers, distributors, etc...).  We then have to take all that raw data and collapse it down across the data sources (e.g. product FOO from source A matches product BAR from source B) and eventually end up with a final product that gets surfaced to our website.

Cassandra's data model works great for the raw data where columns are sparsely populated and updated.  The SuperColumnFamily model works great for my collapsed data where I need to track which bits of information came from which raw data.

I'm currently in testing (almost production).  For this use case I'll only be using Cassandra on the backend and then indexing the final data into Apache Solr to power the frontend.  My data is small enough to fit on a single node so I don't have much use for the partitioning at this point.  If anything I'd be more interested in a fully replicated setup where the ReplicationFactor is equal to the number of nodes.

I looked at most of the other nosql solutions (couchdb, mongodb, hbase, hypertable, dynomite, voldemort).

One thing I'd love to see improved:

- Reading through all the data (or a specific key prefix) in a ColumnFamily seems slow.  Cassandra is the bottleneck when I try to index data into Solr and it looks like Cassandra's CPU usage is 2-3 times that of Solr's during the process.

I look forward to playing around with 0.5!

-Tim

Edmond Lau

Re: Cassandra users survey

At Ooyala, we're in the process of testing and productionizing
Cassandra to store and serve our near real-time video analytics data.
Ooyala provides a comprehensive platform for professional video
publishers and enterprise companies looking to build up their online
video presence, and analytics/monetization is a key part of the
platform.

We researched a variety of systems to replace our current MySQL
solution, including HBase, Cassandra, Voldemort, and some others.  Of
those, we seriously considered HBase and Cassandra as satisfying our
needs b/c of HA, scaling, and the more fully featured data schema,
which is a better fit for our high dimensional data.  For both HBase
and Cassandra, we designed data schemas, built functional prototypes
of our application, conducted a fairly thorough performance
evaluation, tested the two systems for various failure scenarios, and
also evaluated how easy each system was to maintain and run.

What I'd like to see in Cassandra:
- More comments in the source code, esp. high-level descriptions of
code organization.  Design docs for various functionality would also
be helpful in getting other folks to contribute.  This was one area
where HBase was significantly better.
- Better bootstrapping and load balancing support (bootstrapping
seemed broken in 0.4.2), but I've seen a lot of work done in these two
areas for 0.5.

Edmond

Joe Stump

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
SimpleGeo is using Cassandra as the backend of our real-time location  
infrastructure. We needed something that was distributed, could scale,  
could handle lots of writes, etc.

We looked into all the usual suspects, but went with Cassandra because  
it was written in Java (we have two guys who know Java internally), it  
was small enough that we could become heavily involved early on, I  
personally knew a few of the committers, and it's multi-master.

The only thing I think would be super interesting would be increment/
decrement.
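
To illustrate why a server-side increment/decrement is attractive, here is a small Python sketch of the lost-update problem with client-side read-modify-write. `ToyStore` and its methods are hypothetical, single-process stand-ins and do not reflect any Cassandra API.

```python
class ToyStore:
    """Hypothetical in-memory key/value store used only for illustration."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key, 0)

    def put(self, key, value):
        self._data[key] = value

    def incr(self, key, delta=1):
        # The store applies the delta itself, so no client works from a stale read.
        self._data[key] = self._data.get(key, 0) + delta
        return self._data[key]


def interleaved_read_modify_write(store, key):
    """Two clients each try to add 1, but both read before either writes."""
    seen_by_a = store.get(key)
    seen_by_b = store.get(key)
    store.put(key, seen_by_a + 1)
    store.put(key, seen_by_b + 1)  # overwrites client A's update
    return store.get(key)


def two_server_side_increments(store, key):
    """The same two updates, expressed as increments applied by the store."""
    store.incr(key)
    store.incr(key)
    return store.get(key)


lost_update_total = interleaved_read_modify_write(ToyStore(), "hits")
increment_total = two_server_side_increments(ToyStore(), "hits")
```

The first total comes out one short because the second write is based on a stale read. In a replicated store a real increment also has to be reconciled across replicas, which this sketch ignores.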

--Joe


Scott White-2

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
For a project I am working on now at Onespot, we are just beginning to move off an RDBMS and onto Cassandra for a subset of our data store. We evaluated it against several other solutions, including Tokyo, Voldemort, and Riak, and Cassandra seemed the clear winner for our requirements. We have also done stress testing and been happy with the results.

Wishlist:
- rebalancing
- Hadoop integration
- node replacement that doesn't depend on having the same ip/hostname
- multi-insert (so we can insert against multiple keys in one request)

cheers,
Scott
Ian Holsman-3

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
We're looking at it to be part of a near-real-time Web analytics engine, which sounds similar to Ooyala's.
At the moment I'm pushing to get the thing open sourced if possible.

We're looking at combining Cassandra + Esper, but we are still in the very early stages.

--
Ian Holsman
[hidden email]



Vitaly Kushner

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
At Astrails we are using Cassandra in a project for one of our
clients. The performance requirements are such that we would need
database sharding from the beginning if we were to use an SQL solution.
We think Cassandra's horizontal scaling will allow us to concentrate
more on the application and less on the infrastructure.
The project is still in the early development stage.

--
Vitaly Kushner
http://twitter.com/vkushner
Founder, Astrails Ltd. http://astrails.com/
Check out our blog: http://blog.astrails.com/

Jake Luciani

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
I'm about to release a Twitter search engine built on top of Cassandra. If you are interested in beta testing it, let me know.

I would like to see Cassandra support increment/decrement.

-Jake

Dan Di Spaltro

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
At Cloudkick we are using Cassandra to store monitoring statistics and
to run analytics over the data.  I would love to share some ideas
about how we set up our data-model, if anyone is interested.  This
isn't the right thread to do it in, but I think it would be useful to
show how we store billions of points of data in Cassandra (and maybe
get some feedback).

Wishlist
-remove_slice_range
-auto loadbalancing
-inc/dec




--
Dan Di Spaltro
James Golick

Re: Cassandra users survey

I would love to see that post about your data model.

J.

Sent from my iPhone.

Michael Pearson

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
Hi, I've been waiting for something like Cassandra for a while now for
a personal project.  The data model seems ideally suited to building a
mashup engine, or any arbitrary data user app for that matter.  I'm
still at an early stage conceptually, having come from an RDBMS
background, and am mostly trying to wrap my head around the Thrift API
and building a CRUD/factory in PHP (Pandra on GitHub) and a keyspace
administrator for fun.  I wanted to jump in early with Cassandra
(started watching from 0.3) with a view to the future as a production
level solution once some administrative niceties have been ironed out
(data migration, node decommissioning, more robust query API, etc.).
What would be awesome and make me love Cassandra forever would be a
way to group columns together across keys, similar to the way
supercolumns work but by key range (depth) rather than column (width).

.michael.

Phillip Michalak

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
We're using Cassandra in development to store custom index information  
on large document sets. Also considered HBase and Voldemort.  
Cassandra's data model and performance tradeoffs seemed to best fit  
our needs.

Features that we're looking forward to seeing:
* map/reduce integration
* built-in counters with incr/decr
* more automated load balancing

Cheers,
Phil

Chris Were

Re: Cassandra users survey

In reply to this post by Jonathan Ellis-3
Hi Jonathan,

Firstly, thanks for all your help on this list. Without lots of your solutions / tips etc I probably wouldn't be using Cassandra.

I've built a real-time search engine based around all the links that appear on Twitter (http://www.mozzler.com/). Cassandra is my data store for all the comments associated with a link, for mapping short URLs to endpoint URLs, etc. Cassandra is also used for session and user data for the web front end, in conjunction with memcached to speed up the writes.

If anyone wants a simple Django session manager that uses lazyboy to talk to Cassandra, let me know.

Cheers,
Chris
