Cassandra vs MongoDB

Cassandra vs MongoDB

Mark-50
Can someone quickly explain the differences between the two? Other than
the fact that MongoDB supports ad-hoc querying, I don't know what's
different. It also appears (using Google Trends) that MongoDB seems to
be growing while Cassandra is dying off. Is this the case?

Thanks for the help

Re: Cassandra vs MongoDB

Drew Dahlke
There's a good post on stackoverflow comparing the two
http://stackoverflow.com/questions/2892729/mongodb-vs-cassandra

It seems to me that both projects have pretty vibrant communities behind them.


Re: Cassandra vs MongoDB

Jonathan Shook
Also, Google Trends only measures which terms people are
searching for. To equate this directly to growth would be misleading.


Re: Cassandra vs MongoDB

Dave Gardner
There are quite a few differences. Ultimately it depends on your use
case! For example, Mongo limits the maximum "document" size to 4MB,
whereas with Cassandra you are not really limited in the volume of
data/columns per row (I think there may be a limit of 2GB perhaps;
basically none).

Another point re: search volumes is that Mongo has been actively
promoting itself over the last few months. I recently attended an
excellent conference day in London which was very cheap; tickets
probably didn't cover the costs. I guess this is part of their
strategy, i.e. to encourage adoption.

Dave


Re: Cassandra vs MongoDB

Mark-50
Well, my initial use case would be to store our search logs and perform
some ad-hoc querying, which I know is a win for Mongo. However, I don't
think I fully understand how to build indexes in Cassandra, so maybe it's
just an issue of ignorance. I know that going forward, though, we would be
expanding it to house our per-item translations.
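
On the index question: one way to think about it is that an index is just another row that the application writes itself, keyed by the value you want to look up. Below is a minimal sketch of that idea, with plain Python dicts standing in for column families. Every name here (search_log, query_index, record_search, entries_for_term) is hypothetical, and no real Cassandra client calls are shown.

```python
# Plain dicts standing in for two column families:
# row key -> {column name: column value}.
search_log = {}    # one row per log entry
query_index = {}   # one row per search term; columns point at log entries


def record_search(entry_id, term, timestamp):
    """Write the log entry and its index column together."""
    search_log[entry_id] = {"term": term, "timestamp": timestamp}
    # The column name is (timestamp, entry_id) so that columns within an
    # index row sort by time; the column value is left empty.
    query_index.setdefault(term, {})[(timestamp, entry_id)] = ""


def entries_for_term(term):
    """Read one index row instead of scanning every log row."""
    columns = query_index.get(term, {})
    return [entry_id for (_, entry_id) in sorted(columns)]


record_search("e1", "cassandra", 1280243640)
record_search("e2", "mongodb", 1280243700)
record_search("e3", "cassandra", 1280243760)

matches = entries_for_term("cassandra")
```

The trade-off is that each query you want to run needs its own index row maintained at write time, which is why ad-hoc querying is harder here than in a store that indexes for you.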

Re: Cassandra vs MongoDB

Benjamin Black
In reply to this post by Mark-50
They have approximately nothing in common.  And, no, Cassandra is
definitely not dying off.


Re: Cassandra vs MongoDB

Joe Stein
In reply to this post by Mark-50
If you are looking to store web logs and then do ad hoc queries, you might/should be using Hadoop (depending on how big your logs are).

While MongoDB has MapReduce built in, it is there to simulate SQL GROUP BY and not for large-scale analytics by any means.
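
To make the GROUP BY comparison concrete, here is a small sketch of a map step and a reduce step counting searches per query term. It is plain Python rather than MongoDB's actual API, and the records and field names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical search-log records; the field names are assumptions.
search_logs = [
    {"query": "cassandra", "results": 12},
    {"query": "mongodb", "results": 7},
    {"query": "cassandra", "results": 3},
]


def map_phase(record):
    """Emit one (key, value) pair per log record."""
    yield record["query"], 1


def reduce_phase(key, values):
    """Collapse all values emitted for one key into a single count."""
    return sum(values)


def map_reduce(records):
    """Group emitted values by key, then reduce each group."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_phase(record):
            grouped[key].append(value)
    return {key: reduce_phase(key, values) for key, values in grouped.items()}


# Similar in spirit to:
#   SELECT query, COUNT(*) FROM search_logs GROUP BY query
counts = map_reduce(search_logs)
```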

MongoDB uses a global read/write lock per operation. General and index-assisted reads are ultra-fast in Mongo, but a bigger map/reduce or group call will block other requests until it completes, possibly causing traffic to back up. Because of that global lock, all writes block, too.
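
The blocking effect can be illustrated with a toy model: one lock shared by every operation, so a write that arrives during a long-running call cannot proceed. This is only a sketch of the general idea, not MongoDB's real implementation; the class and method names are hypothetical, and the write here gives up immediately where a real server would queue the request.

```python
import threading


class GlobalLockStore:
    """Toy store guarded by a single lock shared by all operations."""

    def __init__(self):
        self._lock = threading.Lock()
        self.docs = []

    def try_write(self, doc):
        """Attempt a write without waiting; return False if the lock is busy."""
        if not self._lock.acquire(blocking=False):
            return False
        try:
            self.docs.append(doc)
            return True
        finally:
            self._lock.release()

    def long_group(self, during):
        """Hold the lock for a whole 'group' call and run `during` meanwhile."""
        with self._lock:
            return during()


store = GlobalLockStore()
before = store.try_write({"q": "cassandra"})  # lock is free
blocked = store.long_group(lambda: store.try_write({"q": "mongodb"}))  # lock is held
after = store.try_write({"q": "hadoop"})  # lock is free again
```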

Cassandra is much more durable, but from an architecture perspective key store vs. document store could be weighed (on smaller-traffic systems that do not need higher-level big-data scale & durability).

If you have lots of data, then MongoDB will eventually become a consistent problem.

Here is a nice article on MongoDB in a larger-scale implementation, http://www.mikealrogers.com/2010/07/mongodb-performance-durability/, with some conclusions; it also talks about Cassandra, Redis & CouchDB.

MongoDB has made a lot of improvements over time, but Cassandra is VERY active also and continues to deliver great features, and it is driven not by a corporation but by the community.

MongoDB was started and is backed by a company that makes money using the open source model. Cassandra, by contrast, started out solving a difficult problem at Facebook, was then released completely as open source, and only THEN did a company (Riptano) pop up to support it, making its money using the open source model... I say this to point out that the drivers behind the two servers and their open source projects/communities are different.

You might see Google Trends for MongoDB going up because folks jump in because of the marketing, then have issues and try to find solutions =8^)

Now, I am not bashing MongoDB by any means; it is a good database (so is MySQL), but it is all about use cases AND the implementation/use/load. Apply the right solution to the problem it fits in all respects!

For logs (speaking with my architect hat on) I see no reason why you would want to hold them in a document structure, but at the same time you might not have that many logs, so you could get a lot of benefit from MongoDB M/R and such... But honestly, if it is less than 1TB you might be fine JUST using MySQL.

It is all relative. 

Lastly, and back to Hadoop: Cassandra has a nice implementation so that once you load your data into Cassandra you can pull it out and MapReduce it: http://allthingshadoop.com/2010/04/24/running-hadoop-mapreduce-with-cassandra-nosql/

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop
*/



Re: Cassandra vs MongoDB

aaron morton

> If you are looking to store web logs and then do ad hoc queries you might/should be using Hadoop (depending on how big your logs are)

I agree. Take a look at Cloudera's Hadoop distribution, CDH3; it includes an app called Flume for moving data...

"As a result, we designed and built Flume. Flume is a distributed service that makes it very easy to collect and aggregate your data into a persistent store such as HDFS. Flume can read data from almost any source – log files, Syslog packets, the standard output of any Unix process – and can deliver it to a batch processing system like Hadoop or a real-time data store like HBase. All this can be configured dynamically from a single, central location – no more tedious configuration file editing and process restarting. Flume will collect the data from wherever existing applications are storing it, and whisk it away for further analysis and processing."

(I wonder if this could deliver into Cassandra :) )

If it's straight log file processing Hadoop may be a better fit.

Aaron

Re: Cassandra vs MongoDB

Jeremy Hanna
> "As a result, we designed and built Flume...
> (I wonder if this could deliver into Cassandra :) )


Yes - apparently it's pretty easy to do - I was thinking of doing it but haven't found the time yet.

https://issues.cloudera.org//browse/FLUME-20



Re: Cassandra vs MongoDB

Jeff Hammerbacher
Having participated in the design of a few of the systems being mentioned, I'll chime in here and point out that the combination of Flume and Hive makes CDH3 very useful for log processing, and that this use case is squarely in the wheelhouse of the system, especially for large collections of log files (as search logs tend to be).
