calculation of disk size

calculation of disk size

Rahul Bhardwaj
Hi All,


We are planning to set up a cluster of 5 nodes with RF 3 for a write-heavy project. Our current database size is around 500 GB, and it is growing at a rate of 15 GB every day. We have learnt that Cassandra consumes extra disk space for compaction processes, so how can we calculate the amount of disk space we would require?

Kindly suggest.



Regards:
Rahul Bhardwaj



Re: calculation of disk size

arun sirimalla
Hi Rahul,

If you are expecting 15 GB of data per day, here is the calculation.

1 day = 15 GB, so 1 month ≈ 450 GB and 1 year ≈ 5.4 TB. Your raw data size for one year is therefore about 5.4 TB; with a replication factor of 3, that comes to around 16.2 TB of data for one year.

Taking compaction into consideration, and given that your use case is write-heavy, if you go with size-tiered compaction you should budget roughly twice the space of your replicated data.

So you would need around 32-34 TB of disk space.
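The arithmetic above can be reproduced with a short back-of-envelope script. This is only a sketch of the thread's own estimate: the 360-day year (12 × 30-day months) and the 2x size-tiered compaction headroom are the simplifying assumptions used in this reply, not exact figures.

```python
# Back-of-envelope Cassandra disk sizing, following the thread's estimate.

DAILY_GROWTH_GB = 15         # observed ingest rate from the question
REPLICATION_FACTOR = 3       # RF 3, as planned
COMPACTION_HEADROOM = 2.0    # size-tiered compaction: budget ~2x live data
NODES = 5                    # planned cluster size
DAYS_PER_YEAR = 360          # 12 x 30-day months, matching the 5.4 TB figure

raw_per_year_tb = DAILY_GROWTH_GB * DAYS_PER_YEAR / 1000        # ~5.4 TB
replicated_tb = raw_per_year_tb * REPLICATION_FACTOR            # ~16.2 TB
with_compaction_tb = replicated_tb * COMPACTION_HEADROOM        # ~32.4 TB
per_node_tb = with_compaction_tb / NODES                        # ~6.5 TB/node

print(f"Raw data per year:        {raw_per_year_tb:.1f} TB")
print(f"With RF={REPLICATION_FACTOR}:                {replicated_tb:.1f} TB")
print(f"With compaction headroom: {with_compaction_tb:.1f} TB")
print(f"Per node ({NODES} nodes):       {per_node_tb:.1f} TB")
```

Dividing the ~32 TB total across 5 nodes implies roughly 6.5 TB of disk per node for one year of growth, before accounting for the existing 500 GB or any free-space safety margin.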


Thanks




--
Arun 
Senior Hadoop/Cassandra Engineer
Cloudwick

Champion of Big Data (Cloudera)

2014 Data Impact Award Winner (Cloudera)


Re: calculation of disk size

Rahul Neelakantan
Here is a calculator someone has put together.




Re: calculation of disk size

Rahul Bhardwaj
Thanks, Rahul and Arun.




