Reclaim deleted rows space


Reclaim deleted rows space

shimi
Let's assume I have:
* single 100GB SSTable file
* min compaction threshold is set to 2

If I delete rows that are located in this file, is the only way to "clean" the deleted rows to insert another 100GB of data or to trigger a painful major compaction?

Shimi

Re: Reclaim deleted rows space

Adrian Cockcroft
How many nodes do you have? You should be able to run a rolling compaction around the ring, one node at a time to minimize impact. If one node is too big an impact, maybe you should have a bigger cluster? If you are on EC2, try running more but smaller instances.

Adrian


Re: Reclaim deleted rows space

Peter Schuller
> Lets assume I have:
> * single 100GB SSTable file
> * min compaction threshold is set to 2
> If I delete rows which are located in this file. Is the only way to "clean"
> the deleted rows is by inserting another 100GB of data or by triggering a
> painful major compaction?

Major compaction does it, but only if GCGraceSeconds has elapsed. See:

   http://spyced.blogspot.com/2010/02/distributed-deletes-in-cassandra.html

--
/ Peter Schuller

Re: Reclaim deleted rows space

Peter Schuller
> Major compaction does it, but only if GCGraceSeconds has elapsed. See:
>
>   http://spyced.blogspot.com/2010/02/distributed-deletes-in-cassandra.html

But to be clear, under the assumption that the tombstones are a lot
smaller than the data they delete, a major compaction will definitely
reclaim space even if GCGraceSeconds has not elapsed. So actually my
original response is a bit misleading.

--
/ Peter Schuller

Re: Reclaim deleted rows space

shimi
This is what I thought. I was hoping there might be another way to reclaim the space.
The problem is that the more data you have, the more time it takes Cassandra to respond.
Reclaiming the space of deleted rows in the biggest SSTable requires a major compaction. This compaction can be triggered by adding 2x the data (or 4x in the default configuration) to the system, or by executing it manually via JMX.
For a system that deletes data regularly, needs to serve customers all day, and must respond within milliseconds, this is a problem.

It appears to me that in order to use Cassandra you must have a process that will trigger a major compaction on the nodes every X amount of time.
One case where this isn't a problem is when you don't (or hardly ever) delete data. Another is when your upper limit on response time is high enough that a major compaction will not hurt you.

It might be that the only way to solve this problem is by having at least two copies of each row in each data center and using the dynamic snitch.

Shimi


Re: Reclaim deleted rows space

Peter Schuller
> This is what I thought. I was wishing there might be another way to reclaim
> the space.

Be sure you really need this first :) Normally you just let it happen in the bg.

> The problem is that the more data you have the more time it will take to
> Cassandra to response.

Relative to what, though? There are definitely important side-effects
of having very large data sets, and part of that involves compactions,
but in a normal steady-state system you should never be in the
position of "waiting" for a major compaction to run. Compactions are
intended to run every now and then in the background. They will
result in variations in disk space within certain bounds, which is
expected.

Certainly the situation can be improved and the current disk space
utilization situation is not perfect, but the above suggests to me
that you're trying to do something that is not really intended to be
done.

> Reclaim space of deleted rows in the biggest SSTable requires Major
> compaction. This compaction can be triggered by adding x2 data (or x4 data
> in the default configuration) to the system or by executing it manually
> using JMX.

You can indeed choose to trigger major compactions by e.g. cron jobs.
But just be aware that if you're operating under conditions where you
are close to disk space running out, you have other concerns too -
such as periodic repair operations also needing disk space.
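For illustration, a cron-driven job could do something like the following
minimal sketch over JMX. The MBean object name, the operation name and the
port are assumptions about the 0.6 interface rather than a verified recipe,
and "MyKeyspace" is a placeholder; check them against your build first.

// Minimal sketch of a cron-driven job that asks one node to run a major
// compaction over JMX. The object name, operation name and port below are
// assumptions about the 0.6 interface ("MyKeyspace" is a placeholder).
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ForceMajorCompaction {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":8080/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            ObjectName ss = new ObjectName(
                    "org.apache.cassandra.service:type=StorageService");
            // Assumed operation: compact all column families in the keyspace.
            conn.invoke(ss, "forceTableCompaction",
                        new Object[] { "MyKeyspace" },
                        new String[] { "java.lang.String" });
        } finally {
            connector.close();
        }
    }
}

Run it against one node at a time (as Adrian suggested earlier in the
thread) so only one replica in the ring is busy compacting at any moment.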

Also, suppose you're overwriting lots of data (or replacing by
deleting and adding other data). It is not necessarily true that you
need 4x the space relative to what you otherwise do just because of
the compaction threshold.

Keep in mind that compactions already need extra space anyway. If
you're *not* overwriting or adding data, a compaction of a single CF
is expected to need up to twice the amount of space that it occupies.
If you're doing more overwrites and deletions though, as you point out
you will have more "dead" data at any given point in time. But on the
other hand, the peak disk space usage during compactions is lower. So
the actual peak disk space usage (which is what matters since you must
have this much disk space) is actually helped by the
deletions/overwrites too.

Further, suppose you trigger major compactions more often. That means
each compaction will have a higher relative spike of disk usage
because less data has had time to be overwritten or removed.

So in a sense, the disk space demand is being moved between the
category of "dead data retained for longer than necessary" and
"peak disk usage during compaction".

Also keep in mind that the *low* peak of disk space usage is not
subject to any fragmentation concerns. Depending on the size of your
data compared to e.g. column names, that disk space usage might be
significantly lower than what you would get with an in-place updating
database. There are lots of trade-offs :)

You say you have to "wait" for deletions, though, which sounds like
you're doing something unusual. Are you doing stuff like deleting lots
of data in bulk from one CF, only to then write data to *another* CF?
Such that you're actually having to wait for disk space to be freed to
make room for data somewhere else?

> In case of a system that deletes data regularly, which needs to serve
> customers all day and the time it takes should be in ms, this is a problem.

Not in general. I am afraid there may be some misunderstanding here.
Unless disk space is a problem for you (i.e., you're running out of
space), there is no need to wait for compactions. And certainly
whether you can serve traffic 24/7 at low-ms latencies is an important
consideration, and does become complex when disk I/O is involved, but
it is not about disk *space*. If you have important performance
requirements, make sure you can service the read load at all given
your data set size. If you're running out of disk, I presume your
data is big. See
http://wiki.apache.org/cassandra/LargeDataSetConsiderations

Perhaps you can describe your situation in more detail?

> It appears to me that in order to use Cassandra you must have a process that
> will trigger major compaction on the nodes once in X amount of time.

For some cases this will be beneficial, but not always. It's been
further improved for 0.7 too w.r.t. tombstone handling in non-major
compactions (I don't have the JIRA ticket number handy). It's
certainly not a hard requirement and would only ever be relevant if
you're operating nodes that are significantly full.

> One case where you would do that is when you don't (or hardly) delete data.

Or just in most cases where you don't push disk space concerns.

> Another one is when your upper limit of time it should take to response is
> very high so major compaction will not hurt you.

To be really clear: Compaction is a background operation. It is never
the case that reads or writes somehow "wait" for compaction to
complete.

--
/ Peter Schuller

Re: Reclaim deleted rows space

shimi
I think I didn't make myself clear.
I don't have a problem with disk space. I have a problem with the data size. 
I have a simple CRUD application. Most of the requests are reads, but there are updates/deletes, and as time passes the number of deleted rows becomes big enough to free some disk space (a matter of days, not hours).
Since not all of the data fits in RAM (and I have a lot of RAM), the rest is served from disk. Since disk is slow, I want to reduce as much as possible the number of requests that go to disk. The more requests hit the disk, the longer the disk wait time gets and the longer it takes to return a response.

The bottom line is that I want to reduce the number of requests that go to disk. Since there is enough data that is no longer valid, I can do that by reclaiming the space. The only way to do it is by running a major compaction. I can wait and let Cassandra do it for me, but then the data size will get even bigger and the response time will be worse. I can do it manually, but I would prefer it to happen in the background with less impact on the system.

Shimi



Re: Reclaim deleted rows space

Robert Coli
On Tue, Jan 4, 2011 at 4:33 AM, Peter Schuller
<[hidden email]> wrote:
> For some cases this will be beneficial, but not always. It's been
> further improved for 0.7 too w.r.t. tomb stone handling in non-major
> compactions (I don't have the JIRA ticket number handy).

https://issues.apache.org/jira/browse/CASSANDRA-1074

(For those playing along at home..)

=Rob

Re: Reclaim deleted rows space

shimi
Yes, I am aware of that.
This is the reason I upgraded to 0.6.8.
Still, the deleted rows in the biggest SSTable will only be removed by a major compaction.

Shimi


Re: Reclaim deleted rows space

Peter Schuller
> I don't have a problem with disk space. I have a problem with the data
> size.

[snip]

> Bottom line is that I want to reduce the number of requests that goes to
> disk. Since there is enough data that is no longer valid I can do it by
> reclaiming the space. The only way to do it is by running Major compaction.
> I can wait and let Cassandra do it for me but then the data size will get
> even bigger and the response time will be worst. I can do it manually but I
> prefer it to happen in the background with less impact on the system

Ok - that makes perfect sense then. Sorry for misunderstanding :)

So essentially, for workloads that are teetering on the edge of cache
warmness and are subject to significant overwrites or removals, it may
be beneficial to perform much more aggressive background compaction,
even though it might waste lots of CPU, to keep the in-memory working
set down.

There was talk (I think in the compaction redesign ticket) about
potentially improving the use of bloom filters such that obsolete data
in sstables could be eliminated from the read set without
necessitating actual compaction; that might help address cases like
these too.
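
As a purely conceptual sketch (not the actual read path and not code from
any ticket), the point is that a read consults each SSTable through its
bloom filter and can only skip files that definitely don't contain the key;
a deleted-but-uncompacted row still answers "maybe", so it keeps costing
disk reads until the dead data is compacted away:

import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Conceptual sketch, not Cassandra source: a read asks every SSTable's
// bloom filter whether it might hold the key, and only skips files that
// answer "definitely not". A deleted-but-uncompacted row still answers
// "maybe", so the read still pays for that file until compaction drops it.
class ReadPathSketch {
    interface SSTable {
        boolean mightContain(String key);   // bloom filter: no false negatives
        Optional<String> read(String key);  // actual disk read (value or tombstone)
    }

    static List<String> collectVersions(List<SSTable> sstables, String key) {
        List<String> versions = new ArrayList<String>();
        for (SSTable t : sstables) {
            if (t.mightContain(key)) {
                t.read(key).ifPresent(versions::add);
            }
        }
        return versions; // caller merges these; the newest timestamp wins
    }
}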

I don't think there's a pre-existing silver bullet in a current
release; you probably have to live with the need for
greater-than-theoretically-optimal memory requirements to keep the
working set in memory.

--
/ Peter Schuller

Re: Reclaim deleted rows space

shimi
How is minor compaction triggered? Is it triggered only when a new SSTable is added?
 
I was wondering if triggering a compaction with minimumCompactionThreshold set to 1 would be useful. If this can happen, I assume it will compact files of similar size and remove deleted rows from the rest.
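
For anyone who wants to experiment with this: the thresholds are per-CF
attributes that can be read and changed at runtime over JMX. A minimal
sketch, assuming a connection obtained as in the earlier JMX example; the
object-name pattern and attribute name are assumptions about the 0.6
ColumnFamilyStore MBean, and "MyKeyspace"/"MyCF" are placeholders:

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class CompactionThresholds {
    // Read and change the per-CF minimum compaction threshold at runtime.
    // The object-name pattern and attribute name are assumptions about the
    // 0.6 ColumnFamilyStore MBean; "MyKeyspace"/"MyCF" are placeholders.
    static void setMinThreshold(MBeanServerConnection conn, int min) throws Exception {
        ObjectName cf = new ObjectName(
            "org.apache.cassandra.db:type=ColumnFamilyStores,keyspace=MyKeyspace,columnfamily=MyCF");
        System.out.println("current = "
                + conn.getAttribute(cf, "MinimumCompactionThreshold"));
        conn.setAttribute(cf, new Attribute("MinimumCompactionThreshold", min));
    }
}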

Shimi


Re: Reclaim deleted rows space

Jonathan Ellis
Pretty sure there's logic in there that says "don't bother compacting
a single sstable."


--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Reclaim deleted rows space

Edward Capriolo
I was wondering if it made sense to have a JMX operation that can
compact a list of SSTables by file name. This opens it up for power
users to have more options than compacting the entire keyspace.

Re: Reclaim deleted rows space

Tyler Hobbs
Although it's not exactly the ability to list specific SSTables, the ability to only compact specific CFs will be in upcoming releases:

https://issues.apache.org/jira/browse/CASSANDRA-1812

- Tyler


Re: Reclaim deleted rows space

shimi
Am I missing something here? It is already possible to trigger major compaction on a specific CF.


Re: Reclaim deleted rows space

shimi

On Wed, Jan 5, 2011 at 11:31 PM, Jonathan Ellis <[hidden email]> wrote:
Pretty sure there's logic in there that says "don't bother compacting
a single sstable."
No. You can do it.
Based on the log I have a feeling that it triggers an infinite compaction loop.

 

Re: Reclaim deleted rows space

shimi
According to the code it makes sense.
submitMinorIfNeeded() calls doCompaction(), which calls submitMinorIfNeeded().
With minimumCompactionThreshold = 1, submitMinorIfNeeded() will always run compaction.

Shimi



Re: Reclaim deleted rows space

Jonathan Shook
I believe the following condition within submitMinorIfNeeded(...)
determines whether to continue, so it's not a hard loop.

// if (sstables.size() >= minThreshold) ...
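
In other words, the shape of that logic is roughly the following (a
simplified sketch, not the real Cassandra source): SSTables are grouped
into buckets of similar size, and a bucket is only submitted once it holds
at least minThreshold files. With the default threshold a lone big SSTable
is never picked up; with minThreshold set to 1 the condition is always
true, which would match the looping behaviour described above.

import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the minor-compaction trigger, not the real code:
// group SSTables into buckets of similar size, then only compact a bucket
// that holds at least minThreshold files.
class MinorCompactionSketch {
    static List<List<Long>> bucketBySize(List<Long> sstableSizes) {
        List<List<Long>> buckets = new ArrayList<List<Long>>();
        for (long size : sstableSizes) {
            List<Long> target = null;
            for (List<Long> bucket : buckets) {
                long avg = 0;
                for (long s : bucket) avg += s;
                avg /= bucket.size();
                // "similar size" taken here as within a factor of two of the
                // bucket average; the real heuristic differs in its details.
                if (size > avg / 2 && size < avg * 2) {
                    target = bucket;
                    break;
                }
            }
            if (target == null) {
                target = new ArrayList<Long>();
                buckets.add(target);
            }
            target.add(size);
        }
        return buckets;
    }

    static void submitMinorIfNeeded(List<Long> sstableSizes, int minThreshold) {
        for (List<Long> bucket : bucketBySize(sstableSizes)) {
            if (bucket.size() >= minThreshold) {
                // the real code would compact this bucket; with one SSTable
                // and minThreshold >= 2 the branch is never taken, whereas
                // with minThreshold == 1 it fires on every pass.
                System.out.println("would compact " + bucket.size() + " sstables");
            }
        }
    }
}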




Re: Reclaim deleted rows space

shimi
I modified the code to limit the size of the SSTables.
I will be glad if someone can take a look at it:
https://github.com/Shimi/cassandra/tree/cassandra-0.6


Shimi



Re: Reclaim deleted rows space

Jonathan Ellis
I'd suggest describing your approach on
https://issues.apache.org/jira/browse/CASSANDRA-1608, and if it's
attractive, porting it to 0.8.  It's too late for us to make deep
changes in 0.6 and probably even 0.7 for the sake of stability.


--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com