error deleting messages


error deleting messages

joss Earl
I run into trouble after a while if I delete rows. This happens in both 2.1.3 and 2.0.13, and I hit the same problem with either the DataStax Java driver or the stock Python driver.
The problem is reproducible using the attached python program.

Once the problem is encountered, the table becomes unusable:

cqlsh:test1> select id from msg limit 1;
Request did not complete within rpc_timeout.

So, my questions are:
am I doing something wrong?
is this expected behaviour?
is there some way to fix the table and make it usable again once this has happened?
if this is a bug, what is the best way of reporting it?

Many thanks
Joss

Attachment: cass2.py (1K)
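
In case the attachment doesn't come through, the repro is roughly along these lines (a simplified sketch, not the actual cass2.py; names and counts are illustrative):

#!/usr/bin/env python
# Simplified sketch of the repro setup (illustrative, not the attached cass2.py).
import uuid
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect()
session.execute("CREATE KEYSPACE IF NOT EXISTS test1 WITH replication = "
                "{'class': 'SimpleStrategy', 'replication_factor': 1}")
session.set_keyspace('test1')
session.execute("CREATE TABLE IF NOT EXISTS msg (id uuid PRIMARY KEY, body text)")

# Phase 1: insert 100,000 messages.
insert = session.prepare("INSERT INTO msg (id, body) VALUES (?, ?)")
for i in range(100000):
    session.execute(insert, (uuid.uuid4(), 'message %d' % i))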

Re: error deleting messages

Ali Akhtar
Can you put your code on gist.github.com or pastebin?



Re: error deleting messages

joss Earl

On 24 March 2015 at 12:04, Ali Akhtar <[hidden email]> wrote:
Can you put your code on gist.github.com or pastebin?



Re: error deleting messages

Ali Akhtar
What happens when you run it? How far does it get before stopping?




Re: error deleting messages

joss Earl
On a stock install, it gets to about 50,100 before grinding to a halt.






Re: error deleting messages

Ali Akhtar
50,100 inserts or deletes? Also, how much RAM / CPU does the server running this have, and what is the RAM / CPU usage around the time it fails?




Re: error deleting messages

joss Earl
It inserts 100,000 messages; I then start deleting them by grabbing chunks of 100 at a time and deleting each message individually.

So, the 100,000 messages are inserted without any trouble; I run into problems once I have deleted about half of them. I've run this on machines with 4, 8, and 16 GB of RAM and the behaviour was consistent (it fails after 50,000 or so messages on that table, or after maybe 30,000 messages on a table with more columns).
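
The delete phase looks roughly like this (again a sketch rather than the real cass2.py):

# Phase 2: fetch ids in chunks of 100 and delete each row individually.
# Each DELETE leaves a tombstone behind, so the unbounded SELECT has to skip
# over more and more deleted rows as the loop progresses.
delete = session.prepare("DELETE FROM msg WHERE id = ?")
deleted = 0
while True:
    rows = list(session.execute("SELECT id FROM msg LIMIT 100"))
    if not rows:
        break
    for row in rows:
        session.execute(delete, (row.id,))
        deleted += 1
    print(deleted)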






Re: error deleting messages

Duncan Sands
In reply to this post by joss Earl
Hi Joss,


My guess is that, because of the earlier delete attempts, your table is full of tombstones, and that is why the read times out: it hits the tombstone limit. Check your Cassandra node logs to see whether they mention tombstones.
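
If it is tombstones, and you can afford to throw the test data away or shorten the tombstone grace period, something along these lines should make the table usable again (a suggestion, not something I've verified against your setup):

-- drop all rows and their tombstones in one go
TRUNCATE test1.msg;
-- or let compaction purge tombstones sooner, then run `nodetool compact test1 msg`
ALTER TABLE test1.msg WITH gc_grace_seconds = 3600;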

Ciao, Duncan.



Re: error deleting messages

Anuj
In reply to this post by joss Earl
Hi Joss

We faced a similar issue recently. The problem seems to be related to the huge number of tombstones generated by the deletions. I would suggest increasing the tombstone warning and failure thresholds in cassandra.yaml.

Once you do that and run your program, make sure you monitor Cassandra heap usage with the nodetool info command. If the heap is nearly full, Cassandra stalls are to be expected, so you would need to increase the heap.

Because of the extra tombstones, your query cannot complete within the default time, so I would also suggest increasing the read timeout in cassandra.yaml so that the query can complete.
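
For reference, the relevant cassandra.yaml settings look roughly like this (the values shown are the 2.x defaults as far as I recall; please check your own version):

tombstone_warn_threshold: 1000        # warn after scanning this many tombstones in a single query
tombstone_failure_threshold: 100000   # abort the query once this many are scanned
read_request_timeout_in_ms: 5000      # raise this if reads need more time to complete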

Please also look at your logs to make sure there are no exceptions.

Thanks
Anuj Wadehra






Re: error deleting messages

joss Earl
Hi Anuj

Yes, thanks. Looking at my log file I see:

ERROR [SharedPool-Worker-2] 2015-03-24 13:52:06,751 SliceQueryFilter.java:218 - Scanned over 100000 tombstones in test1.msg; query aborted (see tombstone_failure_threshold)
WARN  [SharedPool-Worker-2] 2015-03-24 13:52:06,759 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-2,5,main]: {}
java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException

I'm reading up on how to deal with this now, thanks.




Re: error deleting messages

James Schappet
This DataStax webinar, "Avoiding anti-patterns: How to stay in love with Cassandra", talks about deletes as an anti-pattern; it may be worth watching. DataStax provides links to the video recording and presentation slides.
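
One commonly suggested way to avoid the problem (sketched here from general advice rather than from the webinar itself) is to bucket messages by time, so readers only touch recent partitions and never have to scan over old, deleted data. A hypothetical layout:

CREATE TABLE msg_by_day (
    day  text,      -- e.g. '2015-03-24'
    id   timeuuid,
    body text,
    PRIMARY KEY (day, id)
);
-- readers query only the buckets they care about
SELECT id, body FROM msg_by_day WHERE day = '2015-03-24' LIMIT 100;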
