OperationTimedOut in select count statement in cqlsh


Mich Talebzadeh

Hi,

I have a table of 300,000 rows.

When I try to do a simple

cqlsh:ase> select count(1) from t;
OperationTimedOut: errors={}, last_host=127.0.0.1

Appreciate any feedback

Thanks,

Mich

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.

 


Re: OperationTimedOut in select count statement in cqlsh

Tommy Stendahl
Hi,

Check out CASSANDRA-8899; my guess is that you have to increase the timeout in cqlsh.

/Tommy

On 2015-04-22 11:15, Mich Talebzadeh wrote:

Hi,

I have a table of 300,000 rows.

When I try to do a simple

cqlsh:ase> select count(1) from t;
OperationTimedOut: errors={}, last_host=127.0.0.1

Appreciate any feedback

Thanks,

Mich


Re: OperationTimedOut in select count statement in cqlsh

Robert Wille-2
Keep in mind that "select count(l)" and "select l" amount to essentially the same thing.


RE: OperationTimedOut in select count statement in cqlsh

Mich Talebzadeh

Thanks Robert,

In an RDBMS, select count(1) basically returns the number of rows.

1> select count(1) from t
2> go

-----------
      300000

(1 row affected)

Is count(1) fundamentally different in Cassandra?

Does count(1) mean returning (in my case) 1 three hundred thousand times?

Cheers,

Mich Talebzadeh

http://talebzadehmich.wordpress.com

Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7.
co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4
Publications due shortly:
Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache
Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly


Re: OperationTimedOut in select count statement in cqlsh

Robert Wille-2
I should have been clearer. What I meant was that it's about the same amount of work for the cluster to do a "select count(l)" as it is to do a "select l" (unlike in the RDBMS world, where count(l) can use the primary key index). The reason is that the coordinator has to retrieve all the rows from all the nodes and count them. The only thing you're saving is that the rows don't have to be sent to the client.

I heard from another Cassandra user that they found "select l" to be faster than "select count(l)". I don't know why that would be, but I've seen stranger things.

Robert
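
Since the cluster does roughly the same work either way, one option is to select only the key column and count on the client, letting the driver pull the rows back page by page instead of waiting on one long count request. The following is a minimal sketch, not a tested recipe: it assumes a session object that behaves like the DataStax Python driver's `Session` (whose `execute` accepts a `timeout` argument and returns an iterable result), it assumes the key column is called `object_id`, and `FakeSession` is a hypothetical stand-in so the snippet runs without a cluster.

```python
def count_rows(session, table, key_column, timeout_s=60.0):
    """Count rows on the client by iterating a key-only SELECT."""
    # Table and column names cannot be bound as query parameters, so they
    # are interpolated here; only pass trusted identifiers.
    query = "SELECT {} FROM {}".format(key_column, table)
    # timeout_s is handed to the driver as the request timeout. With the
    # real driver, iterating the result fetches further pages on demand.
    rows = session.execute(query, timeout=timeout_s)
    return sum(1 for _ in rows)


class FakeSession:
    """Hypothetical stand-in for a driver session, for illustration only."""

    def __init__(self, n_rows):
        self.n_rows = n_rows
        self.queries = []  # records (query, timeout) pairs for inspection

    def execute(self, query, parameters=None, timeout=None):
        self.queries.append((query, timeout))
        # Yield one single-column row per stored key.
        return ((i,) for i in range(self.n_rows))


if __name__ == "__main__":
    session = FakeSession(300000)
    print(count_rows(session, "t", "object_id"))
```

This does not make the count any cheaper for the cluster; it only trades one long request for many short ones, so it is best kept for occasional checks.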


RE: OperationTimedOut in select count statement in cqlsh

Mich Talebzadeh

Thanks Robert for the explanation.

Please correct me if I am wrong.

I am currently running a single-node Cassandra cluster. There is a primary key on the object_id column in both the RDBMS and Cassandra.

As you correctly pointed out, the RDBMS does not need to touch the base table. It can just scan the primary key B-tree index to count the rows:

       |ROOT:EMIT Operator (VA = 2)
       |
       |   |SCALAR AGGREGATE Operator (VA = 1)
       |   |  Evaluate Ungrouped COUNT AGGREGATE.
       |   |
       |   |   |SCAN Operator (VA = 0)
       |   |   |  FROM TABLE
       |   |   |  t
       |   |   |  Using Clustered Index.
       |   |   |  Index : t_ui
       |   |   |  Forward Scan.
       |   |   |  Positioning at index start.
       |   |   |  Index contains all needed columns. Base table will not be read.
       |   |   |  Using I/O Size 64 Kbytes for index leaf pages.
       |   |   |  With LRU Buffer Replacement Strategy for index leaf pages.

Total estimated I/O cost for statement 1 (at line 1): 144996.

-----------
      300000

Whereas in Cassandra, it has to retrieve every row and count them, just without sending the rows back to the client?

What are the alternatives to make it faster, if any?

Cheers,

Mich Talebzadeh


Re: OperationTimedOut in select count statement in cqlsh

Robert Wille-2
In reply to this post by Mich Talebzadeh
And I should have read the post more carefully. I thought it was count(l), not count(1). But either way, you're counting the number of records in the table, which in the RDBMS world means scanning an index, and in Cassandra means the coordinator has to select all the records from all the nodes.

In general, counting records in Cassandra is bad. People are accustomed to counting being a cheap operation, but in any distributed database with replication, it is going to be expensive. If your data model requires that you count a large number of records, then I recommend you revise your data model and maintain a counter. I know that can be a pain, but there really is no way to count records fast.


Re: OperationTimedOut in select count statement in cqlsh

Robert Wille-2
In reply to this post by Mich Talebzadeh
Use a counter table to maintain the count so you don't have to compute it. When you do something that affects the count, it's generally easy to issue an asynchronous query to update the counter in parallel with the actual work. It definitely complicates the code, especially if you have a lot of places where you do things that affect the count, but it generally doesn't cost much, if anything, in terms of performance.

Due to Cassandra's eventually consistent model and lack of atomicity, you need to write your code to deal gracefully with the possibility of the counter being inaccurate. How hard that is depends a lot on your data model.

Robert
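
The counter-table approach described above could look roughly like the sketch below. It rests on several assumptions: the `row_counts` table, its column `n`, and the helper names are hypothetical; `session` is expected to behave like a DataStax Python driver session (`execute_async` returning a future with `result()`, `execute` returning a result with `one()`, `%s` placeholders); and `FakeSession` is an in-memory stand-in so the snippet runs without a cluster.

```python
from collections import namedtuple

# Hypothetical schema, created once (every non-key column of a counter
# table must itself be a counter):
#   CREATE TABLE row_counts (table_name text PRIMARY KEY, n counter);
INSERT_ROW = "INSERT INTO t (object_id) VALUES (%s)"
DELETE_ROW = "DELETE FROM t WHERE object_id = %s"
INCREMENT = "UPDATE row_counts SET n = n + 1 WHERE table_name = %s"
DECREMENT = "UPDATE row_counts SET n = n - 1 WHERE table_name = %s"
READ_COUNT = "SELECT n FROM row_counts WHERE table_name = %s"


def _run_both(session, statement, params, counter_statement):
    # Issue the write and the counter update without waiting in between,
    # then wait on both so that errors from either one surface.
    futures = [
        session.execute_async(statement, params),
        session.execute_async(counter_statement, ("t",)),
    ]
    for future in futures:
        future.result()


def insert_row(session, object_id):
    _run_both(session, INSERT_ROW, (object_id,), INCREMENT)


def delete_row(session, object_id):
    _run_both(session, DELETE_ROW, (object_id,), DECREMENT)


def read_count(session):
    row = session.execute(READ_COUNT, ("t",)).one()
    return row.n if row is not None else 0


# --- Hypothetical in-memory stand-ins, for illustration only ---
CountRow = namedtuple("CountRow", ["n"])


class FakeFuture:
    def result(self):
        return None


class FakeResult:
    def __init__(self, rows):
        self._rows = rows

    def one(self):
        return self._rows[0] if self._rows else None


class FakeSession:
    """Understands only the five statements defined above."""

    def __init__(self):
        self.keys = set()  # the rows actually stored in t
        self.count = 0     # the value of the counter column

    def execute_async(self, query, parameters=None):
        self.execute(query, parameters)
        return FakeFuture()

    def execute(self, query, parameters=None):
        if query == INSERT_ROW:
            self.keys.add(parameters[0])  # an insert overwrites an existing key
        elif query == DELETE_ROW:
            self.keys.discard(parameters[0])
        elif query == INCREMENT:
            self.count += 1
        elif query == DECREMENT:
            self.count -= 1
        elif query == READ_COUNT:
            return FakeResult([CountRow(self.count)])
        return FakeResult([])


if __name__ == "__main__":
    s = FakeSession()
    for oid in (1, 2, 2):
        insert_row(s, oid)
    print(read_count(s), len(s.keys))
```

Inserting key 2 twice leaves two rows in the table but a counter of three, because an INSERT in Cassandra overwrites an existing row with the same key. That is one concrete way the counter can drift, so code that relies on it should tolerate some inaccuracy.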
