quorum read timeout

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

quorum read timeout

Phillip Michalak
I'm running three Cassandra nodes in virtual machines.

During a 'get' operation from Cassandra-remote directed at one of these nodes, I'm receiving the following output

vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h 192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr' "ColumnPath('strings', None, 'id')" 2
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  self.message = message
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  self.message = iprot.readString();
Traceback (most recent call last):
  File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
    pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
  File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 182, in get
    return self.recv_get()
  File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 201, in recv_get
    raise x
thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  if self.message:
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  return self.message
: Internal error processing get


The same 'get' operation from Cassandra-remote directed at another of these nodes, yields 'normal' output

vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h 192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr' "ColumnPath('strings', None, 'id')" 2
Traceback (most recent call last):
  File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
    pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
  File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 182, in get
    return self.recv_get()
  File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 210, in recv_get
    raise result.nfe
ttypes.NotFoundException: NotFoundException()

Furthermore, querying the same column for (some) other keys is successful when no matter which node it is directed at.
Looking at the log for the node that produced the error from the query above:

DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java (line 221) get
DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line 420) strongread reading data for SliceByNamesReadCommand(table='MockElementLibrary', key='0401318uuuuruepwdcznr', columnParent='QueryPath(columnFamilyName='strings', superColumnName='null', columnName='null')', columns=[id,]) from 38184@null
DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line 427) strongread reading digest for SliceByNamesReadCommand(table='MockElementLibrary', key='0401318uuuuruepwdcznr', columnParent='QueryPath(columnFamilyName='strings', superColumnName='null', columnName='null')', columns=[id,]) from [hidden email]:7000
 WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619 MessageSerializationTask.java (line 81) Exception was generated at : 08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4

java.lang.NullPointerException
        at org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
        at org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
        at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
        at org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java (line 38) Processing response on a callback from [hidden email]:7000
ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line 608) Internal error processing get
java.lang.RuntimeException: java.util.concurrent.TimeoutException: Operation timed out - received only 1 responses from 192.168.133.129:7000 .
        at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
        at org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
        at org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.util.concurrent.TimeoutException: Operation timed out - received only 1 responses from 192.168.133.129:7000 .
        at org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
        at org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
        at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
        at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
        ... 7 more


It appears to me that there is a timeout during the QuorumResponseHandler processing, stemming from a NullPointerException that occurs as part of the read process. I suspect that this NullPointerException has something to do with the second DEBUG [pool-1-thread-22] comment regarding strongread ... from 38184@null.

Does anyone know why this might be happening?

Thanks for any insight,
Phil

Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Jonathan Ellis-3
Is this 0.3 or 0.4/trunk?

On Wed, Aug 19, 2009 at 2:36 PM, Phillip
Michalak<[hidden email]> wrote:

> I'm running three Cassandra nodes in virtual machines.
> During a 'get' operation from Cassandra-remote directed at one of these
> nodes, I'm receiving the following output
>
> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
> 192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
> "ColumnPath('strings', None, 'id')" 2
> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
> DeprecationWarning: BaseException.message has been deprecated as of Python
> 2.6
>   self.message = message
> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
> DeprecationWarning: BaseException.message has been deprecated as of Python
> 2.6
>   self.message = iprot.readString();
> Traceback (most recent call last):
>   File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
>     pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>   File
> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
> line 182, in get
>     return self.recv_get()
>   File
> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
> line 201, in recv_get
>     raise x
> thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76:
> DeprecationWarning: BaseException.message has been deprecated as of Python
> 2.6
>   if self.message:
> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
> DeprecationWarning: BaseException.message has been deprecated as of Python
> 2.6
>   return self.message
> : Internal error processing get
>
> The same 'get' operation from Cassandra-remote directed at another of these
> nodes, yields 'normal' output
> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
> 192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
> "ColumnPath('strings', None, 'id')" 2
> Traceback (most recent call last):
>   File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
>     pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>   File
> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
> line 182, in get
>     return self.recv_get()
>   File
> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
> line 210, in recv_get
>     raise result.nfe
> ttypes.NotFoundException: NotFoundException()
> Furthermore, querying the same column for (some) other keys is successful
> when no matter which node it is directed at.
> Looking at the log for the node that produced the error from the query
> above:
>
> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java (line
> 221) get
> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line
> 420) strongread reading data for
> SliceByNamesReadCommand(table='MockElementLibrary',
> key='0401318uuuuruepwdcznr',
> columnParent='QueryPath(columnFamilyName='strings', superColumnName='null',
> columnName='null')', columns=[id,]) from 38184@null
> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line
> 427) strongread reading digest for
> SliceByNamesReadCommand(table='MockElementLibrary',
> key='0401318uuuuruepwdcznr',
> columnParent='QueryPath(columnFamilyName='strings', superColumnName='null',
> columnName='null')', columns=[id,]) from 38185@192.168.133.129:7000
>  WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
> MessageSerializationTask.java (line 81) Exception was generated at :
> 08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
> java.lang.NullPointerException
>         at
> org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
>         at
> org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
>         at
> org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
>         at
> org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java
> (line 38) Processing response on a callback
> from 65B1E352-A0A3-1A7F-138B-9BEA3E1D787F@192.168.133.129:7000
> ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line 608)
> Internal error processing get
> java.lang.RuntimeException: java.util.concurrent.TimeoutException: Operation
> timed out - received only 1 responses from 192.168.133.129:7000 .
>         at
> org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
>         at
> org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
>         at
> org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
>         at
> org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
>         at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException: Operation timed out -
> received only 1 responses from 192.168.133.129:7000 .
>         at
> org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
>         at
> org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
>         at
> org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
>         at
> org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
>         ... 7 more
>
> It appears to me that there is a timeout during the QuorumResponseHandler
> processing, stemming from a NullPointerException that occurs as part of the
> read process. I suspect that this NullPointerException has something to do
> with the second DEBUG [pool-1-thread-22] comment regarding strongread ...
> from 38184@null.
> Does anyone know why this might be happening?
> Thanks for any insight,
> Phil
>
Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Phillip Michalak
It's cassandra-0.4-beta1.

Thanks!
Phil

On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:

> Is this 0.3 or 0.4/trunk?
>
> On Wed, Aug 19, 2009 at 2:36 PM, Phillip
> Michalak<[hidden email]> wrote:
>> I'm running three Cassandra nodes in virtual machines.
>> During a 'get' operation from Cassandra-remote directed at one of  
>> these
>> nodes, I'm receiving the following output
>>
>> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-
>> remote -h
>> 192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
>> "ColumnPath('strings', None, 'id')" 2
>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
>> DeprecationWarning: BaseException.message has been deprecated as of  
>> Python
>> 2.6
>>   self.message = message
>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
>> DeprecationWarning: BaseException.message has been deprecated as of  
>> Python
>> 2.6
>>   self.message = iprot.readString();
>> Traceback (most recent call last):
>>   File "interface/gen-py/cassandra/Cassandra-remote", line 93, in  
>> <module>
>>      
>> pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>>   File
>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/
>> Cassandra.py",
>> line 182, in get
>>     return self.recv_get()
>>   File
>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/
>> Cassandra.py",
>> line 201, in recv_get
>>     raise x
>> thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-
>> packages/thrift/Thrift.py:76:
>> DeprecationWarning: BaseException.message has been deprecated as of  
>> Python
>> 2.6
>>   if self.message:
>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
>> DeprecationWarning: BaseException.message has been deprecated as of  
>> Python
>> 2.6
>>   return self.message
>> : Internal error processing get
>>
>> The same 'get' operation from Cassandra-remote directed at another  
>> of these
>> nodes, yields 'normal' output
>> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-
>> remote -h
>> 192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
>> "ColumnPath('strings', None, 'id')" 2
>> Traceback (most recent call last):
>>   File "interface/gen-py/cassandra/Cassandra-remote", line 93, in  
>> <module>
>>      
>> pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>>   File
>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/
>> Cassandra.py",
>> line 182, in get
>>     return self.recv_get()
>>   File
>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/
>> Cassandra.py",
>> line 210, in recv_get
>>     raise result.nfe
>> ttypes.NotFoundException: NotFoundException()
>> Furthermore, querying the same column for (some) other keys is  
>> successful
>> when no matter which node it is directed at.
>> Looking at the log for the node that produced the error from the  
>> query
>> above:
>>
>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618  
>> CassandraServer.java (line
>> 221) get
>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java  
>> (line
>> 420) strongread reading data for
>> SliceByNamesReadCommand(table='MockElementLibrary',
>> key='0401318uuuuruepwdcznr',
>> columnParent='QueryPath(columnFamilyName='strings',  
>> superColumnName='null',
>> columnName='null')', columns=[id,]) from 38184@null
>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java  
>> (line
>> 427) strongread reading digest for
>> SliceByNamesReadCommand(table='MockElementLibrary',
>> key='0401318uuuuruepwdcznr',
>> columnParent='QueryPath(columnFamilyName='strings',  
>> superColumnName='null',
>> columnName='null')', columns=[id,]) from 38185@192.168.133.129:7000
>>  WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
>> MessageSerializationTask.java (line 81) Exception was generated at :
>> 08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
>> java.lang.NullPointerException
>>         at
>> org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
>>         at
>> org
>> .apache
>> .cassandra
>> .net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
>>         at
>> org
>> .apache
>> .cassandra.net.MessagingService.getConnection(MessagingService.java:
>> 306)
>>         at
>> org
>> .apache
>> .cassandra
>> .net.MessageSerializationTask.run(MessageSerializationTask.java:66)
>>         at
>> java.util.concurrent.ThreadPoolExecutor
>> $Worker.runTask(ThreadPoolExecutor.java:886)
>>         at
>> java.util.concurrent.ThreadPoolExecutor
>> $Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:619)
>> DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622  
>> ResponseVerbHandler.java
>> (line 38) Processing response on a callback
>> from 65B1E352-A0A3-1A7F-138B-9BEA3E1D787F@192.168.133.129:7000
>> ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java  
>> (line 608)
>> Internal error processing get
>> java.lang.RuntimeException: java.util.concurrent.TimeoutException:  
>> Operation
>> timed out - received only 1 responses from 192.168.133.129:7000 .
>>         at
>> org
>> .apache
>> .cassandra
>> .service.CassandraServer.readColumnFamily(CassandraServer.java:100)
>>         at
>> org
>> .apache.cassandra.service.CassandraServer.get(CassandraServer.java:
>> 226)
>>         at
>> org.apache.cassandra.service.Cassandra$Processor
>> $get.process(Cassandra.java:602)
>>         at
>> org.apache.cassandra.service.Cassandra
>> $Processor.process(Cassandra.java:560)
>>         at
>> org.apache.thrift.server.TThreadPoolServer
>> $WorkerProcess.run(TThreadPoolServer.java:252)
>>         at
>> java.util.concurrent.ThreadPoolExecutor
>> $Worker.runTask(ThreadPoolExecutor.java:886)
>>         at
>> java.util.concurrent.ThreadPoolExecutor
>> $Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:619)
>> Caused by: java.util.concurrent.TimeoutException: Operation timed  
>> out -
>> received only 1 responses from 192.168.133.129:7000 .
>>         at
>> org
>> .apache
>> .cassandra
>> .service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
>>         at
>> org
>> .apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:
>> 435)
>>         at
>> org
>> .apache
>> .cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
>>         at
>> org
>> .apache
>> .cassandra
>> .service.CassandraServer.readColumnFamily(CassandraServer.java:92)
>>         ... 7 more
>>
>> It appears to me that there is a timeout during the  
>> QuorumResponseHandler
>> processing, stemming from a NullPointerException that occurs as  
>> part of the
>> read process. I suspect that this NullPointerException has  
>> something to do
>> with the second DEBUG [pool-1-thread-22] comment regarding  
>> strongread ...
>> from 38184@null.
>> Does anyone know why this might be happening?
>> Thanks for any insight,
>> Phil
>>

Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Jonathan Ellis-3
Looks like a bug in TcpConnectionManager.  Can you file a ticket?

thanks,

-Jonathan

On Wed, Aug 19, 2009 at 2:49 PM, Phillip
Michalak<[hidden email]> wrote:

> It's cassandra-0.4-beta1.
>
> Thanks!
> Phil
>
> On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:
>
>> Is this 0.3 or 0.4/trunk?
>>
>> On Wed, Aug 19, 2009 at 2:36 PM, Phillip
>> Michalak<[hidden email]> wrote:
>>>
>>> I'm running three Cassandra nodes in virtual machines.
>>> During a 'get' operation from Cassandra-remote directed at one of these
>>> nodes, I'm receiving the following output
>>>
>>> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
>>> 192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
>>> "ColumnPath('strings', None, 'id')" 2
>>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
>>> DeprecationWarning: BaseException.message has been deprecated as of
>>> Python
>>> 2.6
>>>  self.message = message
>>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
>>> DeprecationWarning: BaseException.message has been deprecated as of
>>> Python
>>> 2.6
>>>  self.message = iprot.readString();
>>> Traceback (most recent call last):
>>>  File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
>>>    pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>>>  File
>>>
>>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
>>> line 182, in get
>>>    return self.recv_get()
>>>  File
>>>
>>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
>>> line 201, in recv_get
>>>    raise x
>>>
>>> thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76:
>>> DeprecationWarning: BaseException.message has been deprecated as of
>>> Python
>>> 2.6
>>>  if self.message:
>>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
>>> DeprecationWarning: BaseException.message has been deprecated as of
>>> Python
>>> 2.6
>>>  return self.message
>>> : Internal error processing get
>>>
>>> The same 'get' operation from Cassandra-remote directed at another of
>>> these
>>> nodes, yields 'normal' output
>>> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
>>> 192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
>>> "ColumnPath('strings', None, 'id')" 2
>>> Traceback (most recent call last):
>>>  File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
>>>    pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>>>  File
>>>
>>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
>>> line 182, in get
>>>    return self.recv_get()
>>>  File
>>>
>>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
>>> line 210, in recv_get
>>>    raise result.nfe
>>> ttypes.NotFoundException: NotFoundException()
>>> Furthermore, querying the same column for (some) other keys is successful
>>> when no matter which node it is directed at.
>>> Looking at the log for the node that produced the error from the query
>>> above:
>>>
>>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java
>>> (line
>>> 221) get
>>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line
>>> 420) strongread reading data for
>>> SliceByNamesReadCommand(table='MockElementLibrary',
>>> key='0401318uuuuruepwdcznr',
>>> columnParent='QueryPath(columnFamilyName='strings',
>>> superColumnName='null',
>>> columnName='null')', columns=[id,]) from 38184@null
>>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line
>>> 427) strongread reading digest for
>>> SliceByNamesReadCommand(table='MockElementLibrary',
>>> key='0401318uuuuruepwdcznr',
>>> columnParent='QueryPath(columnFamilyName='strings',
>>> superColumnName='null',
>>> columnName='null')', columns=[id,]) from 38185@192.168.133.129:7000
>>>  WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
>>> MessageSerializationTask.java (line 81) Exception was generated at :
>>> 08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
>>> java.lang.NullPointerException
>>>        at
>>> org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
>>>        at
>>>
>>> org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
>>>        at
>>>
>>> org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
>>>        at
>>>
>>> org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
>>>        at
>>>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>        at
>>>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>        at java.lang.Thread.run(Thread.java:619)
>>> DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java
>>> (line 38) Processing response on a callback
>>> from 65B1E352-A0A3-1A7F-138B-9BEA3E1D787F@192.168.133.129:7000
>>> ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line
>>> 608)
>>> Internal error processing get
>>> java.lang.RuntimeException: java.util.concurrent.TimeoutException:
>>> Operation
>>> timed out - received only 1 responses from 192.168.133.129:7000 .
>>>        at
>>>
>>> org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
>>>        at
>>>
>>> org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
>>>        at
>>>
>>> org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
>>>        at
>>>
>>> org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
>>>        at
>>>
>>> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
>>>        at
>>>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>        at
>>>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>        at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.util.concurrent.TimeoutException: Operation timed out -
>>> received only 1 responses from 192.168.133.129:7000 .
>>>        at
>>>
>>> org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
>>>        at
>>>
>>> org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
>>>        at
>>>
>>> org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
>>>        at
>>>
>>> org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
>>>        ... 7 more
>>>
>>> It appears to me that there is a timeout during the QuorumResponseHandler
>>> processing, stemming from a NullPointerException that occurs as part of
>>> the
>>> read process. I suspect that this NullPointerException has something to
>>> do
>>> with the second DEBUG [pool-1-thread-22] comment regarding strongread ...
>>> from 38184@null.
>>> Does anyone know why this might be happening?
>>> Thanks for any insight,
>>> Phil
>>>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Phillip Michalak
Sure thing. Filed as  https://issues.apache.org/jira/browse/CASSANDRA-381

Thanks,
Phil

On Aug 19, 2009, at 4:54 PM, Jonathan Ellis wrote:

Looks like a bug in TcpConnectionManager.  Can you file a ticket?

thanks,

-Jonathan

On Wed, Aug 19, 2009 at 2:49 PM, Phillip
Michalak<[hidden email]> wrote:
It's cassandra-0.4-beta1.

Thanks!
Phil

On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:

Is this 0.3 or 0.4/trunk?

On Wed, Aug 19, 2009 at 2:36 PM, Phillip
Michalak<[hidden email]> wrote:

I'm running three Cassandra nodes in virtual machines.
During a 'get' operation from Cassandra-remote directed at one of these
nodes, I'm receiving the following output

vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
"ColumnPath('strings', None, 'id')" 2
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
DeprecationWarning: BaseException.message has been deprecated as of
Python
2.6
 self.message = message
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
DeprecationWarning: BaseException.message has been deprecated as of
Python
2.6
 self.message = iprot.readString();
Traceback (most recent call last):
 File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
   pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
 File

"/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
line 182, in get
   return self.recv_get()
 File

"/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
line 201, in recv_get
   raise x

thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76:
DeprecationWarning: BaseException.message has been deprecated as of
Python
2.6
 if self.message:
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
DeprecationWarning: BaseException.message has been deprecated as of
Python
2.6
 return self.message
: Internal error processing get

The same 'get' operation from Cassandra-remote directed at another of
these
nodes, yields 'normal' output
vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
"ColumnPath('strings', None, 'id')" 2
Traceback (most recent call last):
 File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
   pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
 File

"/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
line 182, in get
   return self.recv_get()
 File

"/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
line 210, in recv_get
   raise result.nfe
ttypes.NotFoundException: NotFoundException()
Furthermore, querying the same column for (some) other keys is successful
when no matter which node it is directed at.
Looking at the log for the node that produced the error from the query
above:

DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java
(line
221) get
DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line
420) strongread reading data for
SliceByNamesReadCommand(table='MockElementLibrary',
key='0401318uuuuruepwdcznr',
columnParent='QueryPath(columnFamilyName='strings',
superColumnName='null',
columnName='null')', columns=[id,]) from 38184@null
DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line
427) strongread reading digest for
SliceByNamesReadCommand(table='MockElementLibrary',
key='0401318uuuuruepwdcznr',
columnParent='QueryPath(columnFamilyName='strings',
superColumnName='null',
columnName='null')', columns=[id,]) from [hidden email]:7000
 WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
MessageSerializationTask.java (line 81) Exception was generated at :
08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
java.lang.NullPointerException
       at
org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
       at

org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
       at

org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
       at

org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
       at

java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:619)
DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java
(line 38) Processing response on a callback
from [hidden email]:7000
ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line
608)
Internal error processing get
java.lang.RuntimeException: java.util.concurrent.TimeoutException:
Operation
timed out - received only 1 responses from 192.168.133.129:7000 .
       at

org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
       at

org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
       at

org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
       at

org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
       at

org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
       at

java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:619)
Caused by: java.util.concurrent.TimeoutException: Operation timed out -
received only 1 responses from 192.168.133.129:7000 .
       at

org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
       at

org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
       at

org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
       at

org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
       ... 7 more

It appears to me that there is a timeout during the QuorumResponseHandler
processing, stemming from a NullPointerException that occurs as part of
the
read process. I suspect that this NullPointerException has something to
do
with the second DEBUG [pool-1-thread-22] comment regarding strongread ...
from 38184@null.
Does anyone know why this might be happening?
Thanks for any insight,
Phil




Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Jun Rao
In reply to this post by Phillip Michalak

The following line from the log likely shows the problem. It looks like that a datanode is somehow null, which shouldn't happen.

      columnParent='QueryPath(columnFamilyName='strings', superColumnName='null', columnName='null')', columns=[id,]) from 38184@null

Jun
IBM Almaden Research Center
K55/B1, 650 Harry Road, San Jose, CA 95120-6099

[hidden email]


Phillip Michalak <[hidden email]> wrote on 08/19/2009 02:36:13 PM:

> [image removed]

>
> quorum read timeout

>
> Phillip Michalak

>
> to:

>
> cassandra-user

>
> 08/19/2009 02:37 PM

>
> Please respond to cassandra-user

>
> I'm running three Cassandra nodes in virtual machines.

>
> During a 'get' operation from Cassandra-remote directed at one of
> these nodes, I'm receiving the following output

>
> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-
> remote -h 192.168.133.130:9160 get 'MockElementLibrary'
> '0401318uuuuruepwdcznr' "ColumnPath('strings', None, 'id')" 2

> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
> DeprecationWarning: BaseException.message has been deprecated as of Python 2.6

>   self.message = message
> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
> DeprecationWarning: BaseException.message has been deprecated as of Python 2.6

>   self.message = iprot.readString();
> Traceback (most recent call last):
>   File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
>     pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>   File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/
> cassandra/Cassandra.py", line 182, in get

>     return self.recv_get()
>   File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/
> cassandra/Cassandra.py", line 201, in recv_get

>     raise x
> thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-
> packages/thrift/Thrift.py:76: DeprecationWarning:
> BaseException.message has been deprecated as of Python 2.6

>   if self.message:
> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
> DeprecationWarning: BaseException.message has been deprecated as of Python 2.6

>   return self.message
> : Internal error processing get
>
> The same 'get' operation from Cassandra-remote directed at another
> of these nodes, yields 'normal' output

>
> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-
> remote -h 192.168.133.129:9160 get 'MockElementLibrary'
> '0401318uuuuruepwdcznr' "ColumnPath('strings', None, 'id')" 2

> Traceback (most recent call last):
>   File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
>     pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>   File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/
> cassandra/Cassandra.py", line 182, in get

>     return self.recv_get()
>   File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/
> cassandra/Cassandra.py", line 210, in recv_get

>     raise result.nfe
> ttypes.NotFoundException: NotFoundException()
>
> Furthermore, querying the same column for (some) other keys is
> successful when no matter which node it is directed at.

> Looking at the log for the node that produced the error from the query above:
>
> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618
> CassandraServer.java (line 221) get

> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java
> (line 420) strongread reading data for SliceByNamesReadCommand
> (table='MockElementLibrary', key='0401318uuuuruepwdcznr',
> columnParent='QueryPath(columnFamilyName='strings',
> superColumnName='null', columnName='null')', columns=[id,]) from 38184@null

> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java
> (line 427) strongread reading digest for SliceByNamesReadCommand
> (table='MockElementLibrary', key='0401318uuuuruepwdcznr',
> columnParent='QueryPath(columnFamilyName='strings',
> superColumnName='null', columnName='null')', columns=[id,]) from
> 38185@192.168.133.129:7000

>  WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
> MessageSerializationTask.java (line 81) Exception was generated at :
> 08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4

>
> java.lang.NullPointerException

>         at org.apache.cassandra.net.TcpConnection.<init>
> (TcpConnection.java:83)

>         at
> org.apache.cassandra.net.TcpConnectionManager.getConnection
> (TcpConnectionManager.java:64)

>         at org.apache.cassandra.net.MessagingService.getConnection
> (MessagingService.java:306)

>         at org.apache.cassandra.net.MessageSerializationTask.run
> (MessageSerializationTask.java:66)

>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
> (ThreadPoolExecutor.java:886)

>         at java.util.concurrent.ThreadPoolExecutor$Worker.run
> (ThreadPoolExecutor.java:908)

>         at java.lang.Thread.run(Thread.java:619)
>
> DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622
> ResponseVerbHandler.java (line 38) Processing response on a callback from
> 65B1E352-A0A3-1A7F-138B-9BEA3E1D787F@192.168.133.129:7000

> ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java
> (line 608) Internal error processing get

> java.lang.RuntimeException: java.util.concurrent.TimeoutException:
> Operation timed out - received only 1 responses from 192.168.133.129:7000 .

>         at
> org.apache.cassandra.service.CassandraServer.readColumnFamily
> (CassandraServer.java:100)

>         at org.apache.cassandra.service.CassandraServer.get
> (CassandraServer.java:226)

>         at org.apache.cassandra.service.Cassandra$Processor
> $get.process(Cassandra.java:602)

>         at org.apache.cassandra.service.Cassandra$Processor.process
> (Cassandra.java:560)

>         at org.apache.thrift.server.TThreadPoolServer
> $WorkerProcess.run(TThreadPoolServer.java:252)

>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
> (ThreadPoolExecutor.java:886)

>         at java.util.concurrent.ThreadPoolExecutor$Worker.run
> (ThreadPoolExecutor.java:908)

>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException: Operation timed
> out - received only 1 responses from 192.168.133.129:7000 .

>         at org.apache.cassandra.service.QuorumResponseHandler.get
> (QuorumResponseHandler.java:86)

>         at org.apache.cassandra.service.StorageProxy.strongRead
> (StorageProxy.java:435)

>         at org.apache.cassandra.service.StorageProxy.readProtocol
> (StorageProxy.java:330)

>         at
> org.apache.cassandra.service.CassandraServer.readColumnFamily
> (CassandraServer.java:92)

>         ... 7 more
>
> It appears to me that there is a timeout during the
> QuorumResponseHandler processing, stemming from a
> NullPointerException that occurs as part of the read process. I
> suspect that this NullPointerException has something to do with the
> second DEBUG [pool-1-thread-22] comment regarding strongread ...
> from 38184@null.

>
> Does anyone know why this might be happening?

>
> Thanks for any insight,

> Phil
Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Phillip Michalak
Jun,

That was my suspicion too. I'm trying to figure out whether anything in my configuration might be causing that.

Actually, seeing two or more storage-conf.xml files from a working configuration would be useful for identifying or ruling out configuration issues. Can anyone point me to some?

Thanks,
Phil

On Aug 19, 2009, at 6:42 PM, Jun Rao wrote:

The following line from the log likely shows the problem. It looks like that a datanode is somehow null, which shouldn't happen.

      columnParent='QueryPath(columnFamilyName='strings', superColumnName='null', columnName='null')', columns=[id,]) from 38184@null

Jun
IBM Almaden Research Center
K55/B1, 650 Harry Road, San Jose, CA 95120-6099

[hidden email]



Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Phillip Michalak
In reply to this post by Phillip Michalak
I can't reproduce this after cleaning out the data/ directories (and commit logs) and restarting the cluster. I suspect that I had gotten to an inconsistent state as I was playing around with the keyspace column families.

I've updated the bug report with a comment to this effect, but have left it open for someone with more experience to close it out.

Cheers,
Phil

On Aug 19, 2009, at 5:14 PM, Phillip Michalak wrote:

Sure thing. Filed as  https://issues.apache.org/jira/browse/CASSANDRA-381

Thanks,
Phil

On Aug 19, 2009, at 4:54 PM, Jonathan Ellis wrote:

Looks like a bug in TcpConnectionManager.  Can you file a ticket?

thanks,

-Jonathan

On Wed, Aug 19, 2009 at 2:49 PM, Phillip
Michalak<[hidden email]> wrote:
It's cassandra-0.4-beta1.

Thanks!
Phil

On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:

Is this 0.3 or 0.4/trunk?

On Wed, Aug 19, 2009 at 2:36 PM, Phillip
Michalak<[hidden email]> wrote:

I'm running three Cassandra nodes in virtual machines.
During a 'get' operation from Cassandra-remote directed at one of these
nodes, I'm receiving the following output

vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
"ColumnPath('strings', None, 'id')" 2
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
DeprecationWarning: BaseException.message has been deprecated as of
Python
2.6
 self.message = message
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
DeprecationWarning: BaseException.message has been deprecated as of
Python
2.6
 self.message = iprot.readString();
Traceback (most recent call last):
 File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
   pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
 File

"/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
line 182, in get
   return self.recv_get()
 File

"/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
line 201, in recv_get
   raise x

thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76:
DeprecationWarning: BaseException.message has been deprecated as of
Python
2.6
 if self.message:
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
DeprecationWarning: BaseException.message has been deprecated as of
Python
2.6
 return self.message
: Internal error processing get

The same 'get' operation from Cassandra-remote directed at another of
these
nodes, yields 'normal' output
vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
"ColumnPath('strings', None, 'id')" 2
Traceback (most recent call last):
 File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
   pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
 File

"/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
line 182, in get
   return self.recv_get()
 File

"/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
line 210, in recv_get
   raise result.nfe
ttypes.NotFoundException: NotFoundException()
Furthermore, querying the same column for (some) other keys is successful
when no matter which node it is directed at.
Looking at the log for the node that produced the error from the query
above:

DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java
(line
221) get
DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line
420) strongread reading data for
SliceByNamesReadCommand(table='MockElementLibrary',
key='0401318uuuuruepwdcznr',
columnParent='QueryPath(columnFamilyName='strings',
superColumnName='null',
columnName='null')', columns=[id,]) from 38184@null
DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line
427) strongread reading digest for
SliceByNamesReadCommand(table='MockElementLibrary',
key='0401318uuuuruepwdcznr',
columnParent='QueryPath(columnFamilyName='strings',
superColumnName='null',
columnName='null')', columns=[id,]) from [hidden email]:7000
 WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
MessageSerializationTask.java (line 81) Exception was generated at :
08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
java.lang.NullPointerException
       at
org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
       at

org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
       at

org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
       at

org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
       at

java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:619)
DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java
(line 38) Processing response on a callback
from [hidden email]:7000
ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line
608)
Internal error processing get
java.lang.RuntimeException: java.util.concurrent.TimeoutException:
Operation
timed out - received only 1 responses from 192.168.133.129:7000 .
       at

org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
       at

org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
       at

org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
       at

org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
       at

org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
       at

java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:619)
Caused by: java.util.concurrent.TimeoutException: Operation timed out -
received only 1 responses from 192.168.133.129:7000 .
       at

org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
       at

org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
       at

org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
       at

org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
       ... 7 more

It appears to me that there is a timeout during the QuorumResponseHandler
processing, stemming from a NullPointerException that occurs as part of
the
read process. I suspect that this NullPointerException has something to
do
with the second DEBUG [pool-1-thread-22] comment regarding strongread ...
from 38184@null.
Does anyone know why this might be happening?
Thanks for any insight,
Phil





Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Jun Rao

You should be aware that Cassandra doesn't support changing column families without reloading the data.

Jun
IBM Almaden Research Center
K55/B1, 650 Harry Road, San Jose, CA 95120-6099

[hidden email]


Inactive hide details for Phillip Michalak ---08/20/2009 04:16:44 PM---I can't reproduce this after cleaning out the data/ direPhillip Michalak ---08/20/2009 04:16:44 PM---I can't reproduce this after cleaning out the data/ directories (and commit logs) and restarting t


From:

Phillip Michalak <[hidden email]>

To:

[hidden email]

Date:

08/20/2009 04:16 PM

Subject:

Re: quorum read timeout




I can't reproduce this after cleaning out the data/ directories (and commit logs) and restarting the cluster. I suspect that I had gotten to an inconsistent state as I was playing around with the keyspace column families.

I've updated the bug report with a comment to this effect, but have left it open for someone with more experience to close it out.

Cheers,
Phil

On Aug 19, 2009, at 5:14 PM, Phillip Michalak wrote:
      Sure thing. Filed as https://issues.apache.org/jira/browse/CASSANDRA-381

      Thanks,
      Phil

      On Aug 19, 2009, at 4:54 PM, Jonathan Ellis wrote:
          Looks like a bug in TcpConnectionManager. Can you file a ticket?

          thanks,

          -Jonathan

          On Wed, Aug 19, 2009 at 2:49 PM, Phillip
          Michalak<
          [hidden email]> wrote:
              It's cassandra-0.4-beta1.

              Thanks!
              Phil

              On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:
                  Is this 0.3 or 0.4/trunk?

                  On Wed, Aug 19, 2009 at 2:36 PM, Phillip
                  Michalak<[hidden email]> wrote:

                      I'm running three Cassandra nodes in virtual machines.
                      During a 'get' operation from Cassandra-remote directed at one of these
                      nodes, I'm receiving the following output

                      vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
                      192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
                      "ColumnPath('strings', None, 'id')" 2
                      /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
                      DeprecationWarning: BaseException.message has been deprecated as of
                      Python
                      2.6
                      self.message = message
                      /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
                      DeprecationWarning: BaseException.message has been deprecated as of
                      Python
                      2.6
                      self.message = iprot.readString();
                      Traceback (most recent call last):
                      File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
                      pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
                      File

                      "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                      line 182, in get
                      return self.recv_get()
                      File

                      "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                      line 201, in recv_get
                      raise x

                      thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76:
                      DeprecationWarning: BaseException.message has been deprecated as of
                      Python
                      2.6
                      if self.message:
                      /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
                      DeprecationWarning: BaseException.message has been deprecated as of
                      Python
                      2.6
                      return self.message
                      : Internal error processing get

                      The same 'get' operation from Cassandra-remote directed at another of
                      these
                      nodes, yields 'normal' output
                      vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
                      192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
                      "ColumnPath('strings', None, 'id')" 2
                      Traceback (most recent call last):
                      File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
                      pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
                      File

                      "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                      line 182, in get
                      return self.recv_get()
                      File

                      "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                      line 210, in recv_get
                      raise result.nfe
                      ttypes.NotFoundException: NotFoundException()
                      Furthermore, querying the same column for (some) other keys is successful
                      when no matter which node it is directed at.
                      Looking at the log for the node that produced the error from the query
                      above:

                      DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java
                      (line
                      221) get
                      DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line
                      420) strongread reading data for
                      SliceByNamesReadCommand(table='MockElementLibrary',
                      key='0401318uuuuruepwdcznr',
                      columnParent='QueryPath(columnFamilyName='strings',
                      superColumnName='null',
                      columnName='null')', columns=[id,]) from 38184@null
                      DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line
                      427) strongread reading digest for
                      SliceByNamesReadCommand(table='MockElementLibrary',
                      key='0401318uuuuruepwdcznr',
                      columnParent='QueryPath(columnFamilyName='strings',
                      superColumnName='null',
                      columnName='null')', columns=[id,]) from [hidden email]:7000
                      WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
                      MessageSerializationTask.java (line 81) Exception was generated at :
                      08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
                      java.lang.NullPointerException
                      at
                      org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
                      at

                      org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
                      at

                      org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
                      at

                      org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
                      at

                      java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                      at

                      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                      at java.lang.Thread.run(Thread.java:619)
                      DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java
                      (line 38) Processing response on a callback
                      from [hidden email]:7000
                      ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line
                      608)
                      Internal error processing get
                      java.lang.RuntimeException: java.util.concurrent.TimeoutException:
                      Operation
                      timed out - received only 1 responses from 192.168.133.129:7000 .
                      at

                      org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
                      at

                      org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
                      at

                      org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
                      at

                      org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
                      at

                      org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
                      at

                      java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                      at

                      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                      at java.lang.Thread.run(Thread.java:619)
                      Caused by: java.util.concurrent.TimeoutException: Operation timed out -
                      received only 1 responses from 192.168.133.129:7000 .
                      at

                      org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
                      at

                      org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
                      at

                      org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
                      at

                      org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
                      ... 7 more

                      It appears to me that there is a timeout during the QuorumResponseHandler
                      processing, stemming from a NullPointerException that occurs as part of
                      the
                      read process. I suspect that this NullPointerException has something to
                      do
                      with the second DEBUG [pool-1-thread-22] comment regarding strongread ...
                      from 38184@null.
                      Does anyone know why this might be happening?
                      Thanks for any insight,
                      Phil



Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Evan Weaver
http://issues.apache.org/jira/browse/CASSANDRA-44

On Thu, Aug 20, 2009 at 4:37 PM, Jun Rao<[hidden email]> wrote:

> You should be aware that Cassandra doesn't support changing column families
> without reloading the data.
>
> Jun
> IBM Almaden Research Center
> K55/B1, 650 Harry Road, San Jose, CA 95120-6099
>
> [hidden email]
>
>
> Phillip Michalak ---08/20/2009 04:16:44 PM---I can't reproduce this after
> cleaning out the data/ directories (and commit logs) and restarting t
>
>
> From:
> Phillip Michalak <[hidden email]>
> To:
> [hidden email]
> Date:
> 08/20/2009 04:16 PM
> Subject:
> Re: quorum read timeout
> ________________________________
>
>
> I can't reproduce this after cleaning out the data/ directories (and commit
> logs) and restarting the cluster. I suspect that I had gotten to an
> inconsistent state as I was playing around with the keyspace column
> families.
>
> I've updated the bug report with a comment to this effect, but have left it
> open for someone with more experience to close it out.
>
> Cheers,
> Phil
>
> On Aug 19, 2009, at 5:14 PM, Phillip Michalak wrote:
>
> Sure thing. Filed as https://issues.apache.org/jira/browse/CASSANDRA-381
>
> Thanks,
> Phil
>
> On Aug 19, 2009, at 4:54 PM, Jonathan Ellis wrote:
>
> Looks like a bug in TcpConnectionManager. Can you file a ticket?
>
> thanks,
>
> -Jonathan
>
> On Wed, Aug 19, 2009 at 2:49 PM, Phillip
> Michalak<[hidden email]> wrote:
>
> It's cassandra-0.4-beta1.
>
> Thanks!
> Phil
>
> On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:
>
> Is this 0.3 or 0.4/trunk?
>
> On Wed, Aug 19, 2009 at 2:36 PM, Phillip
> Michalak<[hidden email]> wrote:
>
> I'm running three Cassandra nodes in virtual machines.
> During a 'get' operation from Cassandra-remote directed at one of these
> nodes, I'm receiving the following output
>
> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
> 192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
> "ColumnPath('strings', None, 'id')" 2
> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
> DeprecationWarning: BaseException.message has been deprecated as of
> Python
> 2.6
> self.message = message
> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
> DeprecationWarning: BaseException.message has been deprecated as of
> Python
> 2.6
> self.message = iprot.readString();
> Traceback (most recent call last):
> File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
> pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
> File
>
> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
> line 182, in get
> return self.recv_get()
> File
>
> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
> line 201, in recv_get
> raise x
>
> thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76:
> DeprecationWarning: BaseException.message has been deprecated as of
> Python
> 2.6
> if self.message:
> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
> DeprecationWarning: BaseException.message has been deprecated as of
> Python
> 2.6
> return self.message
> : Internal error processing get
>
> The same 'get' operation from Cassandra-remote directed at another of
> these
> nodes, yields 'normal' output
> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
> 192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
> "ColumnPath('strings', None, 'id')" 2
> Traceback (most recent call last):
> File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
> pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
> File
>
> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
> line 182, in get
> return self.recv_get()
> File
>
> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
> line 210, in recv_get
> raise result.nfe
> ttypes.NotFoundException: NotFoundException()
> Furthermore, querying the same column for (some) other keys is successful
> when no matter which node it is directed at.
> Looking at the log for the node that produced the error from the query
> above:
>
> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java
> (line
> 221) get
> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line
> 420) strongread reading data for
> SliceByNamesReadCommand(table='MockElementLibrary',
> key='0401318uuuuruepwdcznr',
> columnParent='QueryPath(columnFamilyName='strings',
> superColumnName='null',
> columnName='null')', columns=[id,]) from 38184@null
> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line
> 427) strongread reading digest for
> SliceByNamesReadCommand(table='MockElementLibrary',
> key='0401318uuuuruepwdcznr',
> columnParent='QueryPath(columnFamilyName='strings',
> superColumnName='null',
> columnName='null')', columns=[id,]) from 38185@192.168.133.129:7000
> WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
> MessageSerializationTask.java (line 81) Exception was generated at :
> 08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
> java.lang.NullPointerException
> at
> org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
> at
>
> org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
> at
>
> org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
> at
>
> org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java
> (line 38) Processing response on a callback
> from 65B1E352-A0A3-1A7F-138B-9BEA3E1D787F@192.168.133.129:7000
> ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line
> 608)
> Internal error processing get
> java.lang.RuntimeException: java.util.concurrent.TimeoutException:
> Operation
> timed out - received only 1 responses from 192.168.133.129:7000 .
> at
>
> org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
> at
>
> org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
> at
>
> org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
> at
>
> org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
> at
>
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException: Operation timed out -
> received only 1 responses from 192.168.133.129:7000 .
> at
>
> org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
> at
>
> org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
> at
>
> org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
> at
>
> org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
> ... 7 more
>
> It appears to me that there is a timeout during the QuorumResponseHandler
> processing, stemming from a NullPointerException that occurs as part of
> the
> read process. I suspect that this NullPointerException has something to
> do
> with the second DEBUG [pool-1-thread-22] comment regarding strongread ...
> from 38184@null.
> Does anyone know why this might be happening?
> Thanks for any insight,
> Phil
>
>
>
>



--
Evan Weaver
Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Michael Greene
In reply to this post by Jun Rao
That's not precisely true.  See https://issues.apache.org/jira/browse/CASSANDRA-44 for techniques that can be used to modify the current column families.  Eventually this will be made more dynamic.

Michael

On Thu, Aug 20, 2009 at 6:37 PM, Jun Rao <[hidden email]> wrote:

You should be aware that Cassandra doesn't support changing column families without reloading the data.



Jun
IBM Almaden Research Center
K55/B1, 650 Harry Road, San Jose, CA 95120-6099

[hidden email]


Inactive hide details for Phillip Michalak ---08/20/2009 04:16:44 PM---I can't reproduce this after cleaning out the data/ direPhillip Michalak ---08/20/2009 04:16:44 PM---I can't reproduce this after cleaning out the data/ directories (and commit logs) and restarting t


From:

Phillip Michalak <[hidden email]>

To:

[hidden email]

Date:

08/20/2009 04:16 PM

Subject:

Re: quorum read timeout




I can't reproduce this after cleaning out the data/ directories (and commit logs) and restarting the cluster. I suspect that I had gotten to an inconsistent state as I was playing around with the keyspace column families.

I've updated the bug report with a comment to this effect, but have left it open for someone with more experience to close it out.

Cheers,
Phil

On Aug 19, 2009, at 5:14 PM, Phillip Michalak wrote:
      Sure thing. Filed as https://issues.apache.org/jira/browse/CASSANDRA-381

      Thanks,
      Phil

      On Aug 19, 2009, at 4:54 PM, Jonathan Ellis wrote:
          Looks like a bug in TcpConnectionManager. Can you file a ticket?

          thanks,

          -Jonathan

          On Wed, Aug 19, 2009 at 2:49 PM, Phillip
          Michalak<
          [hidden email]> wrote:
              It's cassandra-0.4-beta1.

              Thanks!
              Phil

              On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:
                  Is this 0.3 or 0.4/trunk?

                  On Wed, Aug 19, 2009 at 2:36 PM, Phillip
                  Michalak<[hidden email]> wrote:

                      I'm running three Cassandra nodes in virtual machines.
                      During a 'get' operation from Cassandra-remote directed at one of these
                      nodes, I'm receiving the following output

                      vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
                      192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
                      "ColumnPath('strings', None, 'id')" 2
                      /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
                      DeprecationWarning: BaseException.message has been deprecated as of
                      Python
                      2.6
                      self.message = message
                      /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
                      DeprecationWarning: BaseException.message has been deprecated as of
                      Python
                      2.6
                      self.message = iprot.readString();
                      Traceback (most recent call last):
                      File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
                      pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
                      File

                      "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                      line 182, in get
                      return self.recv_get()
                      File

                      "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                      line 201, in recv_get
                      raise x

                      thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76:
                      DeprecationWarning: BaseException.message has been deprecated as of
                      Python
                      2.6
                      if self.message:
                      /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
                      DeprecationWarning: BaseException.message has been deprecated as of
                      Python
                      2.6
                      return self.message
                      : Internal error processing get

                      The same 'get' operation from Cassandra-remote directed at another of
                      these
                      nodes, yields 'normal' output
                      vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h
                      192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
                      "ColumnPath('strings', None, 'id')" 2
                      Traceback (most recent call last):
                      File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
                      pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
                      File

                      "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                      line 182, in get
                      return self.recv_get()
                      File

                      "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                      line 210, in recv_get
                      raise result.nfe
                      ttypes.NotFoundException: NotFoundException()
                      Furthermore, querying the same column for (some) other keys is successful
                      when no matter which node it is directed at.
                      Looking at the log for the node that produced the error from the query
                      above:

                      DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java
                      (line
                      221) get
                      DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line
                      420) strongread reading data for
                      SliceByNamesReadCommand(table='MockElementLibrary',
                      key='0401318uuuuruepwdcznr',
                      columnParent='QueryPath(columnFamilyName='strings',
                      superColumnName='null',
                      columnName='null')', columns=[id,]) from 38184@null
                      DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line
                      427) strongread reading digest for
                      SliceByNamesReadCommand(table='MockElementLibrary',
                      key='0401318uuuuruepwdcznr',
                      columnParent='QueryPath(columnFamilyName='strings',
                      superColumnName='null',
                      columnName='null')', columns=[id,]) from [hidden email]:7000
                      WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
                      MessageSerializationTask.java (line 81) Exception was generated at :
                      08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
                      java.lang.NullPointerException
                      at
                      org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
                      at

                      org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
                      at

                      org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
                      at

                      org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
                      at

                      java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                      at

                      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                      at java.lang.Thread.run(Thread.java:619)
                      DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java
                      (line 38) Processing response on a callback
                      from [hidden email]:7000
                      ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line
                      608)
                      Internal error processing get
                      java.lang.RuntimeException: java.util.concurrent.TimeoutException:
                      Operation
                      timed out - received only 1 responses from 192.168.133.129:7000 .
                      at

                      org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
                      at

                      org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
                      at

                      org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
                      at

                      org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
                      at

                      org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
                      at

                      java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                      at

                      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                      at java.lang.Thread.run(Thread.java:619)
                      Caused by: java.util.concurrent.TimeoutException: Operation timed out -
                      received only 1 responses from 192.168.133.129:7000 .
                      at

                      org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
                      at

                      org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
                      at

                      org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
                      at

                      org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
                      ... 7 more

                      It appears to me that there is a timeout during the QuorumResponseHandler
                      processing, stemming from a NullPointerException that occurs as part of
                      the
                      read process. I suspect that this NullPointerException has something to
                      do
                      with the second DEBUG [pool-1-thread-22] comment regarding strongread ...
                      from 38184@null.
                      Does anyone know why this might be happening?
                      Thanks for any insight,
                      Phil




Reply | Threaded
Open this post in threaded view
|

Re: quorum read timeout

Phillip Michalak
In reply to this post by Evan Weaver
Thanks, Evan and Jun. That's exactly where I ended up after taking a  
step back and looking at the issue again.

Phil

On Aug 20, 2009, at 6:41 PM, "Evan Weaver" <[hidden email]> wrote:

> http://issues.apache.org/jira/browse/CASSANDRA-44
>
> On Thu, Aug 20, 2009 at 4:37 PM, Jun Rao<[hidden email]>  
> wrote:
>> You should be aware that Cassandra doesn't support changing column  
>> families
>> without reloading the data.
>>
>> Jun
>> IBM Almaden Research Center
>> K55/B1, 650 Harry Road, San Jose, CA 95120-6099
>>
>> [hidden email]
>>
>>
>> Phillip Michalak ---08/20/2009 04:16:44 PM---I can't reproduce this  
>> after
>> cleaning out the data/ directories (and commit logs) and restarting t
>>
>>
>> From:
>> Phillip Michalak <[hidden email]>
>> To:
>> [hidden email]
>> Date:
>> 08/20/2009 04:16 PM
>> Subject:
>> Re: quorum read timeout
>> ________________________________
>>
>>
>> I can't reproduce this after cleaning out the data/ directories  
>> (and commit
>> logs) and restarting the cluster. I suspect that I had gotten to an
>> inconsistent state as I was playing around with the keyspace column
>> families.
>>
>> I've updated the bug report with a comment to this effect, but have  
>> left it
>> open for someone with more experience to close it out.
>>
>> Cheers,
>> Phil
>>
>> On Aug 19, 2009, at 5:14 PM, Phillip Michalak wrote:
>>
>> Sure thing. Filed as https://issues.apache.org/jira/browse/CASSANDRA-381
>>
>> Thanks,
>> Phil
>>
>> On Aug 19, 2009, at 4:54 PM, Jonathan Ellis wrote:
>>
>> Looks like a bug in TcpConnectionManager. Can you file a ticket?
>>
>> thanks,
>>
>> -Jonathan
>>
>> On Wed, Aug 19, 2009 at 2:49 PM, Phillip
>> Michalak<[hidden email]> wrote:
>>
>> It's cassandra-0.4-beta1.
>>
>> Thanks!
>> Phil
>>
>> On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:
>>
>> Is this 0.3 or 0.4/trunk?
>>
>> On Wed, Aug 19, 2009 at 2:36 PM, Phillip
>> Michalak<[hidden email]> wrote:
>>
>> I'm running three Cassandra nodes in virtual machines.
>> During a 'get' operation from Cassandra-remote directed at one of  
>> these
>> nodes, I'm receiving the following output
>>
>> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-
>> remote -h
>> 192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
>> "ColumnPath('strings', None, 'id')" 2
>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
>> DeprecationWarning: BaseException.message has been deprecated as of
>> Python
>> 2.6
>> self.message = message
>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
>> DeprecationWarning: BaseException.message has been deprecated as of
>> Python
>> 2.6
>> self.message = iprot.readString();
>> Traceback (most recent call last):
>> File "interface/gen-py/cassandra/Cassandra-remote", line 93, in  
>> <module>
>> pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>> File
>>
>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/
>> Cassandra.py",
>> line 182, in get
>> return self.recv_get()
>> File
>>
>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/
>> Cassandra.py",
>> line 201, in recv_get
>> raise x
>>
>> thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-
>> packages/thrift/Thrift.py:76:
>> DeprecationWarning: BaseException.message has been deprecated as of
>> Python
>> 2.6
>> if self.message:
>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
>> DeprecationWarning: BaseException.message has been deprecated as of
>> Python
>> 2.6
>> return self.message
>> : Internal error processing get
>>
>> The same 'get' operation from Cassandra-remote directed at another of
>> these
>> nodes, yields 'normal' output
>> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-
>> remote -h
>> 192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr'
>> "ColumnPath('strings', None, 'id')" 2
>> Traceback (most recent call last):
>> File "interface/gen-py/cassandra/Cassandra-remote", line 93, in  
>> <module>
>> pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>> File
>>
>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/
>> Cassandra.py",
>> line 182, in get
>> return self.recv_get()
>> File
>>
>> "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/
>> Cassandra.py",
>> line 210, in recv_get
>> raise result.nfe
>> ttypes.NotFoundException: NotFoundException()
>> Furthermore, querying the same column for (some) other keys is  
>> successful
>> when no matter which node it is directed at.
>> Looking at the log for the node that produced the error from the  
>> query
>> above:
>>
>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java
>> (line
>> 221) get
>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java  
>> (line
>> 420) strongread reading data for
>> SliceByNamesReadCommand(table='MockElementLibrary',
>> key='0401318uuuuruepwdcznr',
>> columnParent='QueryPath(columnFamilyName='strings',
>> superColumnName='null',
>> columnName='null')', columns=[id,]) from 38184@null
>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java  
>> (line
>> 427) strongread reading digest for
>> SliceByNamesReadCommand(table='MockElementLibrary',
>> key='0401318uuuuruepwdcznr',
>> columnParent='QueryPath(columnFamilyName='strings',
>> superColumnName='null',
>> columnName='null')', columns=[id,]) from 38185@192.168.133.129:7000
>> WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619
>> MessageSerializationTask.java (line 81) Exception was generated at :
>> 08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
>> java.lang.NullPointerException
>> at
>> org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
>> at
>>
>> org.apache.cassandra.net.TcpConnectionManager.getConnection
>> (TcpConnectionManager.java:64)
>> at
>>
>> org.apache.cassandra.net.MessagingService.getConnection
>> (MessagingService.java:306)
>> at
>>
>> org.apache.cassandra.net.MessageSerializationTask.run
>> (MessageSerializationTask.java:66)
>> at
>>
>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask
>> (ThreadPoolExecutor.java:886)
>> at
>>
>> java.util.concurrent.ThreadPoolExecutor$Worker.run
>> (ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:619)
>> DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622  
>> ResponseVerbHandler.java
>> (line 38) Processing response on a callback
>> from 65B1E352-A0A3-1A7F-138B-9BEA3E1D787F@192.168.133.129:7000
>> ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line
>> 608)
>> Internal error processing get
>> java.lang.RuntimeException: java.util.concurrent.TimeoutException:
>> Operation
>> timed out - received only 1 responses from 192.168.133.129:7000 .
>> at
>>
>> org.apache.cassandra.service.CassandraServer.readColumnFamily
>> (CassandraServer.java:100)
>> at
>>
>> org.apache.cassandra.service.CassandraServer.get
>> (CassandraServer.java:226)
>> at
>>
>> org.apache.cassandra.service.Cassandra$Processor$get.process
>> (Cassandra.java:602)
>> at
>>
>> org.apache.cassandra.service.Cassandra$Processor.process
>> (Cassandra.java:560)
>> at
>>
>> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run
>> (TThreadPoolServer.java:252)
>> at
>>
>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask
>> (ThreadPoolExecutor.java:886)
>> at
>>
>> java.util.concurrent.ThreadPoolExecutor$Worker.run
>> (ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:619)
>> Caused by: java.util.concurrent.TimeoutException: Operation timed  
>> out -
>> received only 1 responses from 192.168.133.129:7000 .
>> at
>>
>> org.apache.cassandra.service.QuorumResponseHandler.get
>> (QuorumResponseHandler.java:86)
>> at
>>
>> org.apache.cassandra.service.StorageProxy.strongRead
>> (StorageProxy.java:435)
>> at
>>
>> org.apache.cassandra.service.StorageProxy.readProtocol
>> (StorageProxy.java:330)
>> at
>>
>> org.apache.cassandra.service.CassandraServer.readColumnFamily
>> (CassandraServer.java:92)
>> ... 7 more
>>
>> It appears to me that there is a timeout during the  
>> QuorumResponseHandler
>> processing, stemming from a NullPointerException that occurs as  
>> part of
>> the
>> read process. I suspect that this NullPointerException has  
>> something to
>> do
>> with the second DEBUG [pool-1-thread-22] comment regarding  
>> strongread ...
>> from 38184@null.
>> Does anyone know why this might be happening?
>> Thanks for any insight,
>> Phil
>>
>>
>>
>>
>
>
>
> --
> Evan Weaver