when using nodeprobe: java.lang.OutOfMemoryError: Java heap space


Simon Smith-3
I'm getting a traceback when using nodeprobe against Cassandra.

Immediately below is the traceback that appears on the screen running
cassandra -f when I run a nodeprobe command (e.g. ./nodeprobe -host
myhostname.localdomain -port 9160 info). My config and the traceback
from the nodeprobe screen follow below that.

Basic system info: it is an Amazon FC8 instance with just under 2GB of
RAM, and the code is Cassandra trunk from August 27. The stock
cassandra.in.sh has -Xms128M and -Xmx1G. When I changed that to
-Xmx1800M, the nodeprobe command gave the same traceback, but at least
it no longer crashed Cassandra, and afterwards I could still run
multiget via Thrift. I only have about 80 items in the users keyspace,
and inserting and running multiget works fine; only nodeprobe causes
problems (same symptom if I do "nodeprobe ring"). I have previously
worked successfully with Cassandra using the default JVM options in
cassandra.in.sh on CentOS 5, but that was a while ago with older trunk
code.

Any hints as to what is going on?  Do I need to be on a machine with
more memory and crank the JVM -Xmx up?  And just to confirm, are there
any non-recommended Linux systems, and are there any recommended ones?

Thanks,

Simon


java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid21095.hprof ...
Heap dump file created [3821447 bytes in 0.133 secs]
ERROR - Fatal exception in thread Thread[pool-1-thread-2,5,main]
java.lang.OutOfMemoryError: Java heap space
        at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:296)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:203)
        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:594)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:675)


++++++++++++++++++++++++++++++++++++++++++++++++++

MY CONFIG (storage-conf.xml) (basically unchanged except for the
Keyspaces stanza)

<Storage>
  <ClusterName>Test Cluster</ClusterName>
  <Keyspaces>
    <Keyspace Name="users">
      <KeysCachedFraction>0.01</KeysCachedFraction>
      <ColumnFamily CompareWith="UTF8Type" FlushPeriodInMinutes="60" Name="pwhash"/>
    </Keyspace>
  </Keyspaces>
  <Partitioner>org.apache.cassandra.dht.RandomPartitioner</Partitioner>
  <InitialToken></InitialToken>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
  <ReplicationFactor>1</ReplicationFactor>
  <CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
  <DataFileDirectories>
    <DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
  </DataFileDirectories>
  <CalloutLocation>/var/lib/cassandra/callouts</CalloutLocation>
  <BootstrapFileDirectory>/var/lib/cassandra/bootstrap</BootstrapFileDirectory>
  <StagingFileDirectory>/var/lib/cassandra/staging</StagingFileDirectory>
  <Seeds>
    <Seed>127.0.0.1</Seed>
  </Seeds>
  <RpcTimeoutInMillis>5000</RpcTimeoutInMillis>
  <CommitLogRotationThresholdInMB>128</CommitLogRotationThresholdInMB>
  <ListenAddress></ListenAddress>
  <StoragePort>7000</StoragePort>
  <ControlPort>7001</ControlPort>
  <ThriftAddress>0.0.0.0</ThriftAddress>
  <ThriftPort>9160</ThriftPort>
  <ThriftFramedTransport>false</ThriftFramedTransport>
  <SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>
  <FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
  <FlushIndexBufferSizeInMB>8</FlushIndexBufferSizeInMB>
  <ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
  <MemtableSizeInMB>64</MemtableSizeInMB>
  <MemtableObjectCountInMillions>0.1</MemtableObjectCountInMillions>
  <ConcurrentReads>8</ConcurrentReads>
  <ConcurrentWrites>32</ConcurrentWrites>
  <CommitLogSync>periodic</CommitLogSync>
  <CommitLogSyncPeriodInMS>1000</CommitLogSyncPeriodInMS>
  <GCGraceSeconds>864000</GCGraceSeconds>
</Storage>


++++++++++++++++++++++++++++++++++++++++++++++++++

OUTPUT ON THE SCREEN RUNNING nodeprobe:

./nodeprobe -host `hostname -f` -port 9160 info
Error connecting to remote JMX agent!
java.io.IOException: Failed to retrieve RMIServer stub:
javax.naming.CommunicationException [Root exception is
java.rmi.ConnectIOException: error during JRMP connection
establishment; nested exception is:
        java.io.EOFException]
        at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:342)
        at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:267)
        at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:151)
        at org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:113)
        at org.apache.cassandra.tools.NodeProbe.main(NodeProbe.java:533)
Caused by: javax.naming.CommunicationException [Root exception is
java.rmi.ConnectIOException: error during JRMP connection
establishment; nested exception is:
        java.io.EOFException]
        at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:119)
        at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:203)
        at javax.naming.InitialContext.lookup(InitialContext.java:410)
        at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1902)
        at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1871)
        at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:276)
        ... 4 more
Caused by: java.rmi.ConnectIOException: error during JRMP connection
establishment; nested exception is:
        java.io.EOFException
        at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:304)
        at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
        at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:340)
        at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
        at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:115)
        ... 9 more
Caused by: java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:268)
        at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:246)
        ... 13 more

Re: when using nodeprobe: java.lang.OutOfMemoryError: Java heap space

Jonathan Ellis-3
On Fri, Aug 28, 2009 at 11:25 AM, Simon Smith<[hidden email]> wrote:
> I'm getting a traceback when using nodeprobe against Cassandra.

That looks like a Thrift bug. :(

Can you try an older version of Cassandra, e.g. trunk from a week ago,
or the beta1 release, to see if the Thrift library upgrade from
yesterday is responsible?

> Any hints as to what is going on?  Do I need to be on a machine with
> more memory and crank the JVM -Xmx up?  And just to confirm, are there
> any non-recommended Linux systems, and are there any recommended ones?

EC2 in general has terrible I/O performance.  But I don't know of any
specific Linux distribution problems.  (Although isn't FC8 pretty
old?)

-Jonathan

Re: when using nodeprobe: java.lang.OutOfMemoryError: Java heap space

Simon Smith-3
I went and grabbed apache-cassandra-incubating-2009-08-20_13-02-45-src
and I get the same symptoms when using that version.

Thanks again - Simon


Re: when using nodeprobe: java.lang.OutOfMemoryError: Java heap space

Jonathan Ellis-3
Oh, I see the problem: nodeprobe uses the JMX port (specified in
cassandra.in.sh -- default 8080), not the Thrift port.
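
To make the port distinction concrete, here is a minimal Java sketch of the kind of JMX service URL a tool like nodeprobe would connect to. The URL template is an assumption (it is the standard JNDI/RMI form and matches the RMIConnector.findRMIServerJNDI frames in the traceback, but it is not copied from NodeProbe.java). The class, method, and constant names are hypothetical. The 8080 default comes from this thread and 9160 from the storage-conf.xml above.

```java
import java.net.MalformedURLException;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper, for illustration only; not part of Cassandra.
final class JmxUrlSketch {
    // JMX port: set in cassandra.in.sh, default 8080 according to this thread.
    static final int ASSUMED_JMX_PORT = 8080;
    // Thrift port: <ThriftPort> in storage-conf.xml. Speaks Thrift, not RMI.
    static final int THRIFT_PORT = 9160;

    // Builds a standard JNDI/RMI JMX service URL for the given host and port.
    static JMXServiceURL jmxUrlFor(String host, int port) {
        try {
            return new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("bad host or port: " + host + ":" + port, e);
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        // The RMI client must be pointed at the JMX port, not THRIFT_PORT.
        System.out.println(jmxUrlFor(host, ASSUMED_JMX_PORT));
    }
}
```

Under the same assumption about the default, the failing command from the original post would become ./nodeprobe -host `hostname -f` -port 8080 info.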

Can you file a bug with Thrift/java not to OOM when someone connects
to the socket and sends nonsense? :)
https://issues.apache.org/jira/browse/THRIFT

-Jonathan
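
A likely explanation for the OutOfMemoryError, sketched in Java under stated assumptions: an RMI client opens its connection with the four ASCII bytes "JRMI", and a length-prefixed binary reader that takes those bytes as a big-endian 32-bit string length would try to allocate a buffer of more than a gigabyte. That is consistent with the readStringBody frame in the server traceback, but it is an inference, not something confirmed in this thread. The class, the helper names, and the 1024-byte cap are hypothetical, and the guard is a simplification rather than Thrift's actual code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only; not Thrift code.
final class LengthPrefixDemo {
    // Illustrative upper bound for a message-name length (an assumption).
    static final int MAX_MESSAGE_NAME_BYTES = 1024;

    // Interprets the first four bytes as a big-endian signed 32-bit length,
    // the way a length-prefixed binary protocol reader would.
    static int readLengthPrefix(byte[] firstBytes) {
        return ByteBuffer.wrap(firstBytes, 0, 4).getInt();
    }

    // A reader can refuse implausible lengths before allocating a buffer.
    static boolean isImplausibleLength(int length) {
        return length < 0 || length > MAX_MESSAGE_NAME_BYTES;
    }

    public static void main(String[] args) {
        // First four bytes an RMI (JRMP) client sends: 0x4A 0x52 0x4D 0x49.
        byte[] jrmpMagic = "JRMI".getBytes(StandardCharsets.US_ASCII);
        int length = readLengthPrefix(jrmpMagic);
        // Prints "Decoded length: 1246907721 bytes" (about 1.16 GiB).
        System.out.println("Decoded length: " + length + " bytes");
        // Prints "Reject before allocating: true".
        System.out.println("Reject before allocating: " + isImplausibleLength(length));
    }
}
```

A single 1,246,907,721-byte allocation cannot fit in a 1G heap, which would also explain why the heap dump in the traceback is only a few megabytes: the heap was nearly empty when the one oversized request failed.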


Re: when using nodeprobe: java.lang.OutOfMemoryError: Java heap space

Simon Smith-3
Damn, how embarrassing!  User error.    Thank you so much for the help.
