Bulk loader with Cassandra 1.2.5

Davide Anastasia
Hi,
I am currently using Cassandra 1.2.5 on RHEL6 with the Oracle JVM.
I want to build a bulk loader for stock prices that I have available in CSV format. To start with something simpler, however, I am working from the example available in the manual.

I cannot get the bulk loader to work as expected: once I create the SSTables and import them with sstableloader, I can no longer run "select * from users" in the CQL console. The query ends in an rpc timeout in the console, and a Java stack trace appears in the log file.

java.lang.RuntimeException: java.lang.IllegalArgumentException
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.IllegalArgumentException
at java.nio.Buffer.limit(Buffer.java:247)
at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
at java.util.TreeMap.getEntryUsingComparator(TreeMap.java:351)
at java.util.TreeMap.getEntry(TreeMap.java:322)
at java.util.TreeMap.containsKey(TreeMap.java:209)
at java.util.TreeSet.contains(TreeSet.java:217)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readSimpleColumns(SSTableNamesIterator.java:188)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:156)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:83)
at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:86)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:75)
at org.apache.cassandra.io.sstable.SSTableScanner$FilteredKeyScanningIterator$1.create(SSTableScanner.java:248)
at org.apache.cassandra.db.columniterator.LazyColumnIterator.getSubIterator(LazyColumnIterator.java:75)
at org.apache.cassandra.db.columniterator.LazyColumnIterator.getColumnFamily(LazyColumnIterator.java:87)
at org.apache.cassandra.db.RowIteratorFactory$2.reduce(RowIteratorFactory.java:95)
at org.apache.cassandra.db.RowIteratorFactory$2.reduce(RowIteratorFactory.java:79)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:111)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at org.apache.cassandra.db.ColumnFamilyStore$3.computeNext(ColumnFamilyStore.java:1399)
at org.apache.cassandra.db.ColumnFamilyStore$3.computeNext(ColumnFamilyStore.java:1395)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1466)
at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1443)
at org.apache.cassandra.service.RangeSliceVerbHandler.executeLocally(RangeSliceVerbHandler.java:46)
at org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1076)
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
... 3 more

The code I am using is currently stored here: https://gist.github.com/davideanastasia/5720903

What am I doing wrong?

Thanks,
Davide

Re: Bulk loader with Cassandra 1.2.5

Keith Wright
Could it be because you are writing age as a long but have it defined as an integer in the table definition?


Re: Bulk loader with Cassandra 1.2.5

Davide Anastasia
Hi,
I've just tried changing it to an integer, and I get exactly the same error.

Thanks,
Davide


Re: Bulk loader with Cassandra 1.2.5

Keith Wright
Looking into it further, I believe your issue is that you did not define the table with compact storage. Without it, CQL3 treats every column name as a composite. Your stack trace hints at this: AbstractCompositeType is where the error originates. Try changing your table definition as follows:

create table users (
    id uuid primary key,
    firstname varchar,
    lastname varchar,
    password varchar,
    age int,
    email varchar)
WITH COMPACT STORAGE
AND compaction = {'class' : 'LeveledCompactionStrategy' };
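
A minimal sketch of why the stack trace ends in Buffer.limit, assuming the composite layout in which each component is stored as a 2-byte length, the value bytes, and a 1-byte end-of-component marker. The names CompositeMisreadDemo, readFirstComponent, and readsAsComposite are hypothetical and only mimic the kind of read a composite comparator performs; this is not Cassandra's code. When a plain column name such as "firstname" is read as a composite, its first two bytes are taken as a length that exceeds the buffer, and setting the limit fails:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class CompositeMisreadDemo {

    // Hypothetical helper: reads the first component of a composite name,
    // assuming the layout <2-byte length><value bytes><1-byte end-of-component>.
    public static ByteBuffer readFirstComponent(ByteBuffer name) {
        ByteBuffer bb = name.duplicate();
        int length = bb.getShort() & 0xFFFF;   // unsigned 2-byte length prefix
        ByteBuffer component = bb.duplicate();
        // Buffer.limit throws IllegalArgumentException when the new limit
        // is larger than the buffer's capacity.
        component.limit(component.position() + length);
        return component;
    }

    // Returns true if the first component can be read without an error.
    public static boolean readsAsComposite(ByteBuffer name) {
        try {
            readFirstComponent(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A plain (non-composite) column name: just the UTF-8 bytes.
        ByteBuffer plain = ByteBuffer.wrap("firstname".getBytes(StandardCharsets.UTF_8));
        // 'f' = 0x66 and 'i' = 0x69 are misread as the length 0x6669 = 26217,
        // far beyond the 9 bytes actually present.
        System.out.println("plain name reads as composite: " + readsAsComposite(plain));
        // prints "plain name reads as composite: false"
    }
}
```

This would be consistent with the fix above: with COMPACT STORAGE the table expects plain column names, which is presumably what the loader was writing.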

Re: Bulk loader with Cassandra 1.2.5

Davide Anastasia

Hi Keith,
You are my hero :-)
It does work now.

Thanks a lot,
Davide

Re: Bulk loader with Cassandra 1.2.5

Davide Anastasia
In reply to this post by Keith Wright
Hi,
How should the bulk loader be modified to support composite columns?

Thanks,
Davide
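
One possible direction, sketched under assumptions rather than taken from the thread: for a table created without COMPACT STORAGE, the loader would need to write each column name as a composite instead of a plain string. The sketch below hand-encodes a composite name, assuming each component is laid out as a 2-byte length, the value bytes, and a 1-byte end-of-component marker. CompositeNameSketch, encode, and decodeFirst are hypothetical names for illustration only. A real loader would more likely rely on Cassandra's own composite type classes, and may need further changes to match the CQL3 layout that this sketch does not cover.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class CompositeNameSketch {

    // Encodes the given components as a single composite name, assuming the
    // per-component layout <2-byte length><value bytes><1-byte end-of-component>.
    public static ByteBuffer encode(String... components) {
        byte[][] raw = new byte[components.length][];
        int size = 0;
        for (int i = 0; i < components.length; i++) {
            raw[i] = components[i].getBytes(StandardCharsets.UTF_8);
            size += 2 + raw[i].length + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (byte[] value : raw) {
            out.putShort((short) value.length); // 2-byte length prefix
            out.put(value);                     // component bytes
            out.put((byte) 0);                  // end-of-component marker
        }
        out.flip();
        return out;
    }

    // Decodes the first component back into a string, for checking the encoding.
    public static String decodeFirst(ByteBuffer name) {
        ByteBuffer bb = name.duplicate();
        int length = bb.getShort() & 0xFFFF;
        byte[] value = new byte[length];
        bb.get(value);
        return new String(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer name = encode("firstname");
        System.out.println(name.remaining() + " bytes, first component: " + decodeFirst(name));
        // prints "12 bytes, first component: firstname"
    }
}
```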


Re: Bulk loader with Cassandra 1.2.5

Arthur Zubarev
In reply to this post by Keith Wright
I would like to know whether the compaction directive is the key, because I see the same symptoms on Ubuntu Server 12.04 64-bit with C* 1.2.4 and a CF of roughly half a million or more records, 6,000 characters each.

In cqlsh I can read back at most about 6,000 records: SELECT COUNT(*) FROM A_CF LIMIT 6000; works, but the same query with LIMIT 7000 results in an "rpc timeout" (I have raised the read timeout in cassandra.yaml to three times the default).


-- 

Regards,

Arthur