Bulk loader with Cassandra 1.2.5

Davide Anastasia
Hi,
I am currently using Cassandra 1.2.5 on RHEL6 with the Oracle JVM.
I want to build a bulk loader for stock prices that I have available in CSV format. However, I have started exploring Cassandra with something easier, focusing on the example available in the manual.

I cannot get the bulk loader to work as expected: once I create the SSTables and import them with sstableloader, I can no longer run "select * from users" in the CQL console. The query fails with a Java stack trace in the log file and an rpc timeout in the CQL console.

java.lang.RuntimeException: java.lang.IllegalArgumentException
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.IllegalArgumentException
at java.nio.Buffer.limit(Buffer.java:247)
at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
at java.util.TreeMap.getEntryUsingComparator(TreeMap.java:351)
at java.util.TreeMap.getEntry(TreeMap.java:322)
at java.util.TreeMap.containsKey(TreeMap.java:209)
at java.util.TreeSet.contains(TreeSet.java:217)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readSimpleColumns(SSTableNamesIterator.java:188)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:156)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:83)
at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:86)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:75)
at org.apache.cassandra.io.sstable.SSTableScanner$FilteredKeyScanningIterator$1.create(SSTableScanner.java:248)
at org.apache.cassandra.db.columniterator.LazyColumnIterator.getSubIterator(LazyColumnIterator.java:75)
at org.apache.cassandra.db.columniterator.LazyColumnIterator.getColumnFamily(LazyColumnIterator.java:87)
at org.apache.cassandra.db.RowIteratorFactory$2.reduce(RowIteratorFactory.java:95)
at org.apache.cassandra.db.RowIteratorFactory$2.reduce(RowIteratorFactory.java:79)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:111)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at org.apache.cassandra.db.ColumnFamilyStore$3.computeNext(ColumnFamilyStore.java:1399)
at org.apache.cassandra.db.ColumnFamilyStore$3.computeNext(ColumnFamilyStore.java:1395)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1466)
at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1443)
at org.apache.cassandra.service.RangeSliceVerbHandler.executeLocally(RangeSliceVerbHandler.java:46)
at org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1076)
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
... 3 more

The code I am using is currently stored here: https://gist.github.com/davideanastasia/5720903

What am I doing wrong?

Thanks,
Davide

Re: Bulk loader with Cassandra 1.2.5

Keith Wright
Could it be because you are writing age as a long but have it defined as an integer in the table definition?

Re: Bulk loader with Cassandra 1.2.5

Davide Anastasia
Hi,
I've just tried to change it to integer and I get exactly the same error.

Thanks,
Davide


On 6 June 2013 23:53, Keith Wright <[hidden email]> wrote:
Could it be because you are writing age as a long but have it defined as an integer in the table definition?

Re: Bulk loader with Cassandra 1.2.5

Keith Wright
Looking into it further, I believe the issue is that you did not define the table with compact storage. Without it, CQL3 treats every column name as a composite, which is hinted at in your stack trace, where AbstractCompositeType is the cause of the error. Try changing your table definition as follows:

create table users (
    id uuid primary key,
    firstname varchar,
    lastname varchar,
    password varchar,
    age int,
    email varchar
) WITH COMPACT STORAGE
  and compaction = {'class' : 'LeveledCompactionStrategy'};
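
To illustrate why the trace ends in Buffer.limit, here is a minimal sketch of what probably happens when a plain column name is read as if it were a composite. It assumes each composite component starts with a 2-byte length; the class and method names are hypothetical and only mimic the getWithShortLength/getBytes steps seen in the stack trace, they are not Cassandra's own code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CompositeDecodeSketch {
    // Hypothetical helper: read a 2-byte unsigned length, then slice that many bytes.
    static ByteBuffer readWithShortLength(ByteBuffer bb) {
        int length = bb.getShort() & 0xFFFF;
        ByteBuffer copy = bb.duplicate();
        // Buffer.limit throws IllegalArgumentException when the new limit exceeds the capacity.
        copy.limit(copy.position() + length);
        bb.position(bb.position() + length);
        return copy;
    }

    public static void main(String[] args) {
        // A plain (non-composite) column name, as a writer without a composite comparator would store it.
        ByteBuffer plainName = ByteBuffer.wrap("firstname".getBytes(StandardCharsets.UTF_8));
        // The first two bytes, 'f' and 'i', are misread as the length 0x6669 = 26217.
        System.out.println("misread length: " + (plainName.getShort(0) & 0xFFFF));
        try {
            readWithShortLength(plainName);
            System.out.println("decoded without error");
        } catch (IllegalArgumentException e) {
            System.out.println("IllegalArgumentException, as in the stack trace");
        }
    }
}
```

With compact storage the column names are not composites, so this length-prefixed read is not applied to them.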

From: Davide Anastasia <[hidden email]>
Reply-To: "[hidden email]" <[hidden email]>
Date: Friday, June 7, 2013 2:11 AM
To: "[hidden email]" <[hidden email]>
Subject: Re: Bulk loader with Cassandra 1.2.5

Re: Bulk loader with Cassandra 1.2.5

Davide Anastasia

Hi Keith,
You are my hero :-)
It does work now.

Thanks a lot,
Davide

On 7 Jun 2013 10:57, "Keith Wright" <[hidden email]> wrote:
Looking into it further, I believe your issue is that you did not define the table with compact storage.  Without that, CQL3 will treat every column as a composite (as is hinted in your stack trace where you see AbstractCompositeType is the cause of the error).  Try changing your table definition as follows:

Re: Bulk loader with Cassandra 1.2.5

Davide Anastasia
In reply to this post by Keith Wright
Hi,
How should the bulk loader be modified to support composite columns?

Thanks,
Davide
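
The thread does not answer this question. As a rough sketch only, the column names written by the loader would presumably have to use the composite layout instead of plain names. The code below assumes that layout is, per component, a 2-byte length, the bytes, and a single end-of-component byte of 0. The class and method names are hypothetical, and how to pass a composite comparator to the SSTable writer is not covered here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CompositeNameSketch {
    // Hypothetical helper: packs each component as [2-byte length][bytes][end-of-component = 0].
    static ByteBuffer compositeName(String... components) {
        byte[][] encoded = new byte[components.length][];
        int size = 0;
        for (int i = 0; i < components.length; i++) {
            encoded[i] = components[i].getBytes(StandardCharsets.UTF_8);
            size += 2 + encoded[i].length + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (byte[] component : encoded) {
            out.putShort((short) component.length);
            out.put(component);
            out.put((byte) 0);
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        // Single-component name for the "firstname" column: 2 + 9 + 1 bytes.
        ByteBuffer name = compositeName("firstname");
        System.out.println(name.remaining()); // prints 12
    }
}
```

Cassandra ships its own composite type classes for this; the hand-rolled encoder is only meant to show the byte layout.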


Re: Bulk loader with Cassandra 1.2.5

Arthur Zubarev
In reply to this post by Keith Wright
I would like to know whether the compaction directive is the key, because I see the same symptoms on Ubuntu Server 12.04 64-bit with C* 1.2.4 and a CF of more than roughly half a million records of 6,000 characters each.

I can read back at most 6,000 records in cqlsh: SELECT COUNT(*) FROM A_CF LIMIT 6000; works, but the same query with LIMIT 7000 results in an "rpc timeout" (I have raised the read timeout in cassandra.yaml to three times the default).


-- 

Regards,

Arthur

Re: Bulk loader with Cassandra 1.2.5

chikkubhai
This post has NOT been accepted by the mailing list yet.
In reply to this post by Davide Anastasia
Any update on this yet? Are non-compact tables supported?