keys and column names cannot be utf-8


mobiledreamers
Is there a timeline for when 185 will be committed? The UTF-8 error still exists.

In my experiments, I found that keys and column names still cannot be UTF-8.

This is a major restriction.
Please push this fix to trunk.
Thanks

On Sun, Jul 19, 2009 at 6:11 AM, Jonathan Ellis <[hidden email]> wrote:
That should be partially solved in trunk now that 139 is committed,
and more solved when we commit 185 soon.

On Sun, Jul 19, 2009 at 3:43 AM, <[hidden email]> wrote:
> Any utf-8 keyword causes cassandra to crash!
>

Re: keys and column names cannot be utf-8

Eric Evans-4
On Tue, 2009-07-21 at 09:18 -0700, [hidden email] wrote:
> Is there any timeline on when commit 185 will be done as the utf8
> error still exists

185 was committed yesterday.

https://issues.apache.org/jira/browse/CASSANDRA-185

--
Eric Evans
[hidden email]


Re: keys and column names cannot be utf-8

mobiledreamers
Not fixed

The following utf8 key names and column names still give an error.

cass: 2009-07-21 13:55:35,597 error 98. ìµì§
                                            ì¤ ìì§                       Ûïº] (1)icasso's, instruments de musique sur un guéridon] (1)Ûïº[irancel] (1)ïº
cass: 2009-07-21 13:55:55,093 error 377. friday night lights s03e01[âmegaupload..50 error 321. instruments de musique sur un guéridon[[

comâ
    cass: 2009-07-21 13:56:12,341 error 637. asuka izumi photos[u15 ç«¥æãçé] (1)
cass: 2009-07-21 13:56:39,380 error 1118. dragonball z games for pc[dragon balĺz pc games download] (1)
cass: 2009-07-21 13:56:48,976 error 1301. ï»ïº­ïºïºï» ﺳ[ï»ïº­ïºïºï» ﺳ ï»
                                                 导æç³æµ·è¯±å¥¸å¯¼è´å¥³çèªæ] ((1)2009-07-21 13:56:55,352 error 1430. æç³æµ·[大å
cass: 2009-07-21 13:56:59,287 error 1510. cinquième république[définition de  république?] (1)                                      å¯¼æç³æµ·] (1)
cass: 2009-07-21 13:59:38,783 error 1842. navaratri kolu[doll festival in navratt
ri golu] (1)
cass: 2009-07-21 13:59:39,069 error 1846. tn lottery winning numbers[www.tnlottery] (1)
cass: 2009-07-21 13:59:39,274 error 1850. www.buildabearville.com cheats[all the buildabear.com cheats and codes] (1)
cass: 2009-07-21 13:59:39,773 error 1860. shippuuden 78[naruto shippuuden 78 subbed torrent] (1)
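
The garbled names in the log above look like UTF-8 bytes that were displayed as Latin-1, which would account for sequences such as "ìµ". The following is a minimal sketch that reproduces that pattern. The cause is an inference from the log, not something confirmed in this thread, and the sample string is the first word of the x.search value that appears later in the thread.

```python
# Reproduce the mojibake pattern seen in the log.
# Assumed cause (not confirmed in the thread): UTF-8 bytes shown as Latin-1.
original = "\ucd5c\uc9c4\uc2e4"  # first word of the x.search value from the thread

utf8_bytes = original.encode("utf-8")    # the bytes a client would send
garbled = utf8_bytes.decode("latin-1")   # what a Latin-1 display would render

# Undoing the wrong decode recovers the text, so the bytes themselves are intact.
recovered = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))
print(recovered == original)
```

If the log is garbled only at display time, the stored data may be fine even though the log looks corrupt; that would need to be checked separately.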


--
Bidegg worlds best auction site
http://bidegg.com

Re: keys and column names cannot be utf-8

Jonathan Ellis-3
did you read the new section in the config xml explaining how to use a
UTF8 comparator?

also: thrift itself is just plain broken for unicode support in some
languages; see THRIFT-395

I think the short version is that when you have a java server, unicode
will work with java or C# clients but not with anything else

(so if you are using a python client for instance switching to jython
might be a workaround)


Re: keys and column names cannot be utf-8

Jonathan Ellis-3
On Tue, Jul 21, 2009 at 4:06 PM, Jonathan Ellis<[hidden email]> wrote:
> (so if you are using a python client for instance switching to jython
> might be a workaround)

that is, using the java thrift client, not the python ones.

Re: keys and column names cannot be utf-8

Sandeep Tata
In reply to this post by mobiledreamers
This is after you changed the conf file to use UTF8Type for the column family?


Re: keys and column names cannot be utf-8

mobiledreamers
In reply to this post by Jonathan Ellis-3
Hey Jonathan,
This is not in the wiki or any documentation. Searching for UTF8Type pointed me to adding CompareWith="UTF8Type" to the column family definition. My current definition is:
  <ColumnFamily ColumnType="Super" Name="Super1"/>

All my inserts and queries go to the super column family Super1.

So should I change this to

<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>

Does this work with the Python Thrift client?
If it does, that would be perfect.

But this doesn't explain why keys cannot be UTF-8.

Thanks


Re: keys and column names cannot be utf-8

Jonathan Ellis-3
On Tue, Jul 21, 2009 at 4:18 PM, <[hidden email]> wrote:
> Hey jonathan
> this is not in the wiki or any documentation.

this is trunk.  i wrote it a couple days ago.  feel free to step in
and update the wiki.

> does this work in python thrift

probably not, given the thrift utf8 bugs.  (but you could use
BytesType and at least you will get the right data back.)

> if it does - that would be perfect
> but this doesnt explain why keys cannot be utf8

because FB didn't write it and so far neither has anyone else.

-Jonathan

Re: keys and column names cannot be utf-8

Jonathan Ellis-3
On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<[hidden email]> wrote:
>> does this work in python thrift
>
> probably not, given the thrift utf8 bugs.

to correct myself: now that we are using binary data in the thrift api
it can't screw us over.  so yes, UTF8Type should be fine.
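
Since the Thrift API now takes binary data, one approach from a CPython client is to encode Unicode strings to UTF-8 bytes explicitly before passing them as keys or column names. The following is a minimal sketch of that idea; `to_utf8` is a hypothetical helper for illustration, not part of Thrift or Cassandra.

```python
def to_utf8(value):
    """Return UTF-8 bytes for text; pass existing bytes through unchanged."""
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


# Sample key taken from this thread: five Hangul syllables and one space.
key = to_utf8("\ucd5c\uc9c4\uc2e4 \uc774\ud63c")

# Each Hangul syllable takes three bytes in UTF-8, the space takes one.
print(len(key))
```

Encoding at the call site keeps the client from depending on how a particular Thrift binding handles Unicode string types.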

Re: keys and column names cannot be utf-8

mobiledreamers
In reply to this post by mobiledreamers
This is definitely a bug, not an improvement.

The Python Thrift client is unusable without UTF-8 or Unicode support, since much of the web is UTF-8 or Unicode.
Jonathan, CPython is the default for Django, Pylons, and the other frameworks, so
using Jython or Java is not an option.

If someone can tell me how hard it is to fix the Python Thrift client, that would tell me whether we can use Cassandra or not.


Re: keys and column names cannot be utf-8

mobiledreamers
In reply to this post by Jonathan Ellis-3
Thanks Jonathan,
trying this:
<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>



Re: keys and column names cannot be utf-8

Jonathan Ellis-3
you may also want to specify CompareSubcolumnsWith.


Re: keys and column names cannot be utf-8

mobiledreamers
In reply to this post by mobiledreamers
Why not use UTF8Type or BytesType as the default?

The CompareWith attribute tells Cassandra how to sort the columns
+ for slicing operations. For backwards compatibility, the default
+ is to use AsciiType, which is probably NOT what you want.
+ Other options are UTF8Type, UUIDType, and LongType.
+ You can also specify the fully-qualified class name to a class
+ of your choice implementing org.apache.cassandra.db.marshal.IType.
+
+ if FlushPeriodInMinutes is configured and positive, it will be
flushed to disk with that period whether it is dirty or not.
This is intended for lightly-used columnfamilies so that they
do not prevent commitlog segments from being purged.
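
To illustrate why an ASCII default is a poor fit for names taken from the web, here is a small sketch that checks whether a name is pure ASCII. It only shows the character-set limitation; it does not reproduce how Cassandra's AsciiType actually validates or compares names, and `is_ascii` is a hypothetical helper. The sample names are taken from the log earlier in this thread.

```python
def is_ascii(name):
    """Return True if every character in name is in the 7-bit ASCII range."""
    try:
        name.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


names = ["tn lottery winning numbers", "cinquième république"]
for name in names:
    print(name, is_ascii(name))
```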

Re: keys and column names cannot be utf-8

mobiledreamers
In reply to this post by Jonathan Ellis-3
Would this be the right conf/storage-conf.xml?

<ColumnFamily ColumnSort="Name" Name="Standard1"  CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
<ColumnFamily ColumnSort="Name"  CompareWith="UTF8Type" Name="Standard2"/>
<ColumnFamily ColumnSort="Time"  CompareWith="UTF8Type" Name="StandardByTime1"/>
<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" Name="Super1"/>

Jonathan, can you clarify whether this will guarantee proper Python Thrift UTF-8 behavior? Thanks
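
One way to double-check a config like the one above is to list which column families leave a comparator attribute unset. The following is a rough sketch using Python's standard XML parser. The `<Fragment>` wrapper and the `unset_comparators` function are hypothetical, added only so the snippet parses and runs on its own; a real storage-conf.xml has more surrounding structure.

```python
import xml.etree.ElementTree as ET

# Column family lines copied from the message above. <Fragment> is a
# hypothetical wrapper so the lines parse as one XML document.
FRAGMENT = """
<Fragment>
  <ColumnFamily ColumnSort="Name" Name="Standard1" CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
  <ColumnFamily ColumnSort="Name" CompareWith="UTF8Type" Name="Standard2"/>
  <ColumnFamily ColumnSort="Time" CompareWith="UTF8Type" Name="StandardByTime1"/>
  <ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" Name="Super1"/>
</Fragment>
"""


def unset_comparators(xml_text):
    """Return (column family name, attribute) pairs for unset comparator attributes."""
    unset = []
    for cf in ET.fromstring(xml_text.strip()).iter("ColumnFamily"):
        wanted = ["CompareWith"]
        # Super column families also have a comparator for their subcolumns.
        if cf.get("ColumnType") == "Super":
            wanted.append("CompareSubcolumnsWith")
        for attr in wanted:
            if cf.get(attr) is None:
                unset.append((cf.get("Name"), attr))
    return unset


print(unset_comparators(FRAGMENT))
```

Running the same check against each node's config file is a cheap way to confirm that all nodes agree before restarting them.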


Re: keys and column names cannot be utf-8

Jonathan Ellis-3
guarantee?  in a pre-alpha trunk?  no, that is too strong a word.

but that's what's *supposed* to work, so I will fix it if it doesn't. :)


Re: keys and column names cannot be utf-8

mobiledreamers
In reply to this post by mobiledreamers
Something strange happened. On the 4 server nodes that run Cassandra, the conf file had:
ConfA
<ColumnFamily ColumnSort="Name" Name="Standard1"  FlushPeriodInMinutes="60"/>
<ColumnFamily ColumnSort="Name"  Name="Standard2"/>
<ColumnFamily ColumnSort="Time"  Name="StandardByTime1"/>
<ColumnFamily ColumnType="Super"   Name="Super1"/>

I changed it to the following, and when I run nodeprobe after restarting Cassandra, the other 3 nodes show as down:
ConfB
<ColumnFamily ColumnSort="Name" Name="Standard1"  CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
<ColumnFamily ColumnSort="Name"  CompareWith="UTF8Type" Name="Standard2"/>
<ColumnFamily ColumnSort="Time"  CompareWith="UTF8Type" Name="StandardByTime1"/>
<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" Name="Super1"/>

If I revert ConfB back to ConfA, all 4 nodes show up in nodeprobe on all 4 nodes.

I'm unsure how to debug this.


Re: keys and column names cannot be utf-8

Jonathan Ellis-3
did you check to make sure all the nodes were running and had no
exceptions in their logs?


Re: keys and column names cannot be utf-8

mobiledreamers
It still gives an error. x.search and x.related are Unicode strings, and when they are used as the key or column name, the following errors come up:

In [5]: x.search
Out[5]: u'\ucd5c\uc9c4\uc2e4 \uc774\ud63c'
In [6]: x.related
Out[6]: u'\ucd5c\uc9c4\uc2e4 \uc774\ud63c'
In [7]: client.insert('Table1', x.search, ColumnPath('Super1', 'Related', x.related), pickle.dumps(dict(count=1)), time.time(), 0)
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (1149, 0))

---------------------------------------------------------------------------
TApplicationException                     Traceback (most recent call last)

/home/mark/<ipython console> in <module>()

/home/mark/work/common/cassandra/Cassandra.pyc in insert(self, table, key, column_path, value, timestamp, block_for)
    359     """
    360     self.send_insert(table, key, column_path, value, timestamp, block_for)
--> 361     self.recv_insert()
    362
    363   def send_insert(self, table, key, column_path, value, timestamp, block_for):

/home/mark/work/common/cassandra/Cassandra.pyc in recv_insert(self)
    380       x.read(self._iprot)
    381       self._iprot.readMessageEnd()
--> 382       raise x
    383     result = insert_result()
    384     result.read(self._iprot)

TApplicationException: Internal error processing insert




INFO - Cassandra starting up...
DEBUG - insert
ERROR - Internal error processing insert
java.lang.NullPointerException
        at org.apache.cassandra.service.ThriftValidation.validateColumnPath(ThriftValidation.java:61)
        at org.apache.cassandra.service.CassandraServer.insert(CassandraServer.java:262)
        at org.apache.cassandra.service.Cassandra$Processor$insert.process(Cassandra.java:927)
        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:796)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)


 