performance issue

performance issue

Kirill A. Korinskiy-3
Hello.

I wrote a small Erlang benchmark comparing Cassandra with a simple
filesystem storage, and here are the results:

fs_storage:
10: 1732/228633/2873/1128 992/233261/2870/1026
25: 4531/35290/6084/632 1786/35292/5969/312
50: 4707/311825/13694/4644 1366/382913/14470/5823

cassandra:
10: 69014/424051/145871/28066 9553/427823/141925/29190
25: 165500/536108/339736/77173 14846/710370/341819/78694
50: 325790/1050753/667931/157962 9865/1033461/676317/160606

Legend:
parallel requests: get min/max/avg/mad, then set min/max/avg/mad

So Cassandra is ~50 times slower.

OK, I changed these Cassandra settings:

 ColumnFamily from UTF8Type to BytesType
 MemtableObjectCountInMillions from 0.1 to 1
 ConcurrentReads from 8 to 64
 ConcurrentWrites from 32 to 64

And now I get:

fs_storage:
10: 2164/245375/2515/474 651/247790/2746/800
25: 5521/19959/5838/196 1836/20138/5943/186
50: 6369/32837/11476/944 1396/32818/11449/876

cassandra:
10: 39931/279903/48463/4812 21378/278782/51477/4526
25: 101911/407343/119135/9158 8794/410423/123768/9076
50: 209202/429959/238179/12180 7803/442580/244402/13443

Cassandra is ~20 times slower.
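
The ~50x and ~20x figures can be reproduced from the average latencies. The sketch below is illustrative only: it assumes the first group on each result line is "get" and the second is "set", with avg as the third field of each group, which is one reading of the legend. The names `RUNS` and `slowdown` are made up for this example.

```python
# Average latencies (third field of each min/max/avg/mad group) copied from
# the results above, keyed by number of parallel requests: (get_avg, set_avg).
# ASSUMPTION: the first group on each line is "get", the second is "set".
RUNS = {
    "default settings": {
        "fs_storage": {10: (2873, 2870), 25: (6084, 5969), 50: (13694, 14470)},
        "cassandra": {10: (145871, 141925), 25: (339736, 341819), 50: (667931, 676317)},
    },
    "tuned settings": {
        "fs_storage": {10: (2515, 2746), 25: (5838, 5943), 50: (11476, 11449)},
        "cassandra": {10: (48463, 51477), 25: (119135, 123768), 50: (238179, 244402)},
    },
}


def slowdown(run):
    """Return {parallel: (get_ratio, set_ratio)}, cassandra avg over fs_storage avg."""
    fs, cas = run["fs_storage"], run["cassandra"]
    return {n: (cas[n][0] / fs[n][0], cas[n][1] / fs[n][1]) for n in fs}


if __name__ == "__main__":
    for label, run in RUNS.items():
        for n, (get_ratio, set_ratio) in slowdown(run).items():
            print(f"{label:16s} {n:2d} parallel: get {get_ratio:5.1f}x, set {set_ratio:5.1f}x")
```

Comparing averages is only one choice; comparing the min or max columns gives different ratios.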

Is this normal? Am I doing something wrong? Do you have any ideas?

--
wbr, Kirill

Re: performance issue

Jonathan Ellis-3
It's impossible to say given what you have told us.

Step zero in benchmarking cassandra is turning the log level to INFO.
Step one is testing on a machine where you can put the commitlog
directory on its own disk.

It's true that frequently cassandra will be slower than custom code
writing to local disk, but that's not the interesting part of what
cassandra does. :)

-Jonathan


Re: performance issue

Michael Greene
In reply to this post by Kirill A. Korinskiy-3
Thanks for the results.  Perhaps you could shed further light:
Is this a single node system?
Is the log level changed from DEBUG to INFO?
Are the commit log and data directories on the same drive?
Are the sets/gets being processed interleaved in parallel, or one then the other?
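
One quick way to answer the "same drive" question is to compare the device IDs the operating system reports for the two directories. This is a minimal sketch, not part of Cassandra: `same_device` is a made-up helper, and the temporary directories stand in for the real commit log and data directories.

```python
import os
import tempfile


def same_device(path_a, path_b):
    """True if both paths live on the same filesystem device (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical stand-ins for the commit log and data directories;
    # substitute the directories configured for your own node.
    with tempfile.TemporaryDirectory() as commitlog_dir, \
            tempfile.TemporaryDirectory() as data_dir:
        print(same_device(commitlog_dir, data_dir))
```

A `True` result means the directories share a filesystem. A `False` result does not prove separate physical disks, because two partitions of one disk also report different device IDs.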

Note that writes are ~6x and ~5x slower with those values, and gets are ~50x and ~20x slower.

Michael


Re: performance issue

Kirill A. Korinskiy-3
In reply to this post by Jonathan Ellis-3
At Fri, 25 Sep 2009 10:32:55 -0500,
Jonathan Ellis <[hidden email]> wrote:
>
> Step zero in benchmarking cassandra is turning the log level to
> INFO.

Sure, I have switched off all log messages.

> Step one is testing on a machine where you can put the commitlog
> directory on its own disk.

You mean a dedicated disk for the commitlog?

> It's true that frequently cassandra will be slower than custom code
> writing to local disk, but that's not the interesting part of what
> cassandra does. :)
>

Sure, I understand that. But ~20 times slower... Is that really expected?


--
wbr, Kirill

Re: performance issue

Jonathan Ellis-3
I still have no idea what your test looks like or what your numbers
mean.  So how can I guess if it's expected?

As a very rough rule of thumb you can expect about 1000 ops per second
on a low-end-ish system.  So if you are getting that much out of
cassandra at least you are in the right ball park.
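
To compare latency figures against an ops-per-second rule of thumb, Little's law (throughput = requests in flight / average latency) gives a rough conversion. The sketch below rests on two assumptions the thread does not confirm: that the reported latencies are in microseconds, and that the client keeps exactly N requests in flight at all times. `ops_per_second` is a made-up helper name.

```python
def ops_per_second(parallel, avg_latency_us):
    """Little's law: throughput = requests in flight / average latency."""
    return parallel / (avg_latency_us / 1_000_000)


if __name__ == "__main__":
    # (parallel requests, average get latency) from the tuned Cassandra run.
    # ASSUMPTION: latencies are microseconds; the thread never states the unit.
    for parallel, avg_us in [(10, 48463), (25, 119135), (50, 238179)]:
        print(parallel, round(ops_per_second(parallel, avg_us)))
```

Under those assumptions the tuned Cassandra run works out to roughly 200 to 210 gets per second at every concurrency level. If the unit is something else, the figures scale accordingly.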

-Jonathan


Re: performance issue

Kirill A. Korinskiy-3
In reply to this post by Michael Greene
At Fri, 25 Sep 2009 10:34:06 -0500,
Michael Greene <[hidden email]> wrote:
>
> Thanks for the results.  Perhaps you could shed further light:
> Is this a single node system?

Oops, I left that out.

No, it's a cluster. I use 3 nodes with ReplicationFactor set to 2.

> Is the log level changed from DEBUG to INFO?

I have switched off all logs.

> Are the commit log and data directories on the same drive?

Yes, all data is on the same disk.

> Are the sets/gets being processed interleaved in parallel, or one then the other?

I'm sending 10/25/50 parallel requests to the cluster. My Erlang wrapper
sends each request to a single node and chooses the node by round-robin scheduling.
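
For readers unfamiliar with the scheme, round-robin node selection can be sketched as follows. This is an illustration in Python, not the attached Erlang wrapper; the `RoundRobin` class and the node names are hypothetical.

```python
import itertools


class RoundRobin:
    """Hand out nodes in a fixed rotating order, one per request."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(list(nodes))

    def next_node(self):
        return next(self._cycle)


if __name__ == "__main__":
    # Hypothetical node names; the thread only says there are three nodes.
    picker = RoundRobin(["node1", "node2", "node3"])
    print([picker.next_node() for _ in range(5)])
```

Each node receives every third request, so client load is spread evenly regardless of which nodes hold the replicas for a given key.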

I set values with cassandra_ALL and, after setting, get them with cassandra_QUORUM.
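
It may help to spell out what these consistency levels imply at ReplicationFactor 2. Cassandra defines a quorum as a majority of replicas, floor(RF / 2) + 1, so the short sketch below (with a made-up `quorum` helper) shows how many replicas each level waits for.

```python
def quorum(replication_factor):
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


if __name__ == "__main__":
    for rf in (1, 2, 3):
        print(f"RF={rf}: QUORUM waits for {quorum(rf)}, ALL waits for {rf}")
```

With ReplicationFactor 2 the quorum is 2, so a QUORUM read waits for both replicas just as ALL does, and every request in this benchmark is bounded by the slower of the two replicas.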


--
wbr, Kirill

Re: performance issue

Kirill A. Korinskiy-3
In reply to this post by Kirill A. Korinskiy-3
At Fri, 25 Sep 2009 19:23:59 +0400,
Kirill A. Korinskiy <[hidden email]> wrote:

My Erlang test code is attached. I use:

cassandra from git: dd1688cf859b281a32ad28e19ebf01919d89e2b6
thrift: version 20080411-exported from freebsd ports
Erlang R13B01

--
wbr, Kirill

Attachments:
test_cassandra.erl (4K)
cassandra_cluster.erl (7K)
fs_storage.erl (9K)