Stress test inconsistencies

Oleg Proudnikov
Hi All,

I am struggling to make sense of a simple stress test I ran against the latest
Cassandra 0.7. My server performs very poorly compared to a desktop and even a
notebook.

Here is the command I execute - a single-threaded insert that runs on the same
host as Cassandra does (I am using the new contrib/stress, but the old py_stress
produces similar results):

./stress -t 1 -o INSERT -c 30 -n 10000 -i 1

On a SUSE Linux server with a 4-core Intel Xeon I get a maximum of 30 inserts a
second with 40ms latency. But on a Windows desktop I get an incredible 200-260
inserts a second with 4ms latency!!! Even on the smallest MacBook Pro I get
bursts of high throughput - 100+ inserts a second.

Could you please help me figure out what is wrong with my server? I tried
several servers actually with the same results. I would appreciate any help in
tracing down the bottleneck. Configuration is the same in all tests with the
server having the advantage of separate physical disks for commitlog and data.

Could you also share with me what numbers you get or what is reasonable to
expect from this test?

Thank you very much,
Oleg


Here is the output for the Linux server, Windows desktop and MacBook Pro, one
line per second:

Linux server - Intel Xeon X3330 @ 2.666GHz, 4G RAM, 2G heap

Created keyspaces. Sleeping 1s for propagation.
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
19,19,19,0.05947368421052632,1
46,27,27,0.04274074074074074,2
70,24,24,0.04733333333333333,3
95,25,25,0.04696,4
119,24,24,0.04820833333333333,5
147,28,28,0.04189285714285714,7
177,30,30,0.03903333333333334,8
206,29,29,0.04006896551724138,9
235,29,29,0.03903448275862069,10

Windows desktop: Core2 Duo CPU E6550 @ 2.333GHz, 2G RAM, 1G heap

Keyspace already exists.
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
147,147,147,0.005292517006802721,1
351,204,204,0.0042009803921568625,2
527,176,176,0.006551136363636364,3
718,191,191,0.005617801047120419,4
980,262,262,0.00400763358778626,5
1206,226,226,0.004150442477876107,6
1416,210,210,0.005619047619047619,7
1678,262,262,0.0040038167938931295,8

MacBook Pro: Core2 Duo CPU @ 2.26GHz, 2G RAM, 1G heap

Created keyspaces. Sleeping 1s for propagation.
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
0,0,0,NaN,1
7,7,7,0.21185714285714285,2
47,40,40,0.026925,3
171,124,124,0.007967741935483871,4
258,87,87,0.01206896551724138,6
294,36,36,0.022444444444444444,7
303,9,9,0.14377777777777778,8
307,4,4,0.2455,9
313,6,6,0.128,10
508,195,195,0.007938461538461538,11
792,284,284,0.0035985915492957746,12
882,90,90,0.01218888888888889,13
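
The relationship between the latency and throughput columns above can be checked with a little arithmetic: a single-threaded client issues one request at a time, so its rate should be roughly 1 / latency. The sketch below is only an illustration, not part of contrib/stress. It assumes that avg_latency is the mean latency in seconds for that interval, and the name summarize is made up for this example.

```python
import csv
import io

# Interval lines copied from the Linux server run above.
LINUX_OUTPUT = """\
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
19,19,19,0.05947368421052632,1
46,27,27,0.04274074074074074,2
70,24,24,0.04733333333333333,3
95,25,25,0.04696,4
119,24,24,0.04820833333333333,5
147,28,28,0.04189285714285714,7
177,30,30,0.03903333333333334,8
206,29,29,0.04006896551724138,9
235,29,29,0.03903448275862069,10
"""


def summarize(output: str) -> dict:
    """Summarize stress output (hypothetical helper, not part of the tool)."""
    rows = list(csv.DictReader(io.StringIO(output)))
    total_ops = int(rows[-1]["total"])
    elapsed_s = int(rows[-1]["elapsed_time"])
    # Weight each interval's latency by the number of operations in it.
    # Intervals with zero operations report NaN latency and are skipped.
    busy_s = sum(
        float(r["avg_latency"]) * int(r["interval_op_rate"])
        for r in rows
        if int(r["interval_op_rate"]) > 0
    )
    mean_latency_s = busy_s / total_ops
    return {
        "ops_per_sec": total_ops / elapsed_s,
        "mean_latency_s": mean_latency_s,
        # With one thread, throughput is roughly the inverse of latency.
        "single_thread_estimate": 1.0 / mean_latency_s,
    }


if __name__ == "__main__":
    for name, value in summarize(LINUX_OUTPUT).items():
        print(f"{name}: {value:.4f}")
```

If the measured rate and the 1 / latency estimate are close, the client is simply waiting on each request, and the question becomes why each insert takes about 40ms on the server.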




Re: Stress test inconsistencies

Tyler Hobbs
Try using something higher than -t 1, like -t 100.

- Tyler

On Mon, Jan 24, 2011 at 9:38 PM, Oleg Proudnikov <[hidden email]> wrote:


Re: Stress test inconsistencies

Oleg Proudnikov
Tyler Hobbs <tyler <at> riptano.com> writes:

> Try using something higher than -t 1, like -t 100.
>
> - Tyler
>


Thank you, Tyler!

When I run contrib/stress with a higher thread count, the server does scale to
200 inserts a second with latency of 200ms. At the same time Windows desktop
scales to 900 inserts a second and latency of 120ms. There is a huge difference
that I am trying to understand and eliminate.

In my real life bulk load I have to stay with a single threaded client for the
POC I am doing. The only option I have is to run several client processes... My
real life load is heavier than what contrib/stress does. It takes several days
to bulk load 4 million batch mutations !!! It is really painful :-( Something is
just not right...

Oleg
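
For the multithreaded numbers, Little's law gives a quick sanity check: requests in flight = throughput x latency. The sketch below applies it to the two figures quoted in this message. The function name in_flight is made up for this example, and the thread count used for those runs is not stated in the thread, so the result is only an estimate of effective concurrency.

```python
def in_flight(ops_per_sec: float, latency_ms: float) -> float:
    """Little's law: average number of requests in flight."""
    return ops_per_sec * latency_ms / 1000.0


if __name__ == "__main__":
    # Figures quoted in the message above.
    print("Linux server:   ", in_flight(200, 200))
    print("Windows desktop:", in_flight(900, 120))
```

Read the other way around, throughput = in-flight requests / latency, so with a fixed number of client threads the only way to raise the insert rate is to lower the per-request latency.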





Re: Stress test inconsistencies

buddhasystem
Oleg,

I'm a novice at this, but for what it's worth I can't imagine you can have a _sustained_ 1kHz insertion rate on a single machine which also does some reads. If I'm wrong, I'll be glad to learn that I was. It just doesn't seem to square with a typical seek time on a hard drive.

Maxim

Re: Stress test inconsistencies

Brandon Williams
In reply to this post by Oleg Proudnikov
On Tue, Jan 25, 2011 at 1:23 PM, Oleg Proudnikov <[hidden email]> wrote:
When I run contrib/stress with a higher thread count, the server does scale to
200 inserts a second with latency of 200ms. At the same time Windows desktop
scales to 900 inserts a second and latency of 120ms. There is a huge difference
that I am trying to understand and eliminate.

Those are really low numbers, are you still testing with 10k rows?  That's not enough, try 1M to give both JVMs enough time to warm up.

-Brandon 

Re: Stress test inconsistencies

Oleg Proudnikov
Brandon Williams <driftx <at> gmail.com> writes:

>
> On Tue, Jan 25, 2011 at 1:23 PM, Oleg Proudnikov <olegp <at> cloudorange.com> wrote:
>
> When I run contrib/stress with a higher thread count, the server does scale to
> 200 inserts a second with latency of 200ms. At the same time Windows desktop
> scales to 900 inserts a second and latency of 120ms. There is a huge difference
> that I am trying to understand and eliminate.
>
>
> Those are really low numbers, are you still testing with 10k rows?  That's not
> enough, try 1M to give both JVMs enough time to warm up.
>
>
> -Brandon 
>

I agree, Brandon, the numbers are very low! The warm-up does not seem to make
any difference though... Something is holding the server back, because CPU
utilization is very low. I am trying to understand where this bottleneck is
on the Linux server. I do not think it is Cassandra's config, as I use the same
config on Windows and get much higher numbers, as I described.

Oleg



Re: Stress test inconsistencies

Oleg Proudnikov
In reply to this post by buddhasystem
buddhasystem <potekhin <at> bnl.gov> writes:

>
>
> Oleg,
>
> I'm a novice at this, but for what it's worth I can't imagine you can have a
> _sustained_ 1kHz insertion rate on a single machine which also does some
> reads. If I'm wrong, I'll be glad to learn that I was. It just doesn't seem
> to square with a typical seek time on a hard drive.
>
> Maxim
>

Maxim,

As I understand it, during inserts Cassandra should not be constrained by random
seek time, as it uses sequential writes. I do get high numbers on Windows, but
there is something that is holding back my Linux server. I am trying to
understand what it is.

Oleg




Re: Stress test inconsistencies

Anthony John
Look at iostat -x 10 10 when the active part of your test is running. There should be something called svc_t - that should be in the 10ms range, and await should be low.

Will tell you if IO is slow, or if IO is not being issued. 

Also, ensure that you ain't swapping with something like "swapon -s"

On Tue, Jan 25, 2011 at 3:04 PM, Oleg Proudnikov <[hidden email]> wrote:


Re: Stress test inconsistencies

Oleg Proudnikov
In reply to this post by Oleg Proudnikov
Hi All,

I was able to run contrib/stress at a very impressive throughput. Single
threaded client was able to pump 2,000 inserts per second with 0.4 ms latency.
Multithreaded client was able to pump 7,000 inserts per second with 7ms latency.

Thank you very much for your help!

Oleg



Re: Stress test inconsistencies

Jonathan Shook
Would you share with us the changes you made, or problems you found?

On Wed, Jan 26, 2011 at 10:41 AM, Oleg Proudnikov <[hidden email]> wrote:

> Hi All,
>
> I was able to run contrib/stress at a very impressive throughput. Single
> threaded client was able to pump 2,000 inserts per second with 0.4 ms latency.
> Multithreaded client was able to pump 7,000 inserts per second with 7ms latency.
>
> Thank you very much for your help!
>
> Oleg
>
>
>

Re: Stress test inconsistencies

Oleg Proudnikov
I returned to periodic commit log fsync.
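
Why the fsync policy matters so much for a single-threaded client can be shown with a rough sketch. This is not Cassandra's commit log code. It only assumes that syncing on every write makes each insert wait for the disk, while a periodic policy lets many writes share one sync. The name timed_writes, the record size and the record count are made up for this example, and grouping by record count stands in for Cassandra's time-based period. The thread does not say which exact settings were changed (presumably commitlog_sync in cassandra.yaml).

```python
import os
import tempfile
import time


def timed_writes(n_records: int, sync_every: int, payload: bytes = b"x" * 512) -> float:
    """Append n_records to a temp file, calling fsync after every
    `sync_every` records. Returns the elapsed time in seconds."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for i in range(1, n_records + 1):
            os.write(fd, payload)
            if i % sync_every == 0:
                os.fsync(fd)  # wait until the data reaches the disk
        os.fsync(fd)  # final sync so both variants end fully flushed
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    n = 200
    per_write = timed_writes(n, sync_every=1)  # sync on every record
    grouped = timed_writes(n, sync_every=n)    # one sync for the whole run
    print(f"fsync per write: {n / per_write:.0f} writes/s")
    print(f"grouped fsync:   {n / grouped:.0f} writes/s")
```

The actual numbers depend heavily on the disk, the filesystem and any write cache, which may be one reason the same configuration behaves so differently on a server and a desktop.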


Jonathan Shook <jshook <at> gmail.com> writes:

>
> Would you share with us the changes you made, or problems you found?
>