Frequent timeout issues

17 messages
Frequent timeout issues

Amlan Roy
Hi,

I am new to Cassandra. I have set up a cluster with Cassandra 2.0.13. I am writing the same data to both HBase and Cassandra and find that writes are extremely slow in Cassandra; I frequently see the exception "Cassandra timeout during write query at consistency ONE". The cluster sizes for HBase and Cassandra are the same.

It looks like something is wrong with my cluster setup. What could the possible issue be? Data and commit logs are written to two separate disks.

Regards,
Amlan

Re: Frequent timeout issues

Eric R Medley
Amlan,

Can you provide information on how much data is being written? Are any of the columns really large? Are any writes succeeding or are all timing out?

Regards,

Eric R Medley



Re: Frequent timeout issues

Eric R Medley
Also, can you provide the table details and the consistency level you are using?

Regards,

Eric R Medley




Re: Frequent timeout issues

Amlan Roy
In reply to this post by Eric R Medley
Hi Eric,

Thanks for the reply. Some columns are big, but I see the issue even when I stop storing the big columns. Some of the writes time out, not all. Where can I find the number of writes to Cassandra?

Regards,
Amlan
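
The per-table write counts Amlan asks about here are exposed by nodetool; as a sketch (the keyspace and table names are taken from later in this thread):

```shell
# Per-table statistics, including the cumulative write count and write latency
nodetool cfstats ct_keyspace.event_data

# Thread-pool view: pending/blocked MutationStage or FlushWriter tasks
# indicate an overloaded write path
nodetool tpstats
```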




Re: Frequent timeout issues

Amlan Roy
In reply to this post by Eric R Medley
Write consistency level is ONE.

This is the describe output for one of the tables.

CREATE TABLE event_data (
  event text,
  week text,
  bucket int,
  date timestamp,
  unique text,
  adt int,
  age list<int>,
  arrival list<timestamp>,
  bank text,
  bf double,
  cabin text,
  card text,
  carrier list<text>,
  cb double,
  channel text,
  chd int,
  company text,
  cookie text,
  coupon list<text>,
  depart list<timestamp>,
  dest list<text>,
  device text,
  dis double,
  domain text,
  duration bigint,
  emi int,
  expressway boolean,
  flight list<text>,
  freq_flyer list<text>,
  host text,
  host_ip text,
  inf int,
  instance text,
  insurance text,
  intl boolean,
  itinerary text,
  journey text,
  meal_pref list<text>,
  mkp double,
  name list<text>,
  origin list<text>,
  pax_type list<text>,
  payment text,
  pref_carrier list<text>,
  referrer text,
  result_cnt int,
  search text,
  src text,
  src_ip text,
  stops int,
  supplier list<text>,
  tags list<text>,
  total double,
  trip text,
  user text,
  user_agent text,
  PRIMARY KEY ((event, week, bucket), date, unique)
) WITH CLUSTERING ORDER BY (date DESC, unique ASC) AND
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};






Re: Frequent timeout issues

Brice Dutheil
And the keyspace? What is the replication factor?

Also, how are the inserts done?

--
Brice

Re: Frequent timeout issues

Eric R Medley
In reply to this post by Amlan Roy
Are you seeing any exceptions in the Cassandra logs? What are the loads on your servers? Have you monitored the performance of those servers? How many writes are you performing at a time? How many writes per second?

Regards,

Eric R Medley




Re: Frequent timeout issues

Eric R Medley
In reply to this post by Brice Dutheil
Are HBase and Cassandra running on the same servers? Are the writes to each of these databases happening at the same time?

Regards,

Eric R Medley



Re: Frequent timeout issues

Amlan Roy
In reply to this post by Brice Dutheil
Replication factor is 2.
CREATE KEYSPACE ct_keyspace WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'DC1': '2'
};

Inserts are done from Storm using the Java driver, with prepared statements and no batching.
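
For reference, a write along the lines Amlan describes (DataStax Java driver 2.1, prepared statement, no batch, consistency ONE) might look like the following sketch; the contact point, bound values, and the subset of columns are illustrative:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;
import java.util.Date;

public class EventDataWriter {
    public static void main(String[] args) {
        // Contact point is illustrative; keyspace name is from the thread.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("ct_keyspace");

        // Prepare once, bind per event; no batching.
        PreparedStatement ps = session.prepare(
                "INSERT INTO event_data (event, week, bucket, date, unique) " +
                "VALUES (?, ?, ?, ?, ?)");
        // Default consistency for all statements bound from this prepare.
        ps.setConsistencyLevel(ConsistencyLevel.ONE);

        session.execute(ps.bind("search", "2015-14", 0, new Date(), "evt-0001"));
        cluster.close();
    }
}
```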




Re: Frequent timeout issues

Brian O'Neill

Are you using the storm-cassandra-cql driver? 

If so, what version?
Batching or no batching?

-brian

---

Brian O'Neill 

Chief Technology Officer

Health Market Science, a LexisNexis Company

215.588.6024 Mobile @boneill42 


This information transmitted in this email message is for the intended recipient only and may contain confidential and/or privileged material. If you received this email in error and are not the intended recipient, or the person responsible to deliver it to the intended recipient, please contact the sender at the email above and delete this email and any attachments and destroy any copies thereof. Any review, retransmission, dissemination, copying or other use of, or taking any action in reliance upon, this information by persons or entities other than the intended recipient is strictly prohibited.

 




Re: Frequent timeout issues

Amlan Roy
Using the datastax driver without batch.
http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/whatsNew2.html





Re: Frequent timeout issues

Amlan Roy
In reply to this post by Eric R Medley
I did not see any exceptions in cassandra.log or system.log. I monitored using JConsole and did not see anything wrong. Is there any specific information I should look at? We are doing almost 1000 writes/sec.

HBase and Cassandra are running on different clusters. For Cassandra I have 6 nodes, each with 64 GB RAM (heap is at the default setting) and 32 cores.




Re: Frequent timeout issues

Robert Coli-3
In reply to this post by Amlan Roy
On Wed, Apr 1, 2015 at 8:37 AM, Amlan Roy <[hidden email]> wrote:
Replication factor is 2.

It is relatively unusual for people to use a replication factor of 2, for what it's worth.

=Rob
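Rob's point can be made concrete: Cassandra's quorum is floor(RF / 2) + 1, so with RF=2 a QUORUM operation still needs both replicas and tolerates zero failures, while RF=3 keeps the quorum at 2 and tolerates one replica being down. A quick sketch:

```python
def quorum(rf):
    # Cassandra's QUORUM: floor(rf / 2) + 1 replicas must respond
    return rf // 2 + 1

for rf in (2, 3):
    q = quorum(rf)
    tolerated = rf - q  # replicas that may be down while QUORUM still succeeds
    print(f"RF={rf}: quorum={q}, tolerates {tolerated} down replica(s)")
```

This is why RF=2 buys durability but little availability headroom at QUORUM; writes at ONE (as in the original post) only need a single replica to acknowledge.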

 

Re: Frequent timeout issues

Anuj
In reply to this post by Amlan Roy
Are you writing to multiple column families at the same time?
Please run nodetool tpstats and check that FlushWriter (and the other pools) does not show a high "All time blocked" count. A blocked memtable FlushWriter may block or drop writes; if that is the case, you may need to increase memtable_flush_writers. If you have many secondary indexes on the column family, make sure memtable_flush_queue_size is set at least equal to the number of indexes.

Monitoring iostat and the GC logs may also help.
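The tpstats check above is easy to script. A rough sketch that flags pools with a nonzero "All time blocked" count (the sample output is illustrative only; the exact columns of `nodetool tpstats` vary by Cassandra version):

```python
def blocked_pools(tpstats_output):
    """Return pool names whose 'All time blocked' column is nonzero.

    Assumes the 2.0-era layout: a single-token pool name followed by five
    numeric columns (Active, Pending, Completed, Blocked, All time blocked).
    """
    flagged = []
    for line in tpstats_output.splitlines():
        parts = line.split()
        # Data rows have exactly 6 tokens; the header splits into more
        if len(parts) == 6 and parts[-1].isdigit():
            name, all_time_blocked = parts[0], int(parts[-1])
            if all_time_blocked > 0:
                flagged.append(name)
    return flagged

sample = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0       12345678         0                 0
FlushWriter                       1         2           4521         1               213
"""
print(blocked_pools(sample))  # ['FlushWriter']
```

A persistently growing "All time blocked" on FlushWriter is the symptom Anuj describes: memtables cannot be flushed fast enough, and writes back up.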

Thanks
Anuj Wadehra
From: Amlan Roy <[hidden email]>
Date: Wed, 1 Apr, 2015 at 9:27 pm
Subject: Re: Frequent timeout issues

I did not see any exceptions in cassandra.log or system.log. I monitored using JConsole and did not see anything wrong. Is there any specific info I should look at? We are doing almost 1000 writes/sec.

HBase and Cassandra are running on different clusters. For Cassandra I have 6 nodes, each with 64 GB RAM (heap is at the default setting) and 32 cores.

On 01-Apr-2015, at 8:43 pm, Eric R Medley <[hidden email]> wrote:


Re: Frequent timeout issues

daemeon reiydelle
This may not be relevant, but what is the "default" heap size you have deployed? It should be no more than 16 GB (and be aware of the impact of GC on a heap that large); I suggest no smaller than 8-12 GB.
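For reference, a sketch of the default-heap heuristic that Cassandra 2.0's cassandra-env.sh applies when MAX_HEAP_SIZE is unset: take the larger of (half the system RAM, capped at 1 GB) and (a quarter of the RAM, capped at 8 GB). On the 64 GB nodes mentioned in this thread, that comes out to an 8 GB heap. This mirrors the shell logic as I understand it; check your own cassandra-env.sh for the authoritative version.

```python
def default_max_heap_mb(system_memory_mb):
    # Mirrors calculate_heap_sizes() in cassandra-env.sh (2.0.x era):
    # max of (half RAM capped at 1 GB) and (quarter RAM capped at 8 GB)
    half = min(system_memory_mb // 2, 1024)
    quarter = min(system_memory_mb // 4, 8192)
    return max(half, quarter)

print(default_max_heap_mb(64 * 1024))  # 8192 -> an 8 GB heap on a 64 GB node
```

So "heap at the default setting" here most likely means 8 GB, well inside daemeon's suggested 8-12 GB range.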



On Wed, Apr 1, 2015 at 11:28 AM, Anuj Wadehra <[hidden email]> wrote:



Re: Frequent timeout issues

Jonathan Haddad
@Daemeon you may want to read through https://issues.apache.org/jira/browse/CASSANDRA-8150; there are perfectly valid cases for a heap > 16 GB.

On Thu, Apr 2, 2015 at 10:07 AM daemeon reiydelle <[hidden email]> wrote:



Re: Frequent timeout issues

daemeon reiydelle
To the poster, I am sorry to have taken this off topic. Looking forward to your reply regarding your default heap size, frequency of hard garbage collection, etc. In any case I am not convinced that heap size/garbage collection is a root cause of your issue, but it has been so frequently a problem that I tend to ask that question early on.

Jon, thank you for pointing out, to those who are 100% convinced that large heaps are an anti-pattern, that this is not necessarily so. I am well aware of that interesting thread, and find it gives clear guidance that in most cases large heaps are an anti-pattern, except in fairly rare use cases, only after extensive analysis, and several iterations of tuning. FYI, I have (in both Hadoop and Cassandra) created specialized clusters with carefully monitored row sizes and schemas to leverage the read-mostly benefits of large heaps.

My experience may be a corner case, as I tend to work with clusters that have been up for a while and have sort of grown sideways from the original expectations.

The analysis is clear that, under certain specific conditions and with extensive tuning, it is possible to run with very large heaps. Thanks for pointing this out, as there is a LOT of information in that thread that can help with the corner cases where it IS possible to productively run larger heaps, and with the implied anti-patterns.






On Thu, Apr 2, 2015 at 10:16 AM, Jonathan Haddad <[hidden email]> wrote: