Wednesday, 7 August 2013

Cassandra Frequent Read Write Timeouts

I changed my whole codebase from Thrift to CQL, using the DataStax Java
driver 1.0.1 and Cassandra 1.2.6.
With Thrift I was getting frequent timeouts from the start and could not
proceed. After adopting CQL and designing the tables for it, things worked
and I got fewer timeouts.
With that I was able to insert large amounts of data that had not worked
with Thrift. But after a certain stage, with the data folder at around
3.5 GB, I am getting frequent write timeout exceptions. Even when I rerun
the same use case that worked earlier, it now throws a timeout exception.
The behaviour is random: something that worked once does not work again,
even after a fresh setup.
CASSANDRA SERVER LOG
This is a partial Cassandra server log in DEBUG mode from the time I got
the error:
http://pastebin.com/rW0B4MD0
Client exception is:
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
    at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:54)
    at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:214)
    at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:169)
    at com.datastax.driver.core.Session.execute(Session.java:107)
    at com.datastax.driver.core.Session.execute(Session.java:76)
Infrastructure: a 16 GB machine with an i7 processor and an 8 GB heap
given to Cassandra. I am using a SINGLE-node Cassandra with the following
yaml settings tweaked for timeouts; everything else is default:
read_request_timeout_in_ms: 30000
range_request_timeout_in_ms: 30000
write_request_timeout_in_ms: 30000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 30000
USE CASE: I am running a use case that stores Combinations (my project's
terminology) in Cassandra. I am currently testing storing 2.5 lakh
(250,000) combinations with 100 parallel threads, each thread storing one
combination. In the real case I need to support many crores (tens of
millions), but that would need different hardware and a multi-node cluster.
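To make the test setup concrete, here is a minimal sketch of how such a load test could be structured. It is not my actual code: the class name CombinationLoadTest, the method run and the storeOne callback are hypothetical placeholders, and storeOne stands in for whatever executes the queries for one combination through the driver. It only assumes that a failed store surfaces as a RuntimeException.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

// Hypothetical harness: names and structure are illustrative assumptions.
final class CombinationLoadTest {

    // Runs storeOne for ids 0..total-1 on a fixed-size pool and
    // returns how many of those calls threw a RuntimeException.
    static int run(int total, int threads, IntConsumer storeOne) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger failures = new AtomicInteger();

        for (int i = 0; i < total; i++) {
            final int id = i;
            pool.execute(() -> {
                try {
                    storeOne.accept(id); // placeholder for the real store logic
                } catch (RuntimeException e) {
                    failures.incrementAndGet(); // e.g. a write timeout
                }
            });
        }

        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                pool.shutdownNow();
                throw new IllegalStateException("load test did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load test interrupted", e);
        }
        return failures.get();
    }
}
```

With the numbers from my test this would be called as CombinationLoadTest.run(250000, 100, storeOne), so that at most 100 combinations are being stored at any moment.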
Storing ONE combination takes around 2 seconds and involves:
527 INSERT INTO queries
506 UPDATE queries
954 SELECT queries
100 parallel threads store 100 combinations at the same time.
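To put the figures above into perspective, the following small calculation adds them up. It is only a rough estimate: it assumes the 2-second figure still holds while all 100 threads are running, which may not be true under load, and the class name LoadEstimate is made up for this illustration.

```java
// Rough load estimate built only from the figures quoted in this post.
final class LoadEstimate {
    static final int INSERTS = 527;
    static final int UPDATES = 506;
    static final int SELECTS = 954;
    static final int THREADS = 100;
    static final double SECONDS_PER_COMBINATION = 2.0;  // "around 2 seconds" (assumed constant)
    static final long TOTAL_COMBINATIONS = 250_000L;    // 2.5 lakh

    // All statements issued for one combination.
    static int statementsPerCombination() {
        return INSERTS + UPDATES + SELECTS;
    }

    // Only the statements that write (INSERT and UPDATE).
    static int writesPerCombination() {
        return INSERTS + UPDATES;
    }

    // Statements per second the single node would see with all threads busy.
    static double statementsPerSecond() {
        return THREADS * statementsPerCombination() / SECONDS_PER_COMBINATION;
    }

    // Write statements per second with all threads busy.
    static double writesPerSecond() {
        return THREADS * writesPerCombination() / SECONDS_PER_COMBINATION;
    }

    // Statements needed for the whole 2.5 lakh test run.
    static long totalStatements() {
        return TOTAL_COMBINATIONS * statementsPerCombination();
    }

    public static void main(String[] args) {
        System.out.println("statements per combination: " + statementsPerCombination());
        System.out.println("writes per combination:     " + writesPerCombination());
        System.out.println("statements per second:      " + statementsPerSecond());
        System.out.println("writes per second:          " + writesPerSecond());
        System.out.println("total statements:           " + totalStatements());
    }
}
```

Under that assumption one combination is 1987 statements (1033 of them writes), so 100 threads would ask the single node for roughly 99,350 statements per second, and the full test run comes to about 496.75 million statements.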
I have found the behaviour of the WRITE TIMEOUTS to be random: sometimes
it works up to 2 lakh (200,000) combinations and then throws timeouts, and
sometimes it does not work even for 10k combinations.
Please help me with this.
