Writing to multiple tables

Writing to multiple tables

Viswanathan Ramachandran
Hi,

Are Cassandra BATCH statements (http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/batch_r.html) the recommended way to update the same information in multiple tables?

For example if I have the following tables:

person_by_dob
person_by_ssn
person_by_lastname


Adding or modifying a person then results in three writes, one per table.

Is BATCH the recommended way of updating all three tables at one go so that the information between the three tables is consistent ? 

In other words, is it an established Cassandra usage pattern to use the BATCH feature for this purpose?

Are there alternate approaches and recommendations?

Thanks
Vish

Re: Writing to multiple tables

Robert Coli-3
On Mon, Mar 16, 2015 at 12:13 PM, Viswanathan Ramachandran <[hidden email]> wrote:
Is BATCH the recommended way of updating all three tables at one go so that the information between the three tables is consistent ? 

As a general statement:

If you are looking to update multiple tables in a transaction, maybe Cassandra is not the ideal data-store for you. It is pretty un-Cassandric to want to do such a thing.

=Rob
 

Re: Writing to multiple tables

DuyHai Doan
"Is BATCH the recommended way of updating all three tables at one go so that the information between the three tables is consistent ? "

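If you're thinking about "atomicity" in the transactional sense, with isolation and rollback, then no, a batch does not give you that. What you gain with a logged batch is that the coordinator retries for you if a failure occurs in the middle of the batch, so the batch is eventually applied in full, and that's pretty much it. Other clients can still observe some of the writes before the rest have landed. The logged batch relieves the developer of the burden of setting up a retry strategy on the client side.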

In general, it is recommended not to put too much data or too many statements in a batch, because the coordinator waits for the acknowledgement of every statement in the batch. Having thousands of statements in the same batch, or a few statements with huge payloads, is definitely a bad idea.
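
To make the logged-batch option concrete, here is a minimal sketch (not from the thread) of a small batch covering the three tables in the question. It only builds the CQL text; the column names dob, ssn and lastname are assumptions, since the schemas were never posted, and the class and method names are made up for the example.

```java
import java.util.List;
import java.util.StringJoiner;

// Minimal sketch: builds the text of a logged CQL batch that writes one person
// into the three tables from the question. The column names (dob, ssn,
// lastname) are assumptions; the thread never shows the schemas.
final class PersonBatch {
    static final List<String> TABLES =
            List.of("person_by_dob", "person_by_ssn", "person_by_lastname");

    // Returns a BATCH with one parameterised INSERT per table.
    // A plain BEGIN BATCH (without UNLOGGED) is a logged batch.
    static String buildBatch() {
        StringJoiner cql = new StringJoiner("\n");
        cql.add("BEGIN BATCH");
        for (String table : TABLES) {
            cql.add("  INSERT INTO " + table + " (dob, ssn, lastname) VALUES (?, ?, ?);");
        }
        cql.add("APPLY BATCH;");
        return cql.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildBatch());
    }
}
```

Three statements is well within the "keep it small" advice above; the same pattern with thousands of rows is what to avoid.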

"Are there alternate approaches and recommendations?" 

If you don't mind managing retries yourself (or relying on the retry policy of the driver), use executeAsync() to dispatch the statements individually across the nodes of your cluster.
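
As a rough illustration of that alternative (again, not from the thread), the sketch below dispatches each write on its own and retries failures on the client side. The `send` function is a hypothetical stand-in for a driver call such as executeAsync(); it is assumed to return a CompletableFuture, which may not match the future type your driver version actually returns. The class name, method names and the retry count are all invented for the example.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Minimal sketch of "individual async writes + client-side retry".
// `send` is a hypothetical stand-in for a driver call such as executeAsync().
final class AsyncWrites {
    // Sends one statement; on failure, tries again until maxAttempts is used up.
    static CompletableFuture<Void> writeWithRetry(
            Function<String, CompletableFuture<Void>> send, String statement, int maxAttempts) {
        return send.apply(statement)
                .<CompletableFuture<Void>>handle((ok, err) -> {
                    if (err == null) {
                        return CompletableFuture.<Void>completedFuture(null);
                    }
                    if (maxAttempts <= 1) {
                        return CompletableFuture.<Void>failedFuture(err);
                    }
                    return writeWithRetry(send, statement, maxAttempts - 1);
                })
                .thenCompose(f -> f);
    }

    // Dispatches all statements without waiting for one another and
    // completes once every write has succeeded (or fails if any gave up).
    static CompletableFuture<Void> writeAll(
            Function<String, CompletableFuture<Void>> send, List<String> statements, int maxAttempts) {
        CompletableFuture<?>[] futures = statements.stream()
                .map(s -> writeWithRetry(send, s, maxAttempts))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(futures);
    }

    public static void main(String[] args) {
        // Stub sender that always succeeds; a real one would call the driver.
        Function<String, CompletableFuture<Void>> stub = s -> CompletableFuture.completedFuture(null);
        List<String> writes = List.of(
                "INSERT INTO person_by_dob ...",
                "INSERT INTO person_by_ssn ...",
                "INSERT INTO person_by_lastname ...");
        writeAll(stub, writes, 3).join();
        System.out.println("all writes acknowledged");
    }
}
```

Note that if one write exhausts its retries while the other two succeed, the tables are left out of step until the application repairs them; that is the trade-off against the logged batch.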

