Cassandra ACID

Cassandra ACID

AJ-2
Can any Cassandra contributors/gurus confirm my understanding of Cassandra's degree of support for the ACID properties?

I provide official references when known.  Please let me know if I missed some good official documentation.

Atomicity
All individual writes are atomic at the row level.  So, a batch mutate for one specific key will apply updates to all the columns for that one specific row atomically.  If part of the single-key batch update fails, then all of the updates will be reverted since they all pertained to one key/row.  Notice, I said 'reverted' not 'rolled back'.  Note: atomicity and isolation are related to the topic of transactions but one does not imply the other.  Even though row updates are atomic, they are not isolated from other users' updates or reads.   
Refs: http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic

Consistency
If you want 100% consistency, use consistency level QUORUM for both reads and writes and EACH_QUORUM in a multi-dc scenario. 
Refs: http://wiki.apache.org/cassandra/ArchitectureOverview

Isolation
NOTHING is isolated; because there is no transaction support in the first place.  This means that two or more clients can update the same row at the same time.  Their updates of the same or different columns may be interleaved and leave the row in a state that may not make sense depending on your application.  Note: this doesn't mean to say that two updates of the same column will be corrupted, obviously; columns are the smallest atomic unit ('atomic' in the more general thread-safe context).
Refs: None that directly address this explicitly and clearly and in one place.

Durability
Updates are made durable by the use of the commit log.  No worries here.
Refs: Plenty.

Re: Cassandra ACID

Peter Schuller
> Atomicity
> All individual writes are atomic at the row level.  So, a batch mutate for
> one specific key will apply updates to all the columns for that one specific
> row atomically.  If part of the single-key batch update fails, then all of
> the updates will be reverted since they all pertained to one key/row.
> Notice, I said 'reverted' not 'rolled back'.  Note: atomicity and isolation
> are related to the topic of transactions but one does not imply the other.
> Even though row updates are atomic, they are not isolated from other users'
> updates or reads.
> Refs: http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic

Atomicity is sort of provided, but there's no reversion going on.
Cassandra does validation of batch mutations prior to their
application and then tries to apply it. In the absence of bugs in
Cassandra, it should generally be safe to say that writes are then
guaranteed to succeed. However I wouldn't necessarily rely on this
type of atomicity to the same level that I would in e.g. PostgreSQL.

One example of violated atomicity is when you run with periodic commit
log mode instead of batch mode. If, for example, you perform a write at
CL.ONE but the node that took the write gets killed (e.g. SIGKILL) before
the periodic commit log flush, the node will have accepted a write that
then gets dropped. If someone has already read the changes that the write
entails, the application-visible behavior will be that the write is
"undone" rather than eventually done.
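
To make that failure mode concrete, here is a toy Python model of a single
node's commit log. It is only a sketch: ToyNode and its methods are made up
for illustration and are not Cassandra code, and the only assumption it
encodes is that batch mode syncs before acknowledging while periodic mode
acknowledges first and syncs later.

```python
class ToyNode:
    """Toy model of one node's commit log; hypothetical, not Cassandra code."""

    def __init__(self, sync_mode):
        self.sync_mode = sync_mode  # "batch" or "periodic"
        self.buffered = []          # appended but not yet synced to disk
        self.on_disk = []           # survives a crash

    def write(self, mutation):
        self.buffered.append(mutation)
        if self.sync_mode == "batch":
            self.flush()            # sync before acknowledging
        return "ack"                # the client sees success either way

    def flush(self):
        # Stands in for the sync that a timer triggers in periodic mode.
        self.on_disk.extend(self.buffered)
        self.buffered.clear()

    def kill_and_restart(self):
        # Abrupt kill: anything not yet synced is gone.
        self.buffered.clear()
        return list(self.on_disk)


def acked_write_survives(sync_mode):
    node = ToyNode(sync_mode)
    assert node.write("k1:col=v") == "ack"
    # The node dies before any periodic flush has had a chance to run.
    return "k1:col=v" in node.kill_and_restart()


print(acked_write_survives("batch"))     # True
print(acked_write_survives("periodic"))  # False
```

With replication the lost write may still exist on other replicas; the model
only covers the single node that acknowledged a CL.ONE write.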

> Consistency
> If you want 100% consistency, use consistency level QUORUM for both reads
> and writes and EACH_QUORUM in a multi-dc scenario.
> Refs: http://wiki.apache.org/cassandra/ArchitectureOverview

For the limited definition of consistency it provides, yes. One thing
to be aware of is that *failed* writes at QUORUM followed by
*succeeding* reads at QUORUM may have readers see inconsistent results
across requests (see
https://issues.apache.org/jira/browse/CASSANDRA-2494 although I still
think it's a designed-for behavior rather than a bug). And of course
the usual bits about concurrent updates and updates spanning multiple
rows.

I'm just a bit hesitant to agree to the term "100% consistency" since
it sounds very all-encompassing :)
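
For reference, the arithmetic behind the QUORUM/QUORUM advice can be sketched
in a few lines of Python. The helper names are made up for illustration; the
quorum size used is the usual majority, floor(RF / 2) + 1. The overlap
argument only holds for writes that actually succeeded at the stated level,
which is exactly the caveat above.

```python
def quorum(replication_factor):
    # Majority of replicas: floor(RF / 2) + 1.
    return replication_factor // 2 + 1


def overlaps(rf, write_replicas, read_replicas):
    # A read is guaranteed to touch at least one replica that took the
    # write only if W + R > RF.
    return write_replicas + read_replicas > rf


for rf in (3, 5):
    q = quorum(rf)
    # Columns: RF, quorum size, QUORUM/QUORUM overlaps?, ONE/ONE overlaps?
    print(rf, q, overlaps(rf, q, q), overlaps(rf, 1, 1))
# 3 2 True False
# 5 3 True False
```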

> Isolation
> NOTHING is isolated; because there is no transaction support in the first
> place.  This means that two or more clients can update the same row at the
> same time.  Their updates of the same or different columns may be
> interleaved and leave the row in a state that may not make sense depending
> on your application.  Note: this doesn't mean to say that two updates of the
> same column will be corrupted, obviously; columns are the smallest atomic
> unit ('atomic' in the more general thread-safe context).
> Refs: None that directly address this explicitly and clearly and in one
> place.

Yes but the relevant lack of isolation is for reads. Due to
Cassandra's conflict resolution model, given two updates with certain
timestamps associated with them, the actual timing of the writes will
not change the eventual result in the data (absent read-before-write
logic operating on that data concurrently).

The lack of isolation is thus mostly of concern to readers.
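
A small Python sketch of why arrival order does not matter under
timestamp-based conflict resolution. The merge_update function and the row
layout are hypothetical; the sketch assumes per-column last-write-wins and
breaks timestamp ties by comparing values so that the merge stays
deterministic.

```python
def merge_update(row, update):
    """Merge one update into a row using per-column last-write-wins."""
    merged = dict(row)
    for column, (value, timestamp) in update.items():
        current = merged.get(column)
        # Highest timestamp wins; ties fall back to comparing values.
        if current is None or (timestamp, value) > (current[1], current[0]):
            merged[column] = (value, timestamp)
    return merged


# Two client updates to the same row, as column -> (value, timestamp).
a = {"name": ("alice", 10), "email": ("a@example.com", 10)}
b = {"name": ("bob", 20)}

a_then_b = merge_update(merge_update({}, a), b)
b_then_a = merge_update(merge_update({}, b), a)

print(a_then_b == b_then_a)  # True
print(a_then_b["name"])      # ('bob', 20)
```

The final row is the same either way; what a reader sees in between the two
merges is a different matter.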

> Durability
> Updates are made durable by the use of the commit log.  No worries here.

But take care to choose batch commit log sync rather than periodic if
single-node durability or post-quorum-write durability is a concern.

--
/ Peter Schuller

Re: Cassandra ACID

Sylvain Lebresne-3
On Fri, Jun 24, 2011 at 9:11 AM, Peter Schuller
<[hidden email]> wrote:

>> Atomicity
>> All individual writes are atomic at the row level.  So, a batch mutate for
>> one specific key will apply updates to all the columns for that one specific
>> row atomically.  If part of the single-key batch update fails, then all of
>> the updates will be reverted since they all pertained to one key/row.
>> Notice, I said 'reverted' not 'rolled back'.  Note: atomicity and isolation
>> are related to the topic of transactions but one does not imply the other.
>> Even though row updates are atomic, they are not isolated from other users'
>> updates or reads.
>> Refs: http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic
>
> Atomicity is sort of provided, but there's no reversion going on.
> Cassandra does validation of batch mutations prior to their
> application and then tries to apply it. In the absence of bugs in
> Cassandra, it should generally be safe to say that writes are then
> guaranteed to succeed. However I wouldn't necessarily rely on this
> type of atomicity to the same level that I would in e.g. PostgreSQL.
>
> One example of violated atomicity is when you run with periodic commit
> log mode instead of batch wise. If you for example perform a write on
> CL.ONE but the node that took the write got killed (eg SIGKILL) before
> the periodic commit log flush, you will have eaten a write that then
> gets dropped. If someone read the changes that the write entails, the
> application-visible behavior will be that the write will be "undone"
> rather than eventually done.

I will disagree with "atomicity is sort of provided". I think your violation
example is a violation of durability, not atomicity (a.k.a. indivisibility).

We do always provide atomicity of updates in the same batch_mutate call
under a given key. This means that for a given key, all updates of the batch
will be applied, or none of them. This is *always* true and does not depend
on the commit log (granted, if the write times out, you won't know which one
it is, but you are still guaranteed that it is either all or none).

That being said, we do not provide isolation, which means in particular that
reads *can* return a state where only part of a batch update seems applied
(it would clearly be cool to have isolation, and I'm not even saying this
will never happen). But atomicity guarantees that even though you may observe
such a state (and honestly the window during which you can is very small),
eventually you will observe that all updates have been applied (or none, if
you're in the business of questioning durability (see below)), but never
"part of".
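
The distinction can be sketched with a deliberately simplified Python model.
The apply_batch generator is hypothetical and says nothing about how
Cassandra applies a mutation internally; it only assumes that a reader can
look at the row between two column updates.

```python
def apply_batch(row, batch):
    """Apply a batch column by column, yielding what a reader could see
    after each step."""
    for column, value in batch.items():
        row[column] = value
        yield dict(row)  # snapshot visible to a concurrent reader


row = {"first": "old", "last": "old"}
batch = {"first": "new", "last": "new"}
observed = list(apply_batch(row, batch))

print(observed[0])   # {'first': 'new', 'last': 'old'}
print(observed[-1])  # {'first': 'new', 'last': 'new'}
```

The first snapshot is the lack of isolation; the last one is what atomicity
promises you will eventually observe.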

As for durability, it is true that in periodic commit log mode, durability on a
single node is subject to a small window of time. But true, serious durability
in the real world really only comes from replication, and that's why we use
periodic mode for the commit log by default (and you can always switch to batch
if you so wish). Which is not to say that Peter's statement is technically
wrong, but if what we're doing is assessing Cassandra's durability, I'll argue
that because it does replication well (including across data centers) while
still having a strong single-node durability guarantee, it has among the best
durability stories out there (even with periodic commit log).


--
Sylvain




Re: Cassandra ACID

Jim Newsham
In reply to this post by AJ-2
On 6/23/2011 8:55 PM, AJ wrote:
Consistency
If you want 100% consistency, use consistency level QUORUM for both reads and writes and EACH_QUORUM in a multi-dc scenario. 
Refs: http://wiki.apache.org/cassandra/ArchitectureOverview


This is a pretty narrow interpretation of consistency.  In a traditional database, consistency prevents you from getting into a logically inconsistent state, where records in one table do not agree with records in another table.  This includes referential integrity, cascading deletes, etc.  It seems to me Cassandra has no support for this concept whatsoever.


Jim

Re: Cassandra ACID

AJ-2
Ok, here it is reworked; consider it a summary of the thread.  If I left out an important point that you think is 100% correct even if you already mentioned it, then make some noise about it and provide some evidence so it's captured sufficiently.  And, if you're in a debate, please try and get to a resolution; all will appreciate it.

It will be evident below that Consistency is not the only thing that is "tunable", at least indirectly.  Unfortunately, you still can't tunafish.  Ar ar ar.

Atomicity
All individual writes are atomic at the row level.  So, a batch mutate for one specific key will apply updates to all the columns for that one specific row atomically.  If part of the single-key batch update fails, then all of the updates will be reverted since they all pertained to one key/row.  Notice, I said 'reverted' not 'rolled back'.  Note: atomicity and isolation are related to the topic of transactions but one does not imply the other.  Even though row updates are atomic, they are not isolated from other users' updates or reads.   
Refs: http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic

Consistency
Cassandra does not provide the same scope of Consistency as defined in the ACID standard.  Consistency in C* does not include referential integrity since C* is not a relational database.  Any referential integrity required would have to be handled by the client.  Also, even though the official docs say that QUORUM writes/reads are the minimal consistency_level setting to guarantee full consistency, this assumes that the write preceding the read does not fail (see comments below).  Therefore, an ALL write would be necessary prior to a QUORUM read of the same data.  For a multi-dc scenario use an ALL write followed by an EACH_QUORUM read.
Refs: http://wiki.apache.org/cassandra/ArchitectureOverview

Isolation
NOTHING is isolated; because there is no transaction support in the first place.  This means that two or more clients can update the same row at the same time.  Their updates of the same or different columns may be interleaved and leave the row in a state that may not make sense depending on your application.  Note: this doesn't mean to say that two updates of the same column will be corrupted, obviously; columns are the smallest atomic unit ('atomic' in the more general thread-safe context).
Refs: None that directly address this explicitly and clearly and in one place.

Durability
Updates are made highly durable at the level comparable to a DBMS by the use of the commit log.  However, this requires "commitlog_sync: batch" in cassandra.yaml.  For "some" performance improvement with "some" cost in durability you can specify "commitlog_sync: periodic".  See discussion below for more details.
Refs: Plenty + this thread.




Re: Cassandra ACID

Peter Schuller
In reply to this post by Sylvain Lebresne-3
> We do always provide atomicity of updates in the same batch_mutate call
> under a given key. This means that for a given key, all updates of the batch
> will be applied, or none of them. This is *always* true and does not depend
> on the commit log (granted, if the write times out, you won't know which one
> it is, but you are still guaranteed that it is either all or none).
>
> That being said, we do not provide isolation, which means in particular that
> reads *can* return a state where only part of a batch update seems applied
> (it would clearly be cool to have isolation, and I'm not even saying this
> will never happen). But atomicity guarantees that even though you may observe
> such a state (and honestly the window during which you can is very small),
> eventually you will observe that all updates have been applied (or none, if
> you're in the business of questioning durability (see below)), but never
> "part of".

You're right of course. I was playing loose with terms. So in the
terms of what is durable (for whatever definition of durable you have
decided to adopt for the cluster), atomicity is preserved with
periodic commit. I stand corrected.

> As for durability, it is true that in periodic commit log mode, durability on a
> single node is subject to a small window of time. But true, serious durability
> in the real world really only comes from replication, and that's why we use
> periodic mode for the commit log by default (and you can always switch to batch
> if you so wish). Which is not to say that Peter's statement is technically
> wrong, but if what we're doing is assessing Cassandra's durability, I'll argue
> that because it does replication well (including across data centers) while
> still having a strong single-node durability guarantee, it has among the best
> durability stories out there (even with periodic commit log).

I agree, but with one caveat:

The operator has to be aware that if the application is also using
CL.ONE, killing/restarting a node (unless done softly by 'nodetool
drain' or disabling thrift/rpc) may result in lost writes even though
no node actually had a "real problem". What I mean is, you may decide
that hardware failures are sufficiently uncommon that you're fine
doing CL.ONE on writes for some particular application. However you
may not expect regular cluster operations like node restarts to affect
durability. In that sense, replication is subtly less effective than
what one may think as an alternative to single-node durability, for
applications writing at CL.ONE (or CL.ANY).

(The probability of actually losing writes in practice may be low,
and I have never made measurements (and even if I did, they would be
subject to random details that could change at any time, such as
timing).)

Do you agree?

--
/ Peter Schuller

Re: Cassandra ACID

Terje Marthinussen
In reply to this post by Sylvain Lebresne-3
> That being said, we do not provide isolation, which means in particular that
> reads *can* return a state where only parts of a batch update seems applied
> (and it would clearly be cool to have isolation and I'm not even saying this
> will never happen).

Out of curiosity, do you see any architectural issues that make this hard to do (given the limitations already in place for atomicity), or is it more a case of "it's just that nobody has put it high enough on their priority list yet"?

Terje 


RE: Cassandra ACID

Jeremiah Jordan
In reply to this post by AJ-2
For your Consistency case, it is actually an ALL read that is needed, not an ALL write.  An ALL read, with whatever consistency level of write you need (to support machines dying), is the only way to get consistent results in the face of a failed write which was at > ONE and went to one node but not the others.
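
A toy Python enumeration of that scenario, assuming three replicas, a failed
QUORUM write that reached only one of them, and a coordinator that returns
the highest-timestamped value among the replicas it contacted. Read repair
and hinted handoff are deliberately left out of the model, and the names are
made up for illustration.

```python
from itertools import combinations

# Each replica holds (value, timestamp). The write of "new" failed after
# reaching only replica 0.
replicas = [("new", 2), ("old", 1), ("old", 1)]


def read(replica_ids):
    # Return the value with the highest timestamp among contacted replicas.
    return max((replicas[i] for i in replica_ids), key=lambda vt: vt[1])[0]


# Every possible pair of replicas a QUORUM read could contact, versus the
# single possibility for an ALL read.
quorum_results = {read(ids) for ids in combinations(range(3), 2)}
all_results = {read(ids) for ids in combinations(range(3), 3)}

print(sorted(quorum_results))  # ['new', 'old']
print(sorted(all_results))     # ['new']
```

The price is availability: an ALL read needs every replica to respond.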



Re: Cassandra ACID

AJ-2
On 6/30/2011 1:57 PM, Jeremiah Jordan wrote:
> For your Consistency case, it is actually an ALL read that is needed, not an ALL write.  An ALL read, with whatever consistency level of write you need (to support machines dying), is the only way to get consistent results in the face of a failed write which was at > ONE and went to one node but not the others.


True, an ALL read is the best and final test for consistency for that read.  I think an ALL write is more of a preemptive measure.  If you know you'll be needing consistency later, better to get it in while you can.  But, this leads to a whole other set of complex topics.  I like the flexibility, however.

Atomicity
All individual writes are atomic at the row level.  So, a batch mutate for one specific key will apply updates to all the columns for that one specific row atomically.  If part of the single-key batch update fails, then all of the updates will be reverted since they all pertained to one key/row.  Notice, I said 'reverted' not 'rolled back'.  Note: atomicity and isolation are related to the topic of transactions but one does not imply the other.  Even though row updates are atomic, they are not isolated from other users' updates or reads.   
Refs: http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic

Consistency
Cassandra does not provide the same scope of Consistency as defined in the ACID standard.  Consistency in C* does not include referential integrity since C* is not a relational database.  Any referential integrity required would have to be handled by the client.  Also, even though the official docs say that QUORUM writes/reads are the minimal consistency_level setting to guarantee full consistency, this assumes that the write preceding the read does not fail (see comments below).  What to do in this case is not fully understood by this author.
Refs: http://wiki.apache.org/cassandra/ArchitectureOverview

Isolation
NOTHING is isolated; because there is no transaction support in the first place.  This means that two or more clients can update the same row at the same time.  Their updates of the same or different columns may be interleaved and leave the row in a state that may not make sense depending on your application.  Note: this doesn't mean to say that two updates of the same column will be corrupted, obviously; columns are the smallest atomic unit ('atomic' in the more general thread-safe context).
Refs: None that directly address this explicitly and clearly and in one place.

Durability
Updates are made highly durable at the level comparable to a DBMS by the use of the commit log.  However, this requires "commitlog_sync: batch" in cassandra.yaml.  For "some" performance improvement with "some" cost in durability you can specify "commitlog_sync: periodic".  See discussion below for more details.
Refs: Plenty + this thread.


