Quantcast

Implementing locks using cassandra only

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Implementing locks using cassandra only

Thorsten von Eicken-3
I've been musing about how to implement locks using just cassandra, i.e.
without sql db or zookeeper. I wrote up what I've come up with on the
wiki (it's a bit too long for an email) at
<http://wiki.apache.org/cassandra/Locking>. I'm wondering whether I've
overlooked something, especially I'm not yet comfortable enough with the
semantics of failing cassandra write operations to figure out whether
these can cause problems, so I suspect some further iterations on the
algorithm will be necessary.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Implementing locks using cassandra only

Jonathan Ellis-3
Bakery algorithm is really interesting -- AFAIK it's the only mutex
algorithm that doesn't assume an existing atomic operation like CAS.

On Mon, Sep 13, 2010 at 11:46 AM, Thorsten von Eicken
<[hidden email]> wrote:
> I've been musing about how to implement locks using just cassandra, i.e.
> without sql db or zookeeper. I wrote up what I've come up with on the wiki
> (it's a bit too long for an email) at
> <http://wiki.apache.org/cassandra/Locking>. I'm wondering whether I've
> overlooked something, especially I'm not yet comfortable enough with the
> semantics of failing cassandra write operations to figure out whether these
> can cause problems, so I suspect some further iterations on the algorithm
> will be necessary.
>



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Implementing locks using cassandra only

aaron morton
Thanks for the post it looks really handy. I found this page to help my simple mind understand whats going on http://nob.cs.ucdavis.edu/classes/ecs150-1999-02/sync-bakery.html 

 I've got some thoughts / questions mostly about implementation...

1. Any thoughts on how it would work with a dynamic number of clients? For example say I've got a python web app, on 2 web servers each with 8 tornado web servers running which support async IO. If I only every do sync IO to cassandra there will only every be 16 clients. If a request uses async IO, it may yield and allow the web server to process another request while it's waiting using another client connection.  

Could I use a TimeUUID as the client ID and replace max_clients with just the number of columns in the Choosing CF? This may cause a problem with the loop at line 8, as it would potentially change the count. Or use a TimeUUID as the client_id and just have a max_clients set suitable high in config.

2. What about if you re-read the number_hash (as well as the choosing hash) when you detected a client was choosing at line 9? This may be a little safer, as I guess the original algorithm assumes the numbers array is shared memory. 

3. Your check at line 13 misses the comparison of the client id's as a tie breaker. Was this intentional? 

4. Cassandra 0.7 has a TTL on the column values. Perhaps this could be used in the Numbers and even the Choosing CF? Say give clients 1 second to get through their choosing section and 10 seconds to get through their lock. 

 I've not read through the distributed work queue section yet, hope to later today.

Thanks
Aaron

On 14 Sep, 2010,at 05:50 AM, Jonathan Ellis <[hidden email]> wrote:

Bakery algorithm is really interesting -- AFAIK it's the only mutex
algorithm that doesn't assume an existing atomic operation like CAS.

On Mon, Sep 13, 2010 at 11:46 AM, Thorsten von Eicken
<[hidden email]> wrote:
> I've been musing about how to implement locks using just cassandra, i.e.
> without sql db or zookeeper. I wrote up what I've come up with on the wiki
> (it's a bit too long for an email) at
> <http://wiki.apache.org/cassandra/Locking>. I'm wondering whether I've
> overlooked something, especially I'm not yet comfortable enough with the
> semantics of failing cassandra write operations to figure out whether these
> can cause problems, so I suspect some further iterations on the algorithm
> will be necessary.
>



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Implementing locks using cassandra only

aaron morton
Sorry, another thought was when storing in the Number CF how about using the the number as the column name and the client ID as the column value. That way you could do a slice to get the highest number, and the client can still assemble the hash of client_id to number. 

Aaron


On 14 Sep, 2010,at 10:46 AM, Aaron Morton <[hidden email]> wrote:

Thanks for the post it looks really handy. I found this page to help my simple mind understand whats going on http://nob.cs.ucdavis.edu/classes/ecs150-1999-02/sync-bakery.html 

 I've got some thoughts / questions mostly about implementation...


1. Any thoughts on how it would work with a dynamic number of clients? For example say I've got a python web app, on 2 web servers each with 8 tornado web servers running which support async IO. If I only every do sync IO to cassandra there will only every be 16 clients. If a request uses async IO, it may yield and allow the web server to process another request while it's waiting using another client connection.  

Could I use a TimeUUID as the client ID and replace max_clients with just the number of columns in the Choosing CF? This may cause a problem with the loop at line 8, as it would potentially change the count. Or use a TimeUUID as the client_id and just have a max_clients set suitable high in config.

2. What about if you re-read the number_hash (as well as the choosing hash) when you detected a client was choosing at line 9? This may be a little safer, as I guess the original algorithm assumes the numbers array is shared memory. 

3. Your check at line 13 misses the comparison of the client id's as a tie breaker. Was this intentional? 


4. Cassandra 0.7 has a TTL on the column values. Perhaps this could be used in the Numbers and even the Choosing CF? Say give clients 1 second to get through their choosing section and 10 seconds to get through their lock. 

 I've not read through the distributed work queue section yet, hope to later today.

Thanks
Aaron

On 14 Sep, 2010,at 05:50 AM, Jonathan Ellis <[hidden email]> wrote:

Bakery algorithm is really interesting -- AFAIK it's the only mutex
algorithm that doesn't assume an existing atomic operation like CAS.

On Mon, Sep 13, 2010 at 11:46 AM, Thorsten von Eicken
<[hidden email]> wrote:
> I've been musing about how to implement locks using just cassandra, ie.
> without sql db or zookeeper. I wrote up what I've come up with on the wiki
> (it's a bit too long for an email) at
> <http://wiki.apache.org/cassandra/Locking>. I'm wondering whether I've
> overlooked something, especially I'm not yet comfortable enough with the
> semantics of failing cassandra write operations to figure out whether these
> can cause problems, so I suspect some further iterations on the algorithm
> will be necessary.
>



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

unsubscribe

Kevin Cox
In reply to this post by Jonathan Ellis-3


Loading...