This is the story of how I learned to say ‘No’ to SQL and to cope with my wife’s addiction to Coach bags.
The answer is support groups. I kid. No, I don’t.
There are a number of alternatives to relational databases. However, for the purposes of this post, I am focusing on persistent, distributed key/value stores.
It all seems to have started with Amazon’s Dynamo.
Clarification: I’m not saying that Dynamo was the first distributed key/store. I imagine most developers are familiar with memcached and products such as Oracle Coherence when it comes to caching and data grids. There is also Berkely DB, a persistent key/value store. However, Dynamo is both distributed and persistent. That, and it seems to have lead (indirectly and directly) to the development of numerous open source implementations as I’ll explain below. It appears that Dynamo and Google BigTable had a one night stand and their illegitimate children formed the NoSQL movement.
In order to understand how today’s key value stores are borrowing heavily from Dynamo, it is worth briefly going over some of its characteristics.
Consistent hashing is a means of looking up a node based on a key. The idea is that a key is generated for each node. When you do a get/put/delete, you use the key to look up the key of the closest node.
For proper descriptions of consistent hashing check out these blog posts:
CAP Theorem
The theorem states that you can only have two of these properties at a time. This is interesting because I’ve found that a number of distributed key value stores are sacrificing consistency for availability and partition tolerance.
This is where the notion of eventually consistent comes into play. The idea is that if there are no further updates, the data will eventually become consistent.
For proper descriptions of the CAP theorem check out these blog posts:
For a proper description of eventually consistent check out this blog post by Werner Vogels (CTO – Amazon).
There seem to be two types of variations.
Project Voldemort – LinkedIn
Voldemort is a key value store. It has, by far, the best documentation.
Tokyo Cabinet/Tyrant – Mixi
Tokyo Cabinet is a key value store. Tyrant is the server that provides network access to Cabinet.
While I wouldn’t consider Cabinet to be distributed, I had to add it because it is ridiculously fast.
About 1 million inserts in 0.4 seconds fast.
The general consensus is that if your data can fit onto a single server, use Tokyo Cabinet/Tyrant. If not, use something like Voldemort.
Notes
This blog post provides additional information with respect to Tokyo Cabinet/Tyrant. And here is the presentation.
CouchDB is a document store, and because of that it supports search functionality.
Notes
This blog post provides a nice comparison of the differences between CouchDB and Tokyo Cabinent/Tyrant.
Apache Cassandra – Facebook
Cassandra is really a BigTable clone. Think of it as a completely denormalized database. However, it too borrows heavily Dynamo and uses a DHT.
Notes
You can find two (useful) presentations on Cassandra here and here.
Why are distributed key/value stores gaining momentum? It is simple: performance and scalability.
I like to think of it as an exercise in simplicity. A typical key/value store only supports 3 operations:
That is it!
The problem with databases is that they are difficult to scale horizontally. One solution is sharding. One problem with sharding is that you will likely lose the ‘relational’ aspect due to the performance cost of running a query across several nodes. The other problem is the cost of rebalancing the nodes if additional sharding is needed. Ultimately you will likely end up denormalizing the data. The is the first step on the path to key/value stores.
That being said, there are still use cases where you want to use a database. One case might be search. Typical key/value stores are not going to provide search functionality. However, document stores (CouchDB) and BigTable clones (Cassandra) will. The other resolves around consistency. As mentioned before, these distributed key/value stores are relaxing on consistency in favour of availability and partition tolerance. A common example is banking transactions. I don’t think anyone wants relax on consistency when working with banking transactions.
On the other hand, here is a use case from my own experience where I think a key/value store would be more appropriate.
I used to work on an application for generating and processing insurance applications. There were two steps to this process. The first step was determining if insurance was even available to the applicant. The next step was building a dynamic application for the applicant. For the first step, we maintained a complex set of business rules in our database. Essentially we used the applicant’s zip code and business type to determine if insurance was even available. For the second step, we maintained a list of questions based on the applicant’s insurance type and the carriers available.
The SQL for these operations was quite complex and included stored procedures as a result. A better alternative might have been to denormalize the database and push both the business rules and the applications to a key/value store. That is not to say that we wouldn’t continue to use a database. We would. It is just that the database would be used to enter the business rules/applications. The key/value store would be used to retrieve the business rules/applications. Ultimately, the business rules could be collapsed to a key that is the hash of the applicant’s business type and zip code. The value could be a serialized application. Now we just call get(key) and if the value is null, then there is no insurance available. If it is not, then insurance is available and this is the serialized application. On top of that, the applicant’s can save their application and update it at a later date. So, we might as well persist the application instance to a key/value store as well.
Of course, I have simplified things quite a bit here. However, my point is that this process is not transactional in the sense that banking transactions are. The only writes are for updating the answers, and for all intensive purposes this will be done a single client (the applicant). Concurrency is not really a problem here.
Notes:
This blog post provides a nice summary of the issues surrounding sharding.
Anti-RDBMS: A list of distributed key-value stores
Performance comparison: key/value stores for language model counts
Some Notes on Distributed Key Stores
Evaluating key-value and document stores for short read data
NoSQL: If Only It Was That Easy
Building Scalable Databases: Denormalization, the NoSQL Movement and Digg
Drop ACID and Think About Data
Readings in Distributed Systems – Great List of Papers
Tags: Cassandra, CouchDB, distributed, key/value store, NoSQL, performance, Scalability, Tokyo Cabinet, Tokyo Tyrant, Voldemort
| M | T | W | T | F | S | S |
|---|---|---|---|---|---|---|
| « Oct | ||||||
| 1 | 2 | 3 | 4 | 5 | ||
| 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| 13 | 14 | 15 | 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 | 24 | 25 | 26 |
| 27 | 28 | 29 | 30 | |||
The Replicated/Partitioned Key/Value stores started with Tangosol Coherence (now Oracle Coherence). It isn’t open source, but it has the benefit of providing significantly lower latencies and higher throughput than the alternatives you mentioned. It also eschews eventual consistency in order to provide “transactional in the sense that banking transactions are,” which isn’t surprising since it’s used by all the major banks, stock exchanges, etc.
Peace,
Cameron Purdy | Oracle Coherence
Shane,
Would you consider IBM’s Lotus Domino server to be a NoSQL style document store?
It has eventual consistency (server to server and server to client replication, full-text search, is schema-less, etc.)
Thanks, this is a nice overview.
Tim
GemFire Enterprise, which is used by the largest financial instituitions and banks on the planet is a key-value data container that provides horizontal scale on commodity hardware with support for transactions, failover, multi-language support with synchronous and asynchronous replication models, within a cluster and across WANs.
GemFire manages data in-memory striping it over commodity hardware with the ability to persist it to disk using shared nothing storage
You can download evaluation versions at http://download.gemstone.com
Cheers
Suds
[...] memcached has been a topic of discussion for me recently, so “How I learned to say ‘No’ to SQL, NoSQL” piqued my interested. FTA I imagine most developers are familiar with memcached and [...]