How To Choose A NoSQL Database?

NoSQL databases are now part of web-scale architecture. The question is when to use which one. Below, I compare the NoSQL data stores that I have worked with. Hopefully, this will be useful for programmers exploring and deciding on the technology for their web-scale application.



When to use NoSQL
Before deciding to use NoSQL instead of a SQL technology, you should ask yourself the following questions about your use case (this includes an ACID test of your application):
  1. Transactions vs no transactions (Do you need atomicity?)
    [Most NoSQL databases don’t support transactions]
  2. Consistent or eventually consistent (Are you okay with eventual consistency?)
    [Most support a configurable consistency mode. You should test your scale with the consistency mode your application requires. For example, a performance test run in “eventual consistency” mode tells you little if you then decide to use strong consistency for your application.]
  3. Vertical vs horizontal scaling (What’s your scale? Does your use case need infinite scale, or are your needs finite?)
    [This sometimes boils down to what stage of business you are in. Don’t over-engineer if you are an early-stage startup and growing < 5x a month. Postpone and focus on business growth]
  4. Availability (No downtime? Hot failover?)
    [Some NoSQL DBs support hot failovers, some do not. More below.]
  5. Do you really need a NoSQL DB? Why doesn’t an RDBMS work for you?
    [Don’t use NoSQL just for the heck of it]
So, once you have decided to go for a NoSQL data store, the next question should be: key-value or document-oriented?

Key-Value vs Document-oriented

Key-value stores: If you have a clearly defined data structure such that all the data has exactly one key, go for a key-value store. It’s like having a big hashtable, and people mostly use it for cache stores or clearly key-based data. However, things start getting a little nasty when you need to query the same data on the basis of multiple keys!
Some key-value stores are: Memcache, Redis, Aerospike.
Two important things about designing your data model around a key-value store are:
  1. You need to know all use cases in advance, and you cannot change the queryable fields in your data without a redesign.
  2. Remember, if you are going to maintain multiple keys around the same data in a key-value store, updates to multiple tables/buckets/collections/whatever are NOT atomic. You need to deal with this yourself.
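To make the second point concrete, here is a minimal sketch that uses plain Python dicts as a stand-in for two buckets in a key-value store. The bucket names, the `save_user` and `find_by_email` helpers, and the sample data are all hypothetical; the only point is that a lookup by a second key means a second write that the store does not tie to the first.

```python
# Hypothetical in-memory stand-in for a key-value store: one dict per bucket.
users_by_id = {}
user_id_by_email = {}


def save_user(user_id, email, name):
    """Write the record and its email lookup key as two separate puts."""
    old = users_by_id.get(user_id)
    users_by_id[user_id] = {"email": email, "name": name}  # write 1
    # A crash between write 1 and write 2 would leave the email lookup stale;
    # nothing makes these two puts atomic.
    if old is not None and old["email"] != email:
        user_id_by_email.pop(old["email"], None)
    user_id_by_email[email] = user_id  # write 2


def find_by_email(email):
    """Two lookups: email -> id, then id -> record."""
    user_id = user_id_by_email.get(email)
    return users_by_id.get(user_id) if user_id is not None else None


save_user(1, "a@example.com", "Asha")
save_user(1, "asha@example.com", "Asha")  # an email change touches both buckets
print(find_by_email("asha@example.com"))
print(find_by_email("a@example.com"))
```

Every new queryable field needs another bucket like `user_id_by_email`, which is why the query patterns have to be known up front.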
Document-oriented: If you are just moving away from an RDBMS and want to keep your data in an object-like form, as close to a table-like structure as possible, a document store is the way to go! It is particularly useful when you are creating an app and don’t want to deal with RDBMS table design early on (in the prototyping stage), and when your schema could change drastically over time. However, note:
  1. Secondary indexes may not perform as well.
  2. Transactions may not be available.
Popular document-oriented databases are: MongoDB, Couchbase.
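As a rough illustration of the document model, the sketch below keeps documents as plain Python dicts whose fields differ from one document to the next, queries them by an arbitrary field, and builds a simple secondary index. The `products` collection and the `find` and `build_index` helpers are made up for this example and are not the API of MongoDB or Couchbase.

```python
from collections import defaultdict

# Documents in one hypothetical collection; fields may differ per document.
products = [
    {"_id": 1, "name": "kettle", "color": "red"},
    {"_id": 2, "name": "lamp", "color": "red", "watts": 40},
    {"_id": 3, "name": "desk", "material": "oak"},
]


def find(collection, **criteria):
    """Full scan: return documents whose fields match every criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def build_index(collection, field):
    """Secondary index: field value -> list of matching _id values."""
    index = defaultdict(list)
    for doc in collection:
        if field in doc:
            index[doc[field]].append(doc["_id"])
    return index


color_index = build_index(products, "color")
print(find(products, color="red"))  # scan, works for any field
print(color_index["red"])           # index lookup, only for the indexed field
```

Unlike the key-value case, a new query field does not force a redesign: the scan already works, and an index can be added later when the scan gets too slow.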

In-memory vs disk persistence (cache or data store)?

Another key concern when choosing a data store is whether you are using it as the data store of your application or as a cache over your data store to scale for your traffic needs.
Once you have decided which kind of use case you have, here are some of the popular NoSQL stores you could use:

Comparing Key-value NoSQL databases

Memcache:
  • In-memory cache
  • No persistence
  • TTL supported
  • Client-side clustering only (the client spreads values across multiple nodes). Horizontally scalable through the client.
  • Not good for large-size values/documents
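Client-side clustering can be sketched roughly as follows: the client hashes each key and picks a node itself, and the servers never coordinate. The node addresses and the `node_for` helper are hypothetical, and simple modulo hashing is used here only for brevity.

```python
import hashlib

# Hypothetical node addresses; a real client would hold connections to them.
NODES = ["cache-1:11211", "cache-2:11211", "cache-3:11211"]


def node_for(key, nodes=NODES):
    """Pick a node by hashing the key; the same key always maps to the same node."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


for key in ("user:42", "session:abc", "cart:7"):
    print(key, "->", node_for(key))
```

With modulo hashing, adding or removing a node remaps most keys, so client libraries often use consistent hashing instead.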
Redis:
  • In-memory cache
  • Disk supported – backup and rebuild from disk
  • TTL supported
  • Super-fast
  • Data structure support in addition to key-value
  • Clustering support is not mature enough yet. Vertically scalable.
  • Horizontal scaling could be tricky.
Aerospike:
  • Both in-memory & on-disk
  • Extremely fast (could support >1 Million TPS on a single node)
  • Horizontally scalable. Server side clustering. Sharded & replicated data
  • Automatic failovers
  • Supports Secondary indexes.
  • CAS, TTL support
  • Enterprise class
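For readers new to the terms, CAS (check-and-set) and TTL (time to live) can be illustrated with a tiny in-memory store. This is only a sketch under my own assumptions: `TinyStore` and its version counter are hypothetical and do not reflect the Aerospike or Memcached client API.

```python
import time


class TinyStore:
    """Hypothetical in-memory store illustrating TTL and check-and-set (CAS)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, version, expires_at or None)

    def get(self, key):
        """Return (value, version), or (None, 0) if missing or expired."""
        entry = self._data.get(key)
        if entry is None:
            return None, 0
        value, version, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazy expiry on read
            return None, 0
        return value, version

    def cas(self, key, value, expected_version, ttl=None):
        """Write only if the stored version still matches expected_version."""
        _, current = self.get(key)
        if current != expected_version:
            return False  # someone else wrote first; the caller should re-read
        expires_at = self._clock() + ttl if ttl is not None else None
        self._data[key] = (value, current + 1, expires_at)
        return True


store = TinyStore()
store.cas("counter", 1, expected_version=0)      # first write, no prior version
value, version = store.get("counter")
print(store.cas("counter", value + 1, version))  # version matches
print(store.cas("counter", 99, version))         # version is now stale
```

The second `cas` call with the old version is rejected, which is how a client detects that another writer got in between its read and its write.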

When to use what? Memcache vs Redis vs Aerospike

If I were an early-stage startup, I would prefer to go with Redis and avoid the nuances of maintaining a cluster. If I had scaled above half a million TPS (transactions per second) and needed to scale horizontally, I would go for Aerospike. I would use Memcache (memcached) only when I am being really lean and want to offload even the maintenance of servers, in which case I would go for a hosted version of Memcached such as Amazon ElastiCache.
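The rule of thumb above can be written down as a small function. This is just my preference encoded as code, not a benchmark result: the function name, the `want_hosted` flag, and the order in which the rules are checked are assumptions made for this sketch; only the half-million TPS threshold comes from the paragraph above.

```python
def pick_key_value_store(peak_tps, want_hosted=False):
    """Encode the rule of thumb; the rule order is an assumption of this sketch."""
    if want_hosted:
        # Offload server maintenance entirely.
        return "hosted Memcached"
    if peak_tps > 500_000:
        # Past half a million TPS, horizontal scaling is needed.
        return "Aerospike"
    # Early stage: avoid the overhead of running a cluster.
    return "Redis"


print(pick_key_value_store(2_000))
print(pick_key_value_store(750_000))
print(pick_key_value_store(2_000, want_hosted=True))
```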

Comparing document-oriented NoSQL databases

MongoDB:
  • Fast
  • Mature & stable – feature rich
  • Supports failovers
  • Horizontally scalable reads – read from replica/secondary
  • Writes are not horizontally scalable unless you use MongoDB sharding
  • Supports advanced querying
  • Supports multiple secondary indexes
  • Sharded architecture becomes tricky and does not scale beyond a point when you need secondary indexes. An elementary sharded deployment needs 9 nodes at minimum.
  • Document-level locks are a problem if you have a very high write-rate
Couchbase Server:
  • Fast
  • Sharded cluster instead of MongoDB's master-slave setup
  • Hot failover support
  • Horizontally scalable
  • Supports secondary indexes through views
  • Steeper learning curve than MongoDB
  • Claims to be faster

When to use what? MongoDB vs Couchbase

For most de-facto use cases, I would go for MongoDB unless my write rate is extremely high (I would think again only when my writes are > 10% and I am doing more than a few thousand transactions a second). Fast prototyping, schema-less design, on-the-fly indexes, etc. make it an ideal choice for early-stage traffic.
I would consider Couchbase only when I have scaled beyond the point where write locks are becoming a problem, I have secondary indexes, and I need extremely high availability (* more on Couchbase in coming posts).

This article is referenced from:
http://www.nextbigwhat.com/how-to-choose-nosql-database-297/
