In: Computer Science
The CAP theorem says a database cannot have all three of these features: partition (fault tolerance) and consistency and availability. We know that both MongoDB database system and Cassandra database system provide fault tolerance (partition). So how do they handle consistency and availability in their different ways?
Cassandra database:
Cassandra maintains tunable consistency. When performing a read or write operation a database client can specify a consistency level. The consistency level refers to the number of replicas that need to respond for a read or write operation to be considered complete. For reading non-critical data (the number of “likes” on a social media post, for example), it’s probably not essential to have the very latest data. You can set the consistency level to ONE and Cassandra will simply retrieve a value from the closest replica. If I’m concerned about accuracy, however, I can specify a higher consistency level, like TWO, THREE, or QUORUM. If a QUORUM (essentially a majority) of replicas reply, and if the data was written with similarly strong consistency, users can be confident that they have the latest data. If there are inconsistencies between replicas when data is read, Cassandra will internally manage a process to ensure that replicas are synchronized and contain the most recent data.
The larger the consistency level, the availability of the system
will decrease.
Example: Say we have set Consistency level to
THREE, for a Cassandra DB whose replication factor
is 3.
If one of the replicas disconnects from the cluster, both read and
write will start to fail, making the system Unavailable for both
read and write. In Summary, Cassandra is always available but once
we start tweaking it to make more consistent, we lose
availability.
MongoDB database:
By default, Mongo DB Client(MongoDB driver), sends all read/write requests to the leader/primary node.
Again this default behavior allows Mongo DB to be a consistent system but not available due to the below reasons:
So, if we use MongoDB client with its default behavior, MongoDB behaves as a Consistent system and not Available.
We could simply configure read-preference mode in MongoDB client
to read from any secondary nodes.
But, by doing so we are breaking consistency. Now, a write to
primary/leader can be successful but, secondary’s might not have
updated the latest data from primary due to any reason.
So, with this setup, you get high availability for reads but lose
consistency and inturn you get eventual consistency.