The design of Apache Cassandra

Note: We have to compromise on strong consistency to achieve very high availability and low latency. The PACELC theorem describes two trade-offs: availability versus consistency when the network is partitioned, and latency versus consistency otherwise. Cassandra provides tunable consistency levels, which allow its clients to strike the required balance between consistency and availability.

Design – data model

To understand the design of Cassandra, we will take the example of an online bookstore built on top of Cassandra.

Cassandra uses tables to store data. Each row in a table follows a schema that defines the structure of the row (the columns and their types). The Cassandra user defines a list of schemas, and there is one table per schema. The rows that belong to a schema form a column family, which we call the table. At the highest level, tables are grouped into keyspaces. A user can create a keyspace, define the schemas, and provide the replication factor and strategy, as shown in the following illustration.

The partition key is used to split the table into different sets of rows, where each set is called a partition. Partitioning helps achieve scalability.
The clustering key is used to sort the data within a partition, which helps in fast retrieval of the data, as illustrated below.
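
The roles of the two keys can be sketched in a few lines of Python. This is only an in-memory illustration, not how Cassandra stores data: the sample rows, the `build_partitions` helper, and the choice of Book ID as the partition key with Language as the clustering key are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical bookstore rows: (book_id, language, quantity).
rows = [
    (273, "Spanish", 4),
    (273, "English", 12),
    (101, "French", 7),
    (273, "German", 0),
    (101, "English", 3),
]


def build_partitions(rows):
    """Group rows by partition key (book_id), then sort each group
    by clustering key (language)."""
    partitions = defaultdict(list)
    for book_id, language, quantity in rows:
        partitions[book_id].append((language, quantity))
    for partition in partitions.values():
        partition.sort(key=lambda row: row[0])
    return dict(partitions)


partitions = build_partitions(rows)
for book_id, partition in sorted(partitions.items()):
    print(book_id, partition)
```

Each dictionary entry stands for one partition, and the rows inside it are kept in clustering-key order.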

Partitioning

Cassandra uses a consistent hashing partitioning technique for horizontal data partitioning. We illustrate it with an example of how Cassandra distributes bookstore data on different nodes in the cluster.

The number of data partitions depends on the number of nodes in the ring, including virtual nodes. In the illustration below, the bookstore's data is split into four partitions because there are four nodes in the cluster.

As in consistent hashing, nodes are arranged in a ring of length L, where L is an integer. Each node is assigned a key within the range of the ring. Let’s assume Node 1 is assigned a key m, Node 2 a key n, Node 3 a key o, and Node 4 a key p. All keys are integers. The ring's range does not have to start from zero, which is why we define the range as [start–end].

When the administrator of the bookstore inserts a book, a hash is computed from the partition key (Book ID), which identifies the node responsible for storing that book. In the above image, hash(273) mod Length_of_the_ring is assumed to be greater than p and less than or equal to m, which is why the book with ID 273 is stored on Node 1.

Node 1 stores all of the entries whose hash lies between p and m. Here, m is inclusive.
Node 2 stores all of the entries whose hash lies between m and n. Here, n is inclusive.
Node 3 stores all of the entries whose hash lies between n and o. Here, o is inclusive.
Node 4 stores all of the entries whose hash lies between o and p. Here, p is inclusive.
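
The lookup described above can be sketched as a binary search over the sorted node keys. This is a minimal sketch under stated assumptions: the ring length, the concrete values chosen for m, n, o, and p, the helper names, and the use of MD5 as a stand-in hash are all picked for illustration and are not taken from Cassandra itself.

```python
import bisect
import hashlib

RING_LENGTH = 500               # assumed ring length L
TOKENS = [100, 200, 300, 400]   # assumed keys m, n, o, p (sorted ascending)
NODES = ["Node 1", "Node 2", "Node 3", "Node 4"]


def ring_position(partition_key):
    """Map a partition key onto the ring (MD5 is a stand-in hash)."""
    digest = hashlib.md5(str(partition_key).encode()).digest()
    return int.from_bytes(digest, "big") % RING_LENGTH


def owner(position, tokens=TOKENS, nodes=NODES):
    """Return the node whose range (previous key, own key] holds position.

    bisect_left finds the first key >= position, which makes the upper
    bound inclusive. Positions past the last key wrap to the first node.
    """
    index = bisect.bisect_left(tokens, position)
    return nodes[index % len(nodes)]


print(owner(100))                 # exactly m: stored on Node 1
print(owner(101))                 # just past m: stored on Node 2
print(owner(450))                 # past p: wraps around to Node 1
print(owner(ring_position(273)))  # node for the book with ID 273
```

With these assumed keys, Node 1 covers both the positions above p and the positions up to and including m, which matches the wrap-around range between p and m in the list above.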

The benefit of using a clustering key

In the above example, the Language column, as the clustering key, sorts the rows of a partition in alphabetical order of book language, which in turn helps find a book in a specific language efficiently. Alternatively, if we use the Quantity column as the clustering key, the rows of a partition are sorted in ascending or descending order of quantity, which in turn helps find books that are low in stock efficiently.
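
To see why sorted rows help, consider a partition kept in ascending order of Quantity. The sketch below uses binary search to find low-stock rows without scanning the whole partition. The sample partition and the `low_stock` helper are hypothetical, and the code illustrates the idea rather than Cassandra's storage engine.

```python
import bisect

# One hypothetical partition, sorted by the clustering key (quantity).
# Each row is (quantity, language).
partition = [
    (0, "German"),
    (2, "Italian"),
    (4, "Spanish"),
    (12, "English"),
]


def low_stock(partition, threshold):
    """Return the rows whose quantity is strictly below threshold.

    The one-element tuple (threshold,) sorts before every row with that
    same quantity, so bisect_left lands on the first row whose quantity
    is >= threshold. Everything before that index is low in stock.
    """
    cut = bisect.bisect_left(partition, (threshold,))
    return partition[:cut]


print(low_stock(partition, 5))
```

Because the rows are already ordered, the search takes a logarithmic number of comparisons, and the matching rows form one contiguous slice.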

Replication

The replication factor (RF) of a keyspace determines how many copies of each partition are stored, and the replication strategy determines which nodes hold them: the node that owns the partition plus RF-1 other nodes chosen from the cluster. Replication helps achieve availability.
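
One simple way to choose the RF-1 additional nodes is to walk clockwise around the ring from the owner. The sketch below assumes that placement rule, which is similar in spirit to a topology-unaware strategy. The node keys and the `replicas` helper are hypothetical, and real strategies may also take racks and data centers into account.

```python
import bisect

TOKENS = [100, 200, 300, 400]   # assumed node keys, sorted ascending
NODES = ["Node 1", "Node 2", "Node 3", "Node 4"]


def replicas(position, replication_factor, tokens=TOKENS, nodes=NODES):
    """Return the owner of position followed by the next
    replication_factor - 1 nodes clockwise on the ring."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication factor must be between 1 and the node count")
    start = bisect.bisect_left(tokens, position) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


print(replicas(150, 3))  # owner Node 2, then Node 3 and Node 4
print(replicas(350, 3))  # owner Node 4, then wraps to Node 1 and Node 2
```

With RF set to 3 on a four-node ring, each partition remains readable from two other nodes if its owner becomes unavailable.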

In conclusion, Apache Cassandra’s design aligns with its goals of high availability, scalability, and performance. It scales linearly by partitioning data horizontally across cluster nodes, so nodes can be added or removed as needed. It also offers configurable replication factors and strategies to meet specific availability requirements. Together, partitioning and replication let Cassandra deliver high throughput and low latency. Its flexible data model, in which a primary key combines partition and clustering components, supports efficient data organization and retrieval for a variety of application needs.
