Choosing the right database in a System Design interview

Key takeaways

  • Database selection is dependent on data structure and consistency requirements. The choice between SQL and NoSQL databases hinges on the nature of the data (structured, semi-structured, or unstructured) and the level of consistency required (strong or eventual).

  • SQL vs. NoSQL databases offer different trade-offs. SQL databases excel in complex queries and strong consistency, while NoSQL databases are better suited for high write throughput and eventual consistency.

  • Scalability is a crucial factor. Both vertical and horizontal scaling are options, but NoSQL databases are generally more scalable due to their distributed nature, making them suitable for applications with rapid growth.

Imagine this: You’re in the hot seat of a System Design interview, and the challenge is to choose a database for managing order-related data in an e-commerce system. Your data is neatly structured and demands high consistency, but it doesn’t entirely fit the mold of a standard relational database. You need transactions to be isolated, atomic, and to maintain all the ACID properties, yet your system has to scale like a powerhouse.

How do you decide which storage solution fits best? Let’s explore the mechanisms you can use to decide on a database.

The database choices

In any application, we have two database choices for the primary data: SQL and NoSQL. NoSQL has different types of databases, such as columnar, graph, key-value, and document databases. The choice between SQL and NoSQL depends on various factors, as discussed below.

With the recent advent of AI and GenAI, there has been a rise in the popularity of vector databases. These databases store vectors instead of other data types and are often used for things like semantic search.

Key considerations for database selection

In an interview, choosing a database may not be straightforward. By considering all relevant factors, we can be confident in our choice and justify it accordingly.

Data structure

Understanding the nature of your data is the first step in database selection.

  • Structured data: Typically organized in tables with predefined schemas, structured data is best suited for relational databases like MySQL or PostgreSQL. These databases excel in scenarios where data integrity and complex querying are paramount.

  • Semi-structured data: This type of data, which includes JSON or XML formats, can be effectively managed by document-oriented databases like MongoDB or Couchbase.

  • Unstructured data: For data that lacks a fixed structure, such as multimedia files or large text documents, NoSQL databases or object storage solutions (blob storesA blob store is a database that stores data in binary format. It is commonly used for media files such as photos, videos, and audio data. ) may be more appropriate.

SQL vs. NoSQL databases
SQL vs. NoSQL databases

ACID vs. BASE

Understand the importance of ACID (Atomicity, Consistency, Isolation, Durability) properties for transactions. If strong consistency is required, a relational database is preferable. For eventual consistency, NoSQL databases like Cassandra or DynamoDB may be more appropriate as they conform to BASE (Basically Available, Soft State, and Eventual Consistency) properties.

Query patterns

Understanding how your application will interact with the database is vital:

  • Complex queries: If your application requires complex joinsThese are queries that join two or more tables to retrieve results. and aggregationsThese are queries that perform some aggregation function, such as sum, count, etc., a relational database is typically the best choice.

  • High write throughput: For applications that need to handle a large volume of write operations, such as logging systems or real-time analytics, NoSQL databases like Cassandra or DynamoDB can provide the necessary performance.

Here is a comparison between some of the common databases:

Database

Use Cases

Strengths

Weaknesses

Relational

Banking applications, CRM systems, etc.

  • Strong consistency

  • Complex queries

Limited scalability

Document

Content management systems (CMS), social media platforms, etc.

  • Flexible schema

  • Easy to scale

Eventual consistency

Columnar

Data warehousing, IoT, etc.

  • Efficient for analytics

  • High write throughput 

Complexity in queries

Key-value

Caching, URL shortener, etc.

  • Simple data model

  • High performance 

Limited querying capabilities

Graph

Social networks, recommendation systems, etc.

  • Relationship visualization

  • Efficient for traversing connections

Less mature tooling compared to relational databases

Point to ponder

Question

Is one database enough?

Show Answer

Scaling data

Scalability is a key factor in database selection:

  • Vertical scaling: Relational databases can be vertically scaled by adding more resources to a single server, but this approach has its limits.

  • Horizontal scaling: NoSQL databases are designed to scale out by adding more servers, making them ideal for applications that expect rapid growth. For example, Google Bigtable and Amazon DynamoDB are built to handle massive datasets across distributed systems.

Vertical vs. horizontal scaling
Vertical vs. horizontal scaling

Common use cases

Specific application requirements can guide your database choice. For example:

  • E-commerce platforms: A relational database can manage user accounts and transactions, while a NoSQL database can handle product catalogs and user-generated reviews.

  • Social media applications: A combination of databases may be necessary—using a relational database for user profiles and a document store for posts and comments.

By considering these factors and utilizing the suggested illustrations and notes, candidates can effectively communicate their database choices during System Design interviews. This not only showcases their technical knowledge but also their ability to align database selection with business requirements and application goals. Ultimately, a well-informed database choice can lead to a more robust and scalable system architecture.

Test your knowledge!

Q

You are tasked with creating a customer relationship management (CRM) system for a large enterprise. The system requires complex queries, transactions, and strong consistency to manage customer data, sales records, and interactions. Which database type would be the most appropriate for this scenario?

A)

SQL

B)

NoSQL

C)

Vector

D)

All three

Frequently asked questions

Haven’t found what you were looking for? Contact Us


What is NewSQL, and when is it ideal choice?

NewSQL is a class of relational database management systems (RDBMS) designed to offer the best of both worlds: the scalability of NoSQL databases and the ACID (Atomicity, Consistency, Isolation, Durability) property is the same as that of traditional SQL databases.


Which database should you use in a System Design interview?

While there may not be a direct answer, you should be prepared to justify your database choice. Be confident in your explanation and demonstrate your understanding of the trade-offs between SQL and NoSQL databases.


What is sharding, and when should I mention it?

Sharding is a database strategy that divides large datasets into smaller, more manageable chunks called shards. These shards are distributed across multiple servers, which helps improve scalability and performance.

Sharding is ideal for:

  • Massive datasets
  • High read/write load
  • Complex data models
  • Global distribution

Free Resources

Copyright ©2025 Educative, Inc. All rights reserved