What’s the “N+1 selects problem” in ORM?

Key takeaways:
A common performance issue in ORM frameworks where retrieving a list of primary records (N) results in executing additional queries (1 for each primary record), leading to a total of N + 1 queries.
The N+1 problem can significantly slow down application performance by increasing database workload, incurring network latency, and reducing scalability as the number of records grows.
Retrieving users and their posts demonstrates the problem: 1 query to get all users plus multiple queries for each user’s posts can result in 11 total queries for 10 users.
Several strategies that can help mitigate the N+1 selects problem are:
Eager loading: Fetch all related data in a single query to avoid multiple additional queries.
Join fetching: Use SQL joins to retrieve primary and related data simultaneously.
Caching: Store frequently accessed data to reduce redundant database queries.
DTO projections: Use data transfer objects to minimize data transfer and avoid unnecessary joins.
Batching: Group multiple queries into a single request to reduce trips to the database.

In software development, particularly when using object-relational mapping (ORM) frameworks, the “N+1 selects problem” is a common performance issue developers encounter. Understanding this problem is crucial for optimizing database interactions and ensuring efficient application performance. Let’s explore the problem, how it occurs, and ways to mitigate its impact.

ORM is a programming technique that converts data between incompatible systems, particularly between object-oriented programming languages and relational databases. This abstraction allows developers to interact with the database using object-oriented code instead of writing raw SQL queries. While ORM frameworks simplify database interactions, they can inadvertently introduce performance issues, including the N+1 selects problem.

N+1 problem

The N+1 selects problem occurs when fetching a list of records from a database leads to executing excessive queries. Here’s how it typically unfolds:

Initial query (N): The application sends a query to retrieve a set of primary records (e.g., a list of users or products). Let’s say this results in N primary records.
Subsequent queries (+1 for each primary record): For each of these N records, an additional query is sent to fetch related data (e.g., user profiles, associated products, etc.). This results in N additional queries.

Thus, the application executes N + 1 queries—N for the primary records and 1 for each associated record.

Example of the problem

Let’s consider a simple scenario with two tables: Users and Posts. Each user can have multiple posts. If an application retrieves a list of users and, for each user, retrieves their posts, the following happens:

Step 1: Execute one query to get all users (1 query).
Step 2: For each user, execute a query to get their posts. If there are 10 users, that results in 10 additional queries.

In this case, the application executes 1 + 10 = 11 queries, leading to significant performance overhead.

Why is this a problem?

The N+1 selects problem can severely degrade application performance due to:

More queries mean more work for the database, slowing down response times and straining resources.
Each query incurs network latency. Multiple small queries lead to longer wait times for users.
As the number of records increases, performance degradation becomes more pronounced, making the application less scalable.

Solutions to the N+1 selects problem

Fortunately, several strategies can help mitigate the N+1 selects problem:

Eager loading: As demonstrated in the code below, eager loading fetches all necessary related data in a single query. By including the Post model in the user query, we eliminate the need for additional queries for each user’s posts.

Join fetching: This technique retrieves primary and related data in one go using SQL joins, similar to eager loading, but focusing on how the SQL is structured.
Caching: Caching frequently accessed data can help reduce redundant queries. By storing query results in memory, the application can serve requests faster without hitting the database each time.
Using DTO projections: Instead of loading full entity objects, developers can define data transfer objects (DTOs) that contain only the required fields, minimizing data transfer and avoiding unnecessary joins.
Batching: Some ORM frameworks allow batching of queries, reducing the number of trips to the database by grouping multiple queries into a single request.

Example code

Here’s an example of how to retrieve users along with their posts using an ORM framework while avoiding the N+1 selects problem through eager loading:

// index.js
const { sequelize, User, Post } = require('./models');

async function initializeDatabase() {
    // Sync all models with the database
    await sequelize.sync({ force: true });

    // Create some sample users and posts
    const user1 = await User.create({ name: 'John Doe' });
    const user2 = await User.create({ name: 'Jane Doe' });

    await Post.create({ title: 'Post 1', userId: user1.id });
    await Post.create({ title: 'Post 2', userId: user1.id });
    await Post.create({ title: 'Post 3', userId: user2.id });
}

async function getUsersWithPosts() {
    const users = await User.findAll(); // N+1 selects problem
    for (const user of users) {
        const posts = await Post.findAll({ where: { userId: user.id } });
        user.posts = posts; // Attach posts to user
    }
    return users;
}

async function getUsersWithPostsEagerLoaded() {
    const users = await User.findAll({
        include: [{ model: Post, required: false }],
    });
    return users;
}

async function main() {
    await initializeDatabase();

    // Fetch users with posts using N+1 problem method
    const usersWithPostsNPlus1 = await getUsersWithPosts();
    console.log('Users with posts (N+1 selects):', JSON.stringify(usersWithPostsNPlus1, null, 2));

    // Fetch users with posts using eager loading
    const usersWithPostsEager = await getUsersWithPostsEagerLoaded();
    console.log('Users with posts (eager loading):', JSON.stringify(usersWithPostsEager, null, 2));
}

main().catch((error) => {
    console.error('Error:', error);
}).finally(() => {
    sequelize.close(); // Close the database connection
});

Solution of the N+1 problem using eager loading

Explanation

In the index.js file above:

Lines 17–24: The getUsersWithPosts function retrieves users along with their posts but is prone to the N+1 selects problem. It starts by asynchronously fetching all users using User.findAll(). Then, for each user, it executes another asynchronous query to fetch their posts with Post.findAll({ where: { userId: user.id } }). This results in one query to fetch users and one additional query for each user to fetch their posts, leading to potentially many queries being executed. Finally, the function attaches the retrieved posts to each user object and returns the complete list of users with their respective posts.
Lines 26–31: The getUsersWithPostsEagerLoaded function is defined as an asynchronous function, utilizing the async keyword for handling asynchronous operations. It calls User.findAll() to fetch all users from the database. To implement eager loading, the include option is used to load the associated posts, and the parameter required: false ensures that users without any posts are still included in the result. Ultimately, the function returns a list of users along with their respective posts.

Note: You should see output in the console showing users with their posts for both methods (N+1 selects and eager loading).

Quiz

Let's have a quiz to review what you have learned so far.

Frequently asked questions

Haven’t found what you were looking for? Contact Us

What is the N+1 selects problem in RESTful?

The N+1 problem in RESTful APIs occurs when a request retrieves a collection of items (N) and then, for each item, makes an additional request to retrieve related data (1). This results in many database queries (1 for the collection + N for each item), leading to inefficient data retrieval and performance issues.

What is the N+ 1 problem in GraphQL?

In GraphQL, the N+1 problem arises when a single query requests a collection of items, and for each item, the resolver fetches related data separately. This can lead to multiple database calls for fetching relationships, resulting in similar performance inefficiencies as in REST, where the initial query retrieves N items and the subsequent queries retrieve related data, causing N additional queries.

Free AI Mock Interviews

Coding Interview

Coding PatternsFree Interview

Gain insights and practical experience with coding patterns through targeted MCQs and coding problems, designed to match and challenge your expertise level.

System Design

YouTubeFree Interview

Learn to design a video streaming platform like YouTube by tackling functional and non-functional requirements, core components, and high-level to detailed design challenges.

Free Resources