A database is a collection of structured data stored in an organized manner. Using SQL, we can create tables, insert data into tables, modify the records, and delete the existing data from the tables. The need to retrieve the information in a proper format for processing sometimes requires you to rotate rows into columns.
Transposing a table means converting the rows into columns and vice versa. Suppose we have a table containing the number of projects launched each quarter in the years 2021 and 2022.
id | Year | Quarter | ProjectCount |
1 | Y21 | Q1 | 45 |
2 | Y21 | Q2 | 60 |
3 | Y21 | Q3 | 42 |
4 | Y21 | Q4 | 72 |
5 | Y22 | Q1 | 62 |
6 | Y22 | Q2 | 55 |
7 | Y22 | Q3 | 58 |
8 | Y22 | Q4 | 71 |
The issue is that its output view doesn’t allow us to read and differentiate it properly. In this specific example, we will transpose rows into columns. Let’s see how transposing rows into columns make the difference here.
Suppose we want to see the number of projects launched each quarter in 2021 and 2022. The table will look like the following:
Year | Q1 | Q2 | Q3 | Q4 |
Y21 | 45 | 60 | 42 | 72 |
Y22 | 62 | 55 | 58 | 71 |
Let’s add the data given in the “Projects launched each quarter” table to the database as follows:
CREATE TABLE Projects (ID INT AUTO_INCREMENT PRIMARY KEY,Year VARCHAR(10),Quarter VARCHAR(5),ProjectCount INT);INSERT INTO ProjectsVALUES(1, 'Y21', 'Q1', 45),(2, 'Y21', 'Q2', 60),(3, 'Y21', 'Q3', 42),(4, 'Y21', 'Q4', 72),(5, 'Y22', 'Q1', 62),(6, 'Y22', 'Q2', 55),(7, 'Y22', 'Q3', 58),(8, 'Y22', 'Q4', 71);
We can view all the records using the SELECT
statement.
SELECT * FROM Projects;
There are multiple ways to pivot the data in SQL. For this specific case, we use SQL to get quarters as columns and the count of projects for each quarter as rows. Let’s look at the following methods of rotating the rows into columns.
CASE
expressionWe can use the aggregate functions with the CASE
expression for each quarter. The CASE
expression checks the value of the Quarter
column, and if it matches the specified quarter (e.g., Q1
), it returns the corresponding ProjectCount
value; otherwise, it returns NULL
.
We’re using the MAX
function here, that then selects the non-NULL value for each quarter, effectively transposing the rows into columns.
-- Using CASE expression to generate the desired tableSELECTYear,MAX(CASE WHEN Quarter = 'Q1' THEN ProjectCount END) AS Q1,MAX(CASE WHEN Quarter = 'Q2' THEN ProjectCount END) AS Q2,MAX(CASE WHEN Quarter = 'Q3' THEN ProjectCount END) AS Q3,MAX(CASE WHEN Quarter = 'Q4' THEN ProjectCount END) AS Q4FROM ProjectsGROUP BY Year;
JOIN
clauseWhen dealing with data that requires multiple levels of pivoting, we can use JOIN
to join a table with itself based on a common column. In the context of transposing data, a self-join can be used to create a new table structure that organizes the data in a pivot-like format.
-- Using JOIN to generate the desired tableSELECTP1.Year,P1.ProjectCount AS Q1,P2.ProjectCount AS Q2,P3.ProjectCount AS Q3,P4.ProjectCount AS Q4FROM Projects P1JOIN Projects P2 ON P1.Year = P2.YearJOIN Projects P3 ON P1.Year = P3.YearJOIN Projects P4 ON P1.Year = P4.YearWHERE P1.Quarter = 'Q1'AND P2.Quarter = 'Q2'AND P3.Quarter = 'Q3'AND P4.Quarter = 'Q4';
Dynamic queries are SQL statements that are constructed and executed based on the data present in the database. They allow us to pivot data and transform rows into columns.
Dynamic query allows us to use the functions like GROUP_CONCAT
and CONCAT
to create the necessary SQL statement. The PREPARE
statement is then used to prepare the dynamic SQL query, and EXECUTE
is used to execute it, providing the desired result with two rows, one for each year, and quarters as columns with project counts.
Note: It’s important to use dynamic queries carefully to avoid SQL injection vulnerabilities.
-- Using Dynamic SQL to generate the desired tableSET @sql = NULL;SELECTGROUP_CONCAT(DISTINCTCONCAT('SUM(CASE WHEN Quarter = ''',Quarter,''' THEN ProjectCount END) AS ',Quarter)) INTO @sqlFROM Projects;SET @sql = CONCAT('SELECT Year, ', @sql, 'FROM ProjectsGROUP BY Year');PREPARE stmt FROM @sql;EXECUTE stmt;DEALLOCATE PREPARE stmt;
The CASE
expression is the easiest and the most common method to transpose the rows into columns. But it’s limited to transposing only a fixed number of categories.
On the other hand, while self-joins can be a powerful technique for data manipulation, they can become complex and difficult to manage as the number of pivot levels increases. In such cases, other methods, like using CASE
expressions or a combination of aggregate functions and GROUP BY
clauses, may be more straightforward and easier to maintain.
Compared to the other two techniques, the dynamic query handles more categories and can also handle unknown data.
Free Resources