In this SQL query, we aim to identify employees who earn more than their respective managers. We have a table named "employees," which contains information about each employee, including their salary and the "emp_id" of their manager. The goal is to compare the salary of each employee with the salary of their manager and retrieve the records of employees who earn more than their managers.
There are two approaches to this query:
Join (self)
Subquery (correlated)
We will explore both approaches in this Answer. Let's first explore the Join approach.
The Join approach for a query to find out the employees that earn more than their managers involves combining data from the "employees" table with itself to compare the salary of each employee with the salary of their respective manager. This comparison helps identify employees who earn more than their managers.
Let's have a look at the "employees" table below.
employee_id | employee_name | salary | manager_id |
1 | John | 50000 | 2 |
2 | Alice | 60000 | 3 |
3 | Bob | 55000 | 5 |
4 | Micheal | 45000 | 2 |
5 | Sarah | 40000 | 6 |
6 | Robert | 35000 | NULL |
Let's create the table, formulate the schema and insert values in it in the code below:
CREATE TABLE employees (emp_id INT PRIMARY KEY,emp_name VARCHAR(20),salary INT,mgr_id INT);-- //inserting values in tableINSERT INTO employees (emp_id, emp_name, salary, mgr_id)VALUES(1, 'John', 50000, 2),(2, 'Alice', 60000, 3),(3, 'Bob', 55000, 5),(4, 'Micheal', 45000, 2),(5, 'Sarah', 40000, 6),(6, 'Robert', 35000, NULL);SELECT * FROM employees;
Line 1–6: In the CREATE TABLE
statement, we create a new table called "employees." The table has four columns: emp_id ,emp_name
,salary , mgr_id.
Line 9–16: In the INSERT INTO
statement, we insert multiple rows of data into the "employees" table. Each row corresponds to an employee record, and each value in the parentheses represents the data to be inserted for an employee.
Line 18: In the SELECT
statement, we retrieve data from the "employees" table. The asterisk *
is used as a shorthand to select all columns from the table.
Here's the query implementation using joins in SQL.
SELECT emp.*FROM employees empJOIN employees mgr ON emp.manager_id = mgr.employee_idWHERE emp.salary > mgr.salary;
Line 1: SELECT emp.*
selects all columns (*
) from the "employees" table. The alias emp
is used to refer to the "employees" table in the query.
Line 2: In FROM employees emp
, the FROM
clause specifies the table from which we are querying, which is the "employees" table. We assign the alias emp
to this instance of the table.
Line 3: JOIN employees mgr ON emp.manager_id = mgr.employee_id
is a JOIN
operation on the "employees" table (emp
) with itself. It links each employee to their corresponding manager using the "manager_id" column in the "employees" table (alias emp
) and the "employee_id" column in the "employees" table (alias mgr
).
Line 4: In WHERE emp.salary > mgr.salary
, the WHERE
clause filters the result set to include only those rows where the salary of the employee (emp.salary
) is greater than the salary of their respective manager (mgr.salary
).
Let's combine the codes above to view the end result of the query.
CREATE TABLE employees (emp_id INT PRIMARY KEY,emp_name VARCHAR(20),salary INT,mgr_id INT);-- Inserting values in the tableINSERT INTO employees (emp_id, emp_name, salary, mgr_id)VALUES(1, 'John', 50000, 2),(2, 'Alice', 60000, 3),(3, 'Bob', 55000, 5),(4, 'Micheal', 45000, 2),(5, 'Sarah', 40000, 6),(6, 'Robert', 35000, NULL);-- Target query--SELECT emp.*FROM employees empJOIN employees mgr ON emp.mgr_id = mgr.emp_idWHERE emp.salary > mgr.salary;
employee_id | employee_name | salary | manager_id |
2 | Alice | 60000 | 3 |
3 | Bob | 55000 | 5 |
5 | Sarah | 40000 | 6 |
Now that we have explored how to solve the query using a Join let's have a look at the other approach of using SQL subqueries.
SELECT emp.*FROM employees empWHERE emp.salary > (SELECT mgr.salaryFROM employees mgrWHERE emp.mgr_id = mgr.emp_id);
In SQL, first, the subquery is evaluated. The subquery selects the manager's salaries from the "employees" table from the manager_id= employee_id
columns. This returns the column of salaries as output. The column of salaries is then compared with the employee salaries, if an employee's salary is greater than their manager's salary, then that employee's data is recorded into the final output by the select emp. *
statement.
Now that we have understood the second approach, let's evaluate if it produces the same result as the first one. Here's the complete code to analyze the result.
CREATE TABLE employees (emp_id INT PRIMARY KEY,emp_name VARCHAR(20),salary INT,mgr_id INT);-- Inserting values in the tableINSERT INTO employees (emp_id, emp_name, salary, mgr_id)VALUES(1, 'John', 50000, 2),(2, 'Alice', 60000, 3),(3, 'Bob', 55000, 5),(4, 'Micheal', 45000, 2),(5, 'Sarah', 40000, 6),(6, 'Robert', 35000, NULL);SELECT emp.*FROM employees empWHERE emp.salary > (SELECT mgr.salaryFROM employees mgrWHERE emp.mgr_id = mgr.emp_id);
In conclusion, both the Join and subquery approaches are valuable SQL techniques, and their choice depends on the specific requirements and context of the query. In general, if you need to retrieve additional information from the joined table or if the query involves complex conditions and comparisons, the Join approach may be more efficient and readable. On the other hand, if you have a simple comparison to make or need to calculate a single value based on a subquery, the subquery approach can be more concise and straightforward.
Free Resources