A window function is used to calculate values over a set of rows. It lets you perform calculations across any number of table rows that are linked to the current row. A window in SQL refers to the set of rows on which the window function operates. Window frames specify the range of rows within a partition and they allow us to do more precise calculations. Partitions are used to group rows together within the same partition before applying the window function. A partition can be defined using the PARTITION BY
clause. The window functions can be predefined or user-defined. Window functions in SQL can be broadly categorized into the following three categories:
Ranking window functions
Analytic window functions
In SQL, analytic window functions assign values to the rows in the result set based on the values in the other rows. Analytic window functions are also called value window functions. These functions include LAG()
, LEAD()
, FIRST_VALUE()
, LAST_VALUE()
, and NTH_VALUE()
.
The window functions FIRST_VALUE()
, LAST_VALUE()
, and NTH_VALUE()
are available in various relational database management systems (RDBMS), including PostgreSQL, that support advanced SQL functionalities.
FIRST_VALUE()
: Returns the first value in the data set.
LAST_VALUE()
: Returns the last value in the data set.
NTH_VALUE()
: Returns the Nth value in the data set.
Following is the syntax for a window function:
window_function ()OVER ( [PARTITION BY clause][ORDER BY clause][window_frame clause])
The window_function()
represents the window function. We replace it with the actual window function to perform the operation.
We create partitions by defining the column/condition in PARTITION BY clause
. This clause becomes optional if there’s no need to create partitions. The window function will treat the dataset as a single window in such cases.
The ORDER BY clause
specifies the order within each partition.
The window_frame clause
is used to define the range of rows considered for the calculations.
Let’s understand each of the window functions through coding examples:
Let’s suppose we have the following data on the student’s exam results. We’re using the SELECT
query to view the records in the Exam_result
table.
SELECT * FROM Exam_result;
FIRST_VALUE()
window function in PostgreSQLThe window function FIRST_VALUE()
will return the value from the first row in the window frame based on the specified order. Following is the syntax for FIRST_VALUE()
:
FIRST_VALUE ( [expression] )OVER ( [PARTITION BY clause][ORDER BY clause] )
The FIRST_VALUE([expression])
represents the window function with the expression value.
We create partitions by defining the column/condition in PARTITION BY clause
. This clause becomes optional if there’s no need to create partitions. The window function will treat the dataset as a single window in such cases.
The ORDER BY clause
specifies the order within each partition.
We can use this function to retrieve the lowest marks in a subject. The coding example is:
SELECTstd_id,std_name,marks,subject_name,FIRST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks ASC)FROMExam_result;
Lines 1–5: We use the SELECT
statement to read specific columns from the Exam_result
table, including std_id
, std_name
, marks
, and subject_name
.
Lines 6–10: The FIRST_VALUE()
window function is used to calculate the first value of the marks
column in each partition defined by the subject_name
column. The function is applied to each row within the partition, and the result is stored in the FIRST_VALUE(marks)
column in ascending order as written in ORDER BY
clause.
Lines 11–12: The FROM clause
specifies the source table Exam_result
from which the data is being queried.
LAST_VALUE()
window function in PostgreSQLThe window function LAST_VALUE()
will return the value in the last row in the window frame based on the specified order. Following is the syntax for LAST_VALUE()
:
LAST_VALUE ( [expression] )OVER ( [PARTITION BY clause][ORDER BY clause][RANGE clause] )
The LAST_VALUE([expression])
represents the window function with the expression value.
We create partitions by defining the column/condition in PARTITION BY clause
. This is the same argument as given for FIRST_VALUE()
.
The ORDER BY clause
specifies the order within each partition.
The RANGE clause
with the syntax RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
means that the window frame for each row in the partition includes all rows from the first row to the last row of the partition, without regard to the actual values of the rows but based on their order.
The coding example is:
SELECTstd_id,std_name,marks,subject_name,LAST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks ASC)FROMExam_result;
Lines 1–5: We use the SELECT
statement to read specific columns from the Exam_result
table, including std_id
, std_name
, marks
, and subject_name
.
Lines 6–10: The LAST_VALUE()
window function is used to calculate the last value of the marks
column in each partition defined by the subject_name
column. The function is applied to each row within the partition, and the result is stored in the LAST_VALUE(marks)
column in ascending order as written in ORDER BY
clause.
Lines 11–12: The FROM clause
specifies the source table Exam_result
from which the data is being queried.
We can see that the marks
and last_value
columns output the same values. We can use this function to retrieve the highest marks in a subject, as demonstrated in the example below:
SELECTstd_id,std_name,marks,subject_name,LAST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks ASCROWS BETWEENUNBOUNDED PRECEDING ANDUNBOUNDED FOLLOWING)FROMExam_result;
Lines 1–5: We use the SELECT
statement to read specific columns from the Exam_result
table, including std_id
, std_name
, marks
, and subject_name
.
Lines 6–13: The LAST_VALUE()
window function is used to calculate the last value of the marks
column in each partition defined by the subject_name
column. The function is applied to each row within the partition, and the result is stored in the LAST_VALUE(marks)
column. The output is sorted in ascending order of marks
column. The ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
clause specifies that the window frame for each row in the partition includes all rows from the first row to the last row of the partition, without regard to the actual values of the rows but based on their order. This means that for each row, the function considers all rows in the partition for calculating the result.
Lines 14–15: The FROM clause
specifies the source table Exam_result
from which the data is being queried.
NTH_VALUE()
window function in PostgreSQLThe window function NTH_VALUE()
returns the value in the nth row in the window frame based on the specified order. Following is the syntax for NTH_VALUE()
:
NTH_VALUE ( [expression, n] )OVER ( [PARTITION BY clause][ORDER BY clause][RANGE clause] )
The FIRST_VALUE([expression])
represents the window function with the expression value.
We create partitions by defining the column/condition in PARTITION BY clause
. This is the same argument as given for FIRST_VALUE()
and LAST_VALUE()
.
The ORDER BY clause
specifies the order within each partition.
The RANGE clause
has already been explained in the LAST_VALUE()
section.
This function requires an additional argument specifying the position n
within the window frame. Let’s retrieve the second-highest marks in a subject as follows:
SELECTstd_id,std_name,marks,subject_name,NTH_VALUE(marks, 2)OVER(PARTITION BY subject_nameORDER BY marks DESCRANGE BETWEENUNBOUNDED PRECEDING ANDUNBOUNDED FOLLOWING)FROMExam_result;
Lines 1–5: We use the SELECT
statement to read specific columns from the Exam_result
table, including std_id
, std_name
, marks
, and subject_name
.
Lines 6–13: The NTH_VALUE()
window function is used to calculate the value of the marks
column in each partition defined by the subject_name
column. The function is applied to each row within the partition, and the result is stored in the NTH_VALUE(marks)
column. The output is sorted in descending order of marks
column. The ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clause
specifies that the window frame for each row in the partition includes all rows from the first row to the last row of the partition. The output changes the value of NULL
record. You may try removing this clause to see how the output looks without this clause.
Lines 14–15: The FROM clause
specifies the source table Exam_result
from which the data is being queried.
FIRST_VALUE()
, LAST_VALUE()
, and NTH_VALUE()
The window functions FIRST_VALUE()
, LAST_VALUE()
, and NTH_VALUE()
are used to retrieve specific values based on the order. Let’s use all three functions in a single query. We’re using FIRST_VALUE()
function to retrieve the highest marks, LAST_VALUE()
to retrieve the lowest marks, and NTH_VALUE()
function to retrieve the second-highest value in a subject:
SELECTstd_id,std_name,marks,subject_name,FIRST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks DESC),LAST_VALUE(marks)OVER(PARTITION BY subject_nameORDER BY marks DESC),NTH_VALUE(marks, 2)OVER(PARTITION BY subject_nameORDER BY marks DESCRANGE BETWEENUNBOUNDED PRECEDING ANDUNBOUNDED FOLLOWING)FROMExam_result;
Lines 1–5: We use the SELECT
statement to retrieve specific columns std_id
, std_name
, marks
, and subject_name
from the Exam_result
table.
Lines 6–23: The FIRST_VALUE()
, LAST_VALUE()
, and NTH_VALUE()
window functions are used to calculate certain values within each partition defined by the subject_name
column.
FIRST_VALUE(marks)
calculates the first value of the marks
column in each partition, ordered by marks
in descending order.
LAST_VALUE(marks)
calculates the last value of the marks
column in each partition, ordered by marks
in descending order.
NTH_VALUE(marks, 2)
calculates the value of the marks
column that corresponds to the second row in each partition, ordered by marks
in descending order. The RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clause
specifies that the window frame for each row in the partition includes all rows from the first row to the last row of the partition.
Lines 24–25: The FROM clause
specifies the source table Exam_result
from which the data is being queried.
The window functions FIRST_VALUE()
, LAST_VALUE()
, and NTH_VALUE()
are used for data analysis. These functions enable users to retrieve precise values from specified positions within a window frame based on ordered partitions.
Free Resources