How to use APPROX_COUNT_DISTINCT in SQL 2019

The APPROX_COUNT_DISTICNT function in SQL Server 2019 returns the approximate count of distinct non-null values in a group.

Arguments and return value

The APPROX_COUNT_DISTINCT function takes in an expression of any type except the following:

  • image
  • sql_variant
  • ntext
  • text

The APPROX_COUNT_DISTINCT function returns a bigint value.

Important points about the APPROX_COUNT_DISTINCT function

  • The function is designed to work on huge data sets with millions of rows and columns with distinct values.
  • Its use is suitable with big data where aggregation with higher responsiveness than precision is needed.
  • APPROX_COUNT_DISTINCT function implementation can provide up to a 2% error rate within a 97% probability.
  • In contrast to the APPROX_COUNT_DISTINCT function, the COUNT_DISTINCT function is more precise, requires more memory, and is more likely to spill memory to disk.

Example

The following demonstrates how to use the APPROX_COUNT_DISTINCT using a small database. The function can be implemented on big data scenarios in a similar way.

CREATE TABLE orders(x int, y int);
INSERT INTO orders VALUES(1, 25);
INSERT INTO orders VALUES(2, 25);
INSERT INTO orders VALUES(3, 25);
INSERT INTO orders VALUES(4, 30);
INSERT INTO orders VALUES(5, 30);
INSERT INTO orders VALUES(6, 50);
INSERT INTO orders VALUES(7, 60);
SELECT y,APPROX_COUNT_DISTINCT(x) AS Approx_Distinct_key
FROM orders
GROUP BY y;

Output

y   Approx_Distinct_Key
-----------------------
25   3
30   2
50   1
60   1

Explanation

  • We first create a simple table orders with two columns, x and y, and fill it with values.
  • The GROUP BY method divides the values in x into four categories because there are four distinct values in y.
  • The APPROX_COUNT_DISTINCT method calculates the number of distinct values in each category.

Free Resources

Copyright ©2025 Educative, Inc. All rights reserved