The APPROX_COUNT_DISTICNT
function in SQL Server 2019 returns the approximate count of distinct non-null values in a group.
The APPROX_COUNT_DISTINCT
function takes in an expression of any type except the following:
The APPROX_COUNT_DISTINCT
function returns a bigint value.
APPROX_COUNT_DISTINCT
function implementation can provide up to a 2% error rate within a 97% probability.APPROX_COUNT_DISTINCT
function, the COUNT_DISTINCT
function is more precise, requires more memory, and is more likely to spill memory to disk.The following demonstrates how to use the APPROX_COUNT_DISTINCT
using a small database. The function can be implemented on big data scenarios in a similar way.
CREATE TABLE orders(x int, y int);INSERT INTO orders VALUES(1, 25);INSERT INTO orders VALUES(2, 25);INSERT INTO orders VALUES(3, 25);INSERT INTO orders VALUES(4, 30);INSERT INTO orders VALUES(5, 30);INSERT INTO orders VALUES(6, 50);INSERT INTO orders VALUES(7, 60);SELECT y,APPROX_COUNT_DISTINCT(x) AS Approx_Distinct_keyFROM ordersGROUP BY y;
y Approx_Distinct_Key
-----------------------
25 3
30 2
50 1
60 1
orders
with two columns, x
and y
, and fill it with values.GROUP BY
method divides the values in x
into four categories because there are four distinct values in y
.APPROX_COUNT_DISTINCT
method calculates the number of distinct values in each category.Free Resources