APPROX_COUNT_DISTINCT

Approximates the number of distinct non-null input values.
This function serves as an approximation of COUNT(DISTINCT ...) to improve performance. Use it when there are relatively few rows (under 1 million) and the total number of distinct values is high.

Syntax

APPROX_COUNT_DISTINCT(X)

Arguments

X

Type: any type
The values to be counted.

Returns

Type: bigint
Returns the approximate number of distinct non-null input values.
If all input values are null, zero is returned.

Examples

Transformation job example

SQL

CREATE JOB function_operator_example
    ADD_MISSING_COLUMNS = true
AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data MAP_COLUMNS_BY_NAME
    SELECT
        LOWER(ordertype) AS ordertype,
        ROUND(AVG(nettotal), 2) AS avgtotal
    FROM default_glue_catalog.upsolver_samples.orders_raw_data
    WHERE $commit_time BETWEEN run_start_time() - PARSE_DURATION('1d') AND run_end_time()
    GROUP BY 1
    LIMIT 2;

Query result

ordertype    avgtotal
pickup       1062.14
shipping     1061.91
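
As a sketch of how APPROX_COUNT_DISTINCT itself could appear in a job of the same shape, the statement below counts the approximate number of distinct orders per order type. The job name approx_count_distinct_example, the output alias approx_distinct_orders, and the orderid column are assumptions made for illustration; they are not confirmed by this page, so substitute a column that exists in your source table. No result is shown because the output depends on the data.

```sql
-- Hypothetical sketch: the job name and the orderid column are assumed, not taken from the docs.
CREATE JOB approx_count_distinct_example
    ADD_MISSING_COLUMNS = true
AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data MAP_COLUMNS_BY_NAME
    SELECT
        LOWER(ordertype) AS ordertype,
        -- Null orderid values are ignored; a group with only nulls yields 0.
        APPROX_COUNT_DISTINCT(orderid) AS approx_distinct_orders
    FROM default_glue_catalog.upsolver_samples.orders_raw_data
    WHERE $commit_time BETWEEN run_start_time() - PARSE_DURATION('1d') AND run_end_time()
    GROUP BY 1;
```

Because the result is an estimate, prefer COUNT(DISTINCT ...) where an exact figure is required.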