APPROX_COUNT_DISTINCT
Approximates the number of distinct non-null input values.
This function is used as an approximation of COUNT(DISTINCT ...) in order to improve performance. It should be used when there are many (more than one million) rows and the total number of distinct values is high.
Syntax
APPROX_COUNT_DISTINCT(X)
Arguments
X
Type: any type
The values to be counted.
Returns
Type: bigint
Returns the approximate number of distinct non-null input values.
If all input values are null, zero is returned.
Examples
Transformation job example
SQL
CREATE JOB function_operator_example
    ADD_MISSING_COLUMNS = TRUE
    AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data
    MAP_COLUMNS_BY_NAME
    SELECT
        LOWER(ordertype) AS ordertype,
        ROUND(AVG(nettotal), 2) AS avgtotal
    FROM default_glue_catalog.upsolver_samples.orders_raw_data
    WHERE $commit_time BETWEEN run_start_time() - PARSE_DURATION('1d')
                           AND run_end_time()
    GROUP BY 1
    LIMIT 2;
Query result
ordertype    avgtotal
pickup       1062.14
shipping     1061.91
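The job above aggregates with AVG and does not call APPROX_COUNT_DISTINCT itself. As a minimal sketch of how the function might be used in a job of the same shape, the statement below counts approximately how many distinct customers placed orders of each type. The job name, the output column name, and the customer.email input column are assumptions made for illustration and may not match the actual sample schema.

```sql
-- Sketch only: the job name, approx_customers, and customer.email are
-- hypothetical and are not taken from the original example.
CREATE JOB approx_count_distinct_example
    ADD_MISSING_COLUMNS = TRUE
    AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data
    MAP_COLUMNS_BY_NAME
    SELECT
        LOWER(ordertype) AS ordertype,
        -- Approximate number of distinct non-null emails per order type
        APPROX_COUNT_DISTINCT(customer.email) AS approx_customers
    FROM default_glue_catalog.upsolver_samples.orders_raw_data
    WHERE $commit_time BETWEEN run_start_time() - PARSE_DURATION('1d')
                           AND run_end_time()
    GROUP BY 1;
```

Because the result is an approximation, it may differ slightly from what COUNT(DISTINCT customer.email) would return over the same rows. Rows where the counted value is null do not contribute to the count.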