APPROX_COUNT_DISTINCT

Approximates the number of distinct non-null input values.

Use this function as a faster approximation of COUNT(DISTINCT ...) when an exact count is not required. It is intended for cases where there are relatively few rows (under 1 million) and the total number of distinct values is high.

Syntax

APPROX_COUNT_DISTINCT(X)

Arguments

X

Type: any type

The values to be counted.

Returns

Type: bigint

Returns the approximate number of distinct non-null input values.

If all input values are null, zero is returned.

Examples

Transformation job example

SQL

CREATE JOB function_operator_example
    ADD_MISSING_COLUMNS = true
    AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data MAP_COLUMNS_BY_NAME
    SELECT
        LOWER(ordertype) AS ordertype,
        ROUND(AVG(nettotal), 2) AS avgtotal
    FROM default_glue_catalog.upsolver_samples.orders_raw_data
    WHERE $commit_time BETWEEN run_start_time() - PARSE_DURATION('1d') AND run_end_time()
    GROUP BY 1
    LIMIT 2;

Query result

ordertype    avgtotal
pickup       1062.14
shipping     1061.91
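The transformation job in this section does not call APPROX_COUNT_DISTINCT itself, so the following is a minimal sketch of how the function might be used in a similar job. It reuses the source and target tables from that example. The job name approx_count_distinct_example, the output column names, and the orderid column are assumptions made for illustration and may not match your data. COUNT(DISTINCT ...) is included only to compare the exact and approximate counts side by side.

```sql
-- Hypothetical job name; orderid is an assumed column in the source table.
CREATE JOB approx_count_distinct_example
    ADD_MISSING_COLUMNS = true
    AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data MAP_COLUMNS_BY_NAME
    SELECT
        LOWER(ordertype) AS ordertype,
        -- Approximate number of distinct non-null order IDs per order type
        APPROX_COUNT_DISTINCT(orderid) AS approx_distinct_orders,
        -- Exact distinct count, shown only for comparison
        COUNT(DISTINCT orderid) AS exact_distinct_orders
    FROM default_glue_catalog.upsolver_samples.orders_raw_data
    WHERE $commit_time BETWEEN run_start_time() - PARSE_DURATION('1d') AND run_end_time()
    GROUP BY 1;
```

Because the result is an approximation, the two counts can differ slightly. If every orderid in a group is null, the approximate count for that group is zero, as described under Returns.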
