REGEX_NAMED_GROUPS

Matches the regular expression against the input string. Returns a record whose field names are the pattern's named groups and whose values are the text each group captured.

Syntax

REGEX_NAMED_GROUPS(pattern, allMatches, filterEmpty, input)

Arguments

| Name | Type | Description | Default Value |
|---|---|---|---|
| pattern | string | Regular Expression Pattern | |
| allMatches | boolean | Return all the matches of the pattern, and not only the first one | false |
| filterEmpty | boolean | Filter out empty matches | false |
| input | string | | |
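
The way `allMatches` and `filterEmpty` interact can be approximated outside the product with a short Python sketch. This is not the engine's implementation; it is a minimal emulation that reproduces the outputs listed in the examples on this page. The helper name `regex_named_groups`, the parameter `input_str`, the rule that an "empty match" means the whole matched text is the empty string, and the choice to return a lone match as a bare record are all assumptions made for illustration. Python spells named groups `(?P<name>...)`, so the sketch rewrites `(?<name>...)` first.

```python
import re

def regex_named_groups(pattern, all_matches, filter_empty, input_str):
    """Hypothetical emulation of REGEX_NAMED_GROUPS using Python's re module."""
    if input_str is None:
        return None
    # Convert (?<name>...) to Python's (?P<name>...), leaving the
    # lookbehinds (?<=...) and (?<!...) untouched.
    py_pattern = re.sub(r"\(\?<(?![=!])", "(?P<", pattern)

    records = []
    for match in re.finditer(py_pattern, input_str):
        # Assumption: an "empty match" is one whose matched text is ''.
        if filter_empty and match.group(0) == "":
            continue
        # groupdict() holds named groups only; unmatched groups are None.
        records.append(match.groupdict())
        if not all_matches:
            break  # only the first (kept) match is wanted

    if not records:
        return None  # no match at all, or every match was filtered out
    # Assumption: a single match is returned as a bare record, not a list.
    return records[0] if len(records) == 1 else records


print(regex_named_groups(r"^(?<digits>\d*)$", False, False, "123"))  # {'digits': '123'}
print(regex_named_groups(r"^(?<digits>\d*)$", False, False, ""))     # {'digits': ''}
print(regex_named_groups(r"^(?<digits>\d*)$", False, True, ""))      # None
```

With `filterEmpty` set to true, the empty input yields no record because the only match is the empty string; with it set to false, the same input yields a record whose `digits` field is empty.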

Examples

| pattern | allMatches | filterEmpty | input | Output |
|---|---|---|---|---|
| '^(?:(?<scheme>.*?):\/)?\/?(?<domain>[^:\/\s]+)(?::(?<port>\d*))?(?:(\/\w+)*\/)(?<page>[\w\-\.]+[^#?\s]+)(?:.*)?$' | false | false | 'https://www.domain.com/page.html' | {scheme: https, domain: www.domain.com, port: null, page: page.html} |
| '^(?:(?<scheme>.*?):\/)?\/?(?<domain>[^:\/\s]+)(?::(?<port>\d*))?(?:(\/\w+)*\/)(?<page>[\w\-\.]+[^#?\s]+)(?:.*)?$' | false | false | 'http://www.domain.com:8080/page.html' | {scheme: http, domain: www.domain.com, port: 8080, page: page.html} |
| '^(?<digits>\d*)$' | false | false | '123' | {digits: 123} |
| '^(?<digits>\d*)$' | false | false | 'foo' | null |
| '^(?<digits>\d*)$' | false | false | '' | {digits: ''} |
| '^(?<digits>\d*)$' | false | true | '' | null |
| '\bwww\.(?<domain>[^.]*)\.com\b' | true | false | 'www.upsolver.com' | {domain: upsolver} |
| '\bwww\.(?<domain>[^.]*)\.com\b' | true | false | 'www.a.com www.b.com' | [{domain: a}, {domain: b}] |
| '\bwww\.(?<domain>[^.]*)\.com\b' | false | false | 'www.a.com www.b.com' | {domain: a} |

Transformation job example

SQL

CREATE JOB function_operator_example
    ADD_MISSING_COLUMNS = true
    AS INSERT INTO default_glue_catalog.upsolver_samples.orders_transformed_data MAP_COLUMNS_BY_NAME
    SELECT pattern, allMatches, filterEmpty, input,
        REGEX_NAMED_GROUPS('^(?:(?<scheme>.*?):\/)?\/?(?<domain>[^:\/\s]+)(?::(?<port>\d*))?(?:(\/\w+)*\/)(?<page>[\w\-\.]+[^#?\s]+)(?:.*)?$', false, false, input) AS Output
    FROM default_glue_catalog.upsolver_samples.orders_raw_data
        LET pattern = '^(?:(?<scheme>.*?):\/)?\/?(?<domain>[^:\/\s]+)(?::(?<port>\d*))?(?:(\/\w+)*\/)(?<page>[\w\-\.]+[^#?\s]+)(?:.*)?$',
            allMatches = false,
            filterEmpty = false,
            input = 'https://www.domain.com/page.html'
    WHERE time_filter()
    LIMIT 1;

Query result

| pattern | allMatches | filterEmpty | input | Output |
|---|---|---|---|---|
| '^(?:(?<scheme>.*?):\/)?\/?(?<domain>[^:\/\s]+)(?::(?<port>\d*))?(?:(\/\w+)*\/)(?<page>[\w\-\.]+[^#?\s]+)(?:.*)?$' | false | false | 'https://www.domain.com/page.html' | {scheme: https, domain: www.domain.com, port: null, page: page.html} |
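
The URL pattern passed to REGEX_NAMED_GROUPS in the job is dense, so it may help to see it split into its parts. The sketch below restates the same pattern in Python's verbose regex mode with one comment per piece. It is an illustration only: Python writes named groups as `(?P<name>...)` rather than `(?<name>...)`, the name `URL_PATTERN` is hypothetical, and the comments describe how Python's `re` module reads the pattern, which is assumed to agree with the engine for these inputs.

```python
import re

# Same pattern as in the job, in Python syntax, split up with re.VERBOSE.
# In verbose mode, whitespace and "#" comments outside character classes
# are ignored, so the "#" inside [^#?\s] is still a literal character.
URL_PATTERN = re.compile(r"""
    ^
    (?: (?P<scheme> .*? ) :\/ )?      # optional scheme, matched lazily up to ":/"
    \/?                               # second slash of "://", if present
    (?P<domain> [^:\/\s]+ )           # host: runs until ":", "/" or whitespace
    (?: : (?P<port> \d* ) )?          # optional ":" followed by port digits
    (?: (\/\w+)* \/ )                 # leading path segments (unnamed group), then "/"
    (?P<page> [\w\-\.]+ [^#?\s]+ )    # page name, ending before "#", "?" or whitespace
    (?: .* )?                         # whatever remains (query string, fragment)
    $
""", re.VERBOSE)

for url in ("https://www.domain.com/page.html",
            "http://www.domain.com:8080/page.html"):
    # groupdict() lists only the named groups; the unnamed path group is
    # left out, and a group that did not take part in the match is None.
    print(URL_PATTERN.match(url).groupdict())
# {'scheme': 'https', 'domain': 'www.domain.com', 'port': None, 'page': 'page.html'}
# {'scheme': 'http', 'domain': 'www.domain.com', 'port': '8080', 'page': 'page.html'}
```

The unnamed group `(\/\w+)*` does not appear in the record, which is consistent with the Output column containing only `scheme`, `domain`, `port`, and `page`.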