API content formats
This page describes how to configure the different content formats in Upsolver through API calls.

Auto-Detect

Example

    {
      "clazz" : "AutoDetectContentType"
    }

Avro

Example

    {
      "clazz" : "AvroContentType"
    }

Parquet

Example

    {
      "clazz" : "ParquetContentType"
    }

ORC

Example

    {
      "clazz" : "OrcContentType"
    }

JSON

JSON data. Multiple JSON documents can be read from a single file or record when they are concatenated, optionally separated by whitespace.
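To make the concatenated layout concrete, here is a minimal Python sketch of reading several JSON documents from one string with optional whitespace between them. It illustrates the input shape only and is not Upsolver's parser; the function name `iter_json_documents` and the sample data are hypothetical.

```python
import json


def iter_json_documents(text):
    """Yield each JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # Skip the optional whitespace between documents.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode parses one value and returns the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Hypothetical sample: four documents, some separated by whitespace, some not.
sample = '{"id": 1} {"id": 2}\n{"id": 3}{"id": 4}'
documents = list(iter_json_documents(sample))
```

Each yielded value corresponds to one event in the description above.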

Fields

| Field | Name | Type | Description | Optional |
| --- | --- | --- | --- | --- |
| nestedJsonPaths | Nested Json Paths | [][] | Paths to string fields that contain a "stringified" JSON that should be parsed into the schema. Each such path is represented as an array of the path parts. | Yes |
| nestedJsons | Nested Jsons | NestedPath[] | Paths to string fields that contain a "stringified" JSON that should be parsed into the schema. Each such path is represented as an array of the path parts. | Yes |
| splitRootArray | Split Root Array | Boolean | If the root object is an array, it can either be parsed as separate events, or as a single event which contains only an array. | Yes |
| keepOriginalNestedJsonString | Keep Original Nested Json String | Boolean | When using the nestedJsonPaths parameter, the original JSON string can optionally be kept in addition to the parsed value. If this is selected, the parsed data is stored in a record with the suffix _parsed. | Yes |
| storeJsonAsString | Store Json As String | Boolean | Whether to store the JSON in native format in a separate field. | Yes |

Example

    {
      "clazz" : "JsonContentType"
    }
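The example above sets none of the optional fields. As a hedged sketch, the Python below builds a JsonContentType object that does set some of them, mainly to show nestedJsonPaths as an array of arrays of path parts. The field names `payload`, `event`, and `details` are hypothetical, and reading `splitRootArray: true` as "split into separate events" is an assumption based on the field name.

```python
import json

json_content_type = {
    "clazz": "JsonContentType",
    # Each path is an array of path parts; these field names are hypothetical.
    "nestedJsonPaths": [["payload"], ["event", "details"]],
    # Assumption: true means a root array is split into separate events.
    "splitRootArray": True,
    # Keep the original string next to the parsed value.
    "keepOriginalNestedJsonString": True,
}

# Serialize to the JSON text that would be sent in the API call.
body = json.dumps(json_content_type, indent=2)
print(body)
```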

CSV

Fields

| Field | Name | Type | Description | Optional |
| --- | --- | --- | --- | --- |
| inferTypes | Infer Types | Boolean | Whether or not to infer types. If not selected, Upsolver will read all fields as strings. | No |
| header | Header | String | If applicable, the header of the file. If you only add details for one column, additional columns will be labeled as overflow columns. | No |
| delimiter | Delimiter | Char | The delimiter between columns of data. | Yes |
| nullValue | Null Value | String | If applicable, the default null value in the data. | Yes |
| nestedJsons | Nested Jsons | NestedPath[] | Paths to string fields that contain a "stringified" JSON that should be parsed into the schema. Each such path is represented as an array of the path parts. | Yes |
| keepOriginalNestedJsonString | Keep Original Nested Json String | Boolean | When using the nestedJsonPaths parameter, the original JSON string can optionally be kept in addition to the parsed value. If this is selected, the parsed data will be stored in a record with the suffix _parsed. | Yes |

Example

    {
      "clazz" : "CsvContentType",
      "inferTypes" : true,
      "header" : "header1,header2,header3"
    }
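The optional delimiter and nullValue fields are not shown in the example above. The following Python sketch builds a CsvContentType object that includes them. The helper `csv_content_type`, the column names, and the `NULL` marker are hypothetical, and the header string is kept comma-separated as in the page's example because the page does not say whether it must follow the configured delimiter.

```python
import json


def csv_content_type(header, infer_types=True, delimiter=None, null_value=None):
    """Build a CsvContentType object, adding optional fields only when given."""
    content_type = {
        "clazz": "CsvContentType",
        "inferTypes": infer_types,
        "header": header,
    }
    if delimiter is not None:
        # The field type is Char, so reject anything longer than one character.
        if len(delimiter) != 1:
            raise ValueError("delimiter must be a single character")
        content_type["delimiter"] = delimiter
    if null_value is not None:
        content_type["nullValue"] = null_value
    return content_type


# Hypothetical semicolon-delimited file that writes missing values as NULL.
semicolon_csv = csv_content_type(
    "id,name,price", infer_types=False, delimiter=";", null_value="NULL"
)
print(json.dumps(semicolon_csv, indent=2))
```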

TSV

Fields

| Field | Name | Type | Description | Optional |
| --- | --- | --- | --- | --- |
| inferTypes | Infer Types | Boolean | Whether or not to infer types. If not selected, Upsolver will read all fields as strings. | No |
| header | Header | String | If applicable, the header of the file. If you only add details for one column, additional columns will be labeled as overflow columns. | No |
| nestedJsons | Nested Jsons | NestedPath[] | Paths to string fields that contain a "stringified" JSON that should be parsed into the schema. Each such path is represented as an array of the path parts. | Yes |
| keepOriginalNestedJsonString | Keep Original Nested Json String | Boolean | When using the nestedJsonPaths parameter, the original JSON string can optionally be kept in addition to the parsed value. If this is selected, the parsed data will be stored in a record with the suffix _parsed. | Yes |

Example

    {
      "clazz" : "TsvContentType",
      "inferTypes" : true,
      "header" : "header1,header2,header3"
    }

x-www-form-urlencoded

Fields

| Field | Name | Type | Description | Optional |
| --- | --- | --- | --- | --- |
| inferTypes | Infer Types | Boolean | Whether or not to infer types. If not selected, Upsolver will read all fields as strings. | No |

Example

    {
      "clazz" : "WWWFormUrlEncodedType",
      "inferTypes" : true
    }
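To show what the inferTypes switch changes, here is a small local Python illustration using the standard library's form decoder. It is not Upsolver's inference logic: the `infer` rules (integer, then float, then boolean) and the sample body are assumptions chosen only to contrast "all strings" with typed values.

```python
from urllib.parse import parse_qsl


def infer(value):
    """Hypothetical inference rules: try int, then float, then boolean."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    if value in ("true", "false"):
        return value == "true"
    return value


def parse_form(body, infer_types):
    """Decode an x-www-form-urlencoded body into a flat record."""
    record = {}
    for key, value in parse_qsl(body, keep_blank_values=True):
        # Without inference every value stays a string.
        record[key] = infer(value) if infer_types else value
    return record


form_body = "id=42&price=9.5&name=Ada+Lovelace&active=true"
as_strings = parse_form(form_body, infer_types=False)
as_typed = parse_form(form_body, infer_types=True)
```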

Protobuf

Fields

| Field | Name | Type | Description | Optional |
| --- | --- | --- | --- | --- |
| schemaFiles | Schema Files | SchemaFile[] | | No |
| mainFile | Main File | String | The main file from the list of selected schema files. | No |
| messageType | Message Type | String | The message type. | No |
| bytesParsers | Bytes Parsers | BytesParser[] | BytesParsers can be used to define special parsing behavior for 'bytes' fields in the schema. | No |
| bytesParsers.path | Path | String | The path to the field that should use this parser. | No |
| bytesParsers.parserSchema | Parser Schema | String | The schema that the parser will use. In CSV and TSV formats, it should be a comma-delimited list of the field names in the CSV/TSV rows. In the Avro format, it should be the Avro schema used to decode the bytes of the inner field. | Yes |
| bytesParsers.schemaType | Schema Type | String | The type of parser to use. Supported types are JSON, CSV, TSV, or Avro. | Yes |

Example

    {
      "clazz" : "ProtobufContentType",
      "schemaFiles" : "schemaFiles",
      "mainFile" : "mainFile",
      "messageType" : "messageType",
      "bytesParsers" : "bytesParsers"
    }
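The example above uses placeholder strings, so the shape of a bytesParsers array is not visible. The Python sketch below builds one under the assumption that the dotted names in the table (bytesParsers.path, bytesParsers.parserSchema, bytesParsers.schemaType) map to keys of each array element. The helper `bytes_parser` and the field names `payload` and `csv_row` are hypothetical, and the structure of SchemaFile is not covered because the page does not describe it.

```python
import json

# Parser types listed in the field table.
SUPPORTED_SCHEMA_TYPES = {"JSON", "CSV", "TSV", "Avro"}


def bytes_parser(path, schema_type=None, parser_schema=None):
    """Build one bytesParsers element; optional keys are added only when given."""
    parser = {"path": path}
    if schema_type is not None:
        if schema_type not in SUPPORTED_SCHEMA_TYPES:
            raise ValueError(f"unsupported schemaType: {schema_type}")
        parser["schemaType"] = schema_type
    if parser_schema is not None:
        parser["parserSchema"] = parser_schema
    return parser


bytes_parsers = [
    # A 'bytes' field holding JSON text (hypothetical field name).
    bytes_parser("payload", schema_type="JSON"),
    # A 'bytes' field holding one CSV row; the schema lists its column names.
    bytes_parser("csv_row", schema_type="CSV", parser_schema="timestamp,host,value"),
]
print(json.dumps(bytes_parsers, indent=2))
```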

Avro-record

Individual Avro records without the framing or schema.

Fields

| Field | Name | Type | Description | Optional |
| --- | --- | --- | --- | --- |
| schema | Schema | String | The Avro schema used to decode the messages. Note that the behavior is undefined if the schema does not match the data. | No |
| bytesParsers | Bytes Parsers | BytesParser[] | BytesParsers can be used to define special parsing behavior for 'bytes' fields in the schema. | Yes |
| bytesParsers.path | Path | String | The path to the field that should use this parser. | No |
| bytesParsers.parserSchema | Parser Schema | String | The schema that the parser will use. In CSV and TSV formats, it should be a comma-delimited list of the field names in the CSV/TSV rows. In the Avro format, it should be the Avro schema used to decode the bytes of the inner field. | Yes |
| bytesParsers.schemaType | Schema Type | String | The type of parser to use. Supported types are JSON, CSV, TSV, or Avro. | Yes |

Example

    {
      "clazz" : "AvroRecordContentType",
      "schema" : "{ \"type\": \"record\", \"name\": \"root\", \"fields\": [ {\"name\": \"value\", \"type\": \"string\" } ] }"
    }
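Because the schema field is a string that itself contains JSON, the quotes inside it must be escaped. One way to avoid escaping by hand, sketched here in Python, is to write the Avro schema as a normal object and serialize it twice: once into the schema string, and once for the whole content type. The schema is the one from the example above; the variable names are illustrative.

```python
import json

# The Avro schema from the example, written as a plain Python object.
avro_schema = {
    "type": "record",
    "name": "root",
    "fields": [{"name": "value", "type": "string"}],
}

avro_record_content_type = {
    "clazz": "AvroRecordContentType",
    # First serialization: the schema becomes a JSON string value.
    "schema": json.dumps(avro_schema),
}

# Second serialization: json.dumps escapes the inner quotes automatically.
body = json.dumps(avro_record_content_type, indent=2)
print(body)
```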

Avro w/ Schema Registry

Individual Avro records with schema provided by Schema Registry.

Fields

| Field | Name | Type | Description | Optional |
| --- | --- | --- | --- | --- |
| schemaRegistryUrl | Schema Registry Url | String | The URL for fetching a schema, where {id} is the ID of the schema. | No |
| schemaRegistryFormat | Schema Registry Format | SchemaRegistryFormat | The strategy used to encode the schema version into every message. | Yes |
| bytesParsers | Bytes Parsers | BytesParser[] | BytesParsers can be used to define special parsing behavior for 'bytes' fields in the schema. | No |
| bytesParsers.path | Path | String | The path to the field that should use this parser. | No |
| bytesParsers.parserSchema | Parser Schema | String | The schema that the parser will use. In CSV and TSV formats, it should be a comma-delimited list of the field names in the CSV/TSV rows. In the Avro format, it should be the Avro schema used to decode the bytes of the inner field. | Yes |
| bytesParsers.schemaType | Schema Type | String | The type of parser to use. Supported types are JSON, CSV, TSV, or Avro. | Yes |

Example

    {
      "clazz" : "AvroSchemaRegistryContentType",
      "schemaRegistryUrl" : "schemaRegistryUrl"
    }
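The example above uses a placeholder for the URL. As a hedged illustration of a URL template containing {id}, the Python sketch below uses a hypothetical host with the `/schemas/ids/{id}` path that the Confluent Schema Registry REST API exposes. How and when Upsolver substitutes the ID is an assumption drawn from the field description; `schema_url` only shows the substitution itself.

```python
import json

# Hypothetical registry host; the path follows the Confluent REST convention.
url_template = "https://schema-registry.example.com/schemas/ids/{id}"


def schema_url(template, schema_id):
    """Fill the {id} placeholder with a concrete schema ID."""
    return template.replace("{id}", str(schema_id))


schema_registry_content_type = {
    "clazz": "AvroSchemaRegistryContentType",
    # The template is sent with the literal {id} placeholder left in place.
    "schemaRegistryUrl": url_template,
}

print(json.dumps(schema_registry_content_type, indent=2))
print(schema_url(url_template, 17))
```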

XML

XML data. Multiple XML documents can be read from a single file or record when they are concatenated, optionally separated by whitespace.

Fields

| Field | Name | Type | Description | Optional |
| --- | --- | --- | --- | --- |
| storeRootAsString | Store Root As String | Boolean | Whether to store the XML root in native format in a separate field. | Yes |

Example

    {
      "clazz" : "XmlContentType"
    }