Upsolver
Search…
Index
This page contains a global dictionary that contains descriptions of all the terms that may be encountered when using Upsolver.
Terms
Component
Description
# of Arrays
data source
The number of arrays in the selected hierarchy.
# of Fields
data source
The number of fields in the selected hierarchy.
# of Keys
data source
The number of keys in the selected hierarchy.
<Database> Connection
output property
Connection string of database connected to this output.
<Storage> Connection
output property
Connection string for storage location.
<Type> Connection
data source property
The connection string that represents a resource and its credentials.
Additional Kafka Properties
output property
Tags that can be sent to external monitoring systems.
Additional Processing Units for Replay
cluster property
Amount of processing units for replay tasks that run on a separate cluster up to this size, billed separately. If active, it is recommended for this cluster to be at least as big as the maximum size of the cluster.
Additional Schemas
Avro-record property
(Optional) Additional Avro record schemas.
Aggregated
output property
Whether or not this output is aggregated.
Aliases
output property
The lookup table alias.
Allow Maintenance Access
cluster property
Whether or not Upsolver is allowed to access your instances over SSH for maintenance purposes.
Attached Workspaces
data source property
The workspaces attached to the data source.
Attached Workspaces
output property
The workspaces attached to this output.
Avro Record Schema
Avro-record property
(Optional) The Avro record schema.
Bytes Parsers
Avro-record property
(Optional) A parser for JSON-format schemas.
Clusters
output property
Click to edit the output clusters.
Compaction Shards #
output property
The number of files that can be compacted in parallel during a compaction.
Compression
data source property
The compression in the data (Zip, GZip, Snappy, SnappyUnframed, Tar, or None).
Compression
output property
The compression applied to the output (None, GZip, or Snappy).
Compute Cluster
data source property
Compute cluster this data source is running on.
Compute Cluster
output property
The compute cluster running the calculation.
Connection
output property
Connection type (e.g. Athena).
Connection
output property
Elasticsearch connection string.
Connection
moitoring systems property
The Elasticsearch connection.
Connection String
moitoring systems property
The connection string to connect InfluxDB.
Content Format
data source property
Content format of ingested data source (Avro, Parquet, ORC, JSON, CSV, TSV, x-www-form-urlencoded, Protobuf, Avro-record, Avro with Schema Registry, or XML).
Created By
data source property
Which user this data source was created by.
Created By
output property
Which user this output was created by.
Creation Time
data source property
Time this data source was created.
Creation Time
output property
Time this output was created.
Data Dump Date
data source property
The date that the data starts.
Data Sources
output property
The required data sources to base the output on.
Database Name
output property
Output database name.
Database Name
moitoring systems property
The database name.
Datadog API Key
moitoring systems property
The API key to access Datadog.
Date Format
output property
How output will be partitioned by time (e.g. yyyy/MM/dd).
Date Pattern
data source property
The date pattern of the files ingested.
Delay
data source
How far behind the system is processing the data, in minutes.
Delete Indices
output property
Delete indices from ElasticSearch based on retention period.
Delimiter
CSV property
The delimiter between columns of data.
Density in Data
data source
The density in the hierarchy, that is, how many of the events in this branch of the data hierarchy include this field, expressed a percentage.
Density in Events
data source
How many of the events in this data source include this field, expressed as a percentage (e.g. 20.81%).
Description
data source property
Description of data source.
Description
output property
Description of output.
Distinct Values
data source
How many unique values appear in this field.
DNS Alias
cluster property
DNS alias for private IPs. This is an alternative option to Elastic IPs.
Elastic IPs
cluster property
Whether or not Elastic IP is enabled. If enabled, Elastic IPs will be created for these servers. The server instances will use these Elastic IPs, allowing you to open access for your servers in external resources.
Enable gzip
moitoring systems property
Enable gzip compress for the HTTP request body.
End Read At
data source property
When to stop reading the stream.
Ending At
output property
Whether to continue processing indefinitely, stop processing as soon as possible, or stop on a specific date and time.
ETA
data source
The expected time of arrival of the data (e.g. when the system is ingesting the data at about the same rate as the data is being generated, this will be less than a minute).
Excluded Partitions
output property
The partitions to exclude. You can import, export, or edit the list.
Execution Parallelism
data source property
The number of independent shards to parse data, to increase parallelism and reduce latency. This should remain 1 in most cases.
Expose Private IPs
cluster property
Whether or not to expose the cluster with its private IP in the DNS record.
Extra Security Groups
cluster property
Security groups that can be added in addition to the default security groups created by Upsolver.
Fail on Write Error
output property
If enabled, any error while copying to the target will cause the load to fail. If disabled the same behavior will occur after 100K errors (The max allowed by Redshift)
Field Content Samples Over Time
data source
A time-series graph of the total number of events that include the selected field.
Fields Breakdown
data source
A stacked bar chart (by data type) of the number of fields versus the density/distinct values or a stacked bar chart of the number of fields by data type.
Fields Statistics
data source
A list of the fields in the hierarchy element showing the Type, Density, Top Values, Key, Distinct Values, Array, First Seen, and Last Seen.
File Name Pattern
data source property
The file name pattern.
Files Being Written
data source
Number of files currently being written from this data source to outputs.
First Seen
data source
The first time this field included a value (e.g. a year ago).
Folder
data source property
The data folder (e.g. billing data). If this is not specified, the data is assumed to be in the top-level of the hierarchy.
Header
CSV property
If applicable, the header of the file. If you only add details for one column, additional columns will be labeled as overflow columns.
Header
TSV property
If applicable, the header of the file. If you only add details for one column, additional columns will be labeled as overflow columns.
Hosts
cluster property
Add the following host/IP mappings to the hosts file.
Index Name
output property
Name of index output is written to.
Index Partition Size
output property
Size of index partition.
Index Type
moitoring systems property
(Optional) The type of the index.
Infer Types
CSV property
Whether or not to infer types. If not selected, Upsolver will read all fields as strings.
Infer Types
TSV property
Whether or not to infer types. If not selected, Upsolver will read all fields as strings.
Intermediate Storage
output property
Location where Upsolver stores the intermediate bulk files before uploading.
Intermediate Storage Location
output property
Where Upsolver stores the intermediate bulk files before uploading.
Interval
data source property
The sliding interval to wait for data.
Is Managed
moitoring systems property
True if you are using a managed Splunk account and False otherwise. This can be inferred from the URL of your account: splunkcloud.com usually means managed and cloud.splunk.com usually means self-service.
Kafka Hosts
output property
Comma-separated list of host:port pairs required to connect to Kafka cluster.
Kafka Topic
data source property
The Kafka topic.
Kafka Version
data source property
The Kafka version.
Kinesis Connection
output property
Connection string for Kinesis stream.
Last Seen
data source
The last time this field included a value (e.g. 2 minutes ago).
Main File
Protobuf property
(Optional) The main file from the list of selected schema files.
Max Delay
data source property
The maximum delay to consider the data, that is, any data that arrives delayed by more than the max delay is filtered out.
Max Elastic IPs Num
cluster property
Maximum number of available IPs. Leave empty for no limit.
Message Type
Protobuf property
(Optional) The message type.
Modified By
data source property
Which user last modified this data source.
Modified By
output property
Which user last modified this output.
Modified Time
data source property
If applicable, time this data source was last modified.
Modified Time
output property
If applicable, time this output was last modified.
Name
data source property
Data source name in Upsolver.
Name
output property
Output name in Upsolver.
Name
cluster property
Cluster name.
Name
moitoring systems property
The name of the monitoring report.
Namespace
moitoring systems property
The namespace — container for CloudWatch Metrics.
Null Value
CSV property
If applicable, the default null value in the data.
Omit Key Columns
output property
Whether or not to store the key columns in the lookup table. This saves space if enabled, but prevents iteration from returning the key columns.
Output Format
output property
File format for output.
Output Interval
output property
The output interval in minutes, hours or days.
Output Shards #
output property
Set the number of files to be created each interval in the output. This applies only to aggregated outputs (for non-aggregated outputs, the Shards configuration has this effect).
Override Window Size
output property
If enabled, this feature can be set to minutes, hours, or days, or Infinite Override Size. For example, if you wish to create an hourly output that will aggregate data for the last day, then the output interval should be 1 hour, but override window size should be 1 day. This option is useful when used in combination with upserts. For example, if you want the latest total count of events per user, set the Override Window to infinite; in this case, every time an event arises for a user, the updated total count for the user is upserted, and the old record is deleted.
Partition Size
output property
The output partition size (Hourly, Daily, Weekly, or Yearly).
Password
moitoring systems property
The InfluxDB password.
Path
output property
The desired path for data to be exported to. If left empty, a system generated unique path will be used.
Pattern
data source property
The file pattern.
Prefix
data source property
The file prefix.
Private VPC
cluster property
Your private VPC connection.
Processing Units
cluster property
Amount of processing units.
Read From Start
data source property
Whether the stream was read from the start.
Real Time
output property
Load data into this lookup table directly from the input stream using a real time cluster if one is deployed.
Real Time Statistics
data source property
Whether the data source statistics are calculated in real-time directly from the input stream. This is relevant to lookup tables, when answers are required very fast and in real-time.
Region
cluster property
Your AWS region.
Region
moitoring systems property
The CloudWatch AWS region.
Reporting Tags
output property
Tags that can be sent to external monitoring systems.
Retention
data source property
The retention policy in Upsolver in minutes, hours, or days. The data is deleted permanently after this period elapses (unless Soft Retention is set to Yes). By default, the data is kept forever.
Retention
output property
The retention policy in Upsolver in minutes, hours, or days. The data is deleted permanently after this period elapses (unless Soft Retention is set to Yes). By default, the data is kept forever.
Retention Policy
moitoring systems property
(Optional) The retention policy in InfluxDB.
Run Compactions
output property
Whether to run compactions on the output.
S3 Connection
output property
S3 storage location where Upsolver stores the intermediate bulk files before uploading.
S3 Storage
output property
S3 storage location where Upsolver stores the intermediate bulk files before uploading.
Scaling Strategy
cluster property
Cluster's scaling strategy.
Schema
output property
Schema of this output.
Schema files
Protobuf property
(Optional) Click Select to choose the required schema files.
Schema Registry URL
Avro with Schema Registry property
(Optional) The URL to the schema registry.
Selected
data source
The most recent data values for the selected field and columns. You can change the columns that appear by clicking Choose Columns.
Server Units
cluster property
Server compute units; how many compute units one server will use. Choosing a server with "high memory" indicates a server type that has less CPU units but more RAM.
Shards
data source property
The number of shards, and the more shards the quicker the data processing.
Shards #
output property
The number of independent shards to write, to increase parallelism and reduce latency. This should remain 1 in most cases, and should never be larger than the number of shards of the data sources.
Show Sparse Fields
data source property
Select to show fields with density under 0.01%.
SignalFx Auth Token
moitoring systems property
The auth token.
SignalFx Region
moitoring systems property
The region where your SignalFx environment runs.
Skip Failed Files
output property
If enabled, when a load fails the manifest of that load will be skipped. The skipped manifest will be saved aside for manual re-processing once the copy error has been fixed.
Soft Retention
data source property
A setting that prevents data deletion when the retention policy in Upsolver activates. When enabled, the metadata is purged but the underlying data (e.g. S3 object) is not deleted.
Soft Retention
output property
A setting that prevents data deletion when the retention policy in Upsolver activates. When enabled, the metadata is purged but the underlying data (e.g. S3 object) is not deleted.
Speed
data source
The speed at which the data is being ingested into the data source.
Split Root Array
JSON property
If the content is an array of JSON records, select to split the array content into separate records.
Splunk Deployment Name
moitoring systems property
The deployment name. For self-service accounts with a URL in the structureprd-p-xxx.cloud.splunk.com, the deployment name is: p-xxx. For managed accounts with a URL in the structure XXX.splunkcloud.com, the deployment name is: XXX.
Splunk HTTP Event Collector Token
moitoring systems property
The HEC token. Enables sending data over HTTP (or HTTPS) directly to Splunk Enterprise or Splunk Cloud.
Start Execution From
output property
When execution started.
Start Ingesting From
data source property
When ingestion started.
Store JSON as String
JSON property
Whether to store the JSON in native format in a separate field.
Store Raw Data
data source property
Whether to store an additional copy of the data in its original format.
Stream Name
data source property
The Kinesis stream name.
Stream Name
output property
Name of stream output is written to.
Table Name
output property
Output table name.
Tabular
output property
Whether or not the output is tabular.
Tags
cluster property
Assign custom metadata to your cluster using tags consisting of a key and a value, both of which you define. This is a convenient way to categorize your AWS resources in different ways (e.g. by purpose, owner, or environment). For example, you could define a set of tags for your account's clusters that helps you track each cluster's owner or identify a production cluster versus a testing cluster. It is recommended that you create a consistent set of tags to meet your organization requirements.
Target Storage
data source property
Where to store the data read (the output storage).
Time
output property
  • Event based data sources (Kafka, Kinesis, Azure Event Hubs) time is the ingestion time - when the event is stored in the source
  • File based data sources (S3, Azure Blob, etc...) time is the timestamp when Upsolver discovers the new file(s)
  • JDBC data sources time is when Upsolver discovers a new or updated row
Topic Name
output property
Name of topic this output is written to.
Total Values
data source
The total number of values ingested for this field.
Unresolved Errors
data source
Number of unresolved errors stemming from outputs created from this data source.
Use SSL
data source property
Whether or not to use SSL.
User
moitoring systems property
The InfluxDB user.
Value Distribution
data source
The percentage distribution of the field values. These distribution values can be exported by clicking Export.
Window
output property
Window of time to keep in lookup table.
Workspaces
cluster property
The workspaces attached to this cluster.
Write Logs to Upsolver
cluster property
Whether or not to write logs to Upsolver's S3 environment.
Written Files
data source
Number of files written to outputs from this data source.
Copy link