Conform Mapping#

The conform mapping performs simply name and type mangling transformations to conform data to some standard. For example, you can replace all date columns by timestamp columns (this is required for older versions of Hive) or you can transform column names from camel case to snake case to better match SQL.

Example#

mappings:
  partial_facts:
    kind: conform
    input: facts
    naming: snakeCase
    types:
      date: timestamp

Fields#

kind (mandatory) (type: string): conform
broadcast (optional) (type: boolean) (default: false): Hint for broadcasting the result of this mapping for map-side joins.
cache (optional) (type: string) (default: NONE): Cache mode for the results of this mapping. Supported values are
- NONE
- DISK_ONLY
- MEMORY_ONLY
- MEMORY_ONLY_SER
- MEMORY_AND_DISK
- MEMORY_AND_DISK_SER
input (mandatory) (type: string): Specifies the name of the input mapping to be conformed.
naming (optional) (type: string): Specifies the naming scheme used for the output. The following values are supported:
- camelCase - This will
- snakeCase
- camelCaseUpper
types (optional) (type: map:string): Specifies the list of types and how they should be replaced. The following types can be specified as source types:
- BYTE or TNINYINT
- SHORT or SMALLINT
- INT or INTEGER
- LONG or BIGINT
- BOOLEAN or BOOL
- FLOAT
- DOUBLE
- DECIMAL
- STRING or TEXT
- DURATION
- TIMESTAMP
- DATE Note that both CHAR(n) and VARCHAR(n) are matched to the entry for STRING type.
flatten (optional) (type: boolean) (default: false): Flattens all nested structs into a flat list of columns if set to true
filter (optional) (type: string) (default: empty): An optional SQL filter expression that is applied after conforming.

Outputs#

main - the only output of the mapping