Latest Mapping#
The latest mapping keeps only the latest (newest) record per ID. This is useful
when working with streams of change events, and you only want to keep the newest
event for each ID.
Example#
mappings:
latest_customer_updates:
kind: latest
input: all_customer_updates
versionColumns: ts
keyColumns: customer_id
Fields#
kind(mandatory) (string):latestbroadcast(optional) (type: boolean) (default: false): Hint for broadcasting the result of this mapping for map-side joins.cache(optional) (type: string) (default: NONE): Cache mode for the results of this mapping. Supported values areNONE- Disables caching of teh results of this mappingDISK_ONLY- Caches the results on diskMEMORY_ONLY- Caches the results in memory. If not enough memory is available, records will be uncached.MEMORY_ONLY_SER- Caches the results in memory in a serialized format. If not enough memory is available, records will be uncached.MEMORY_AND_DISK- Caches the results first in memory and then spills to disk.MEMORY_AND_DISK_SER- Caches the results first in memory in a serialized format and then spills to disk.
input(mandatory) (string):versionColumnsSpecifies the columns where the version (or timestamp) is contained. For each ID only the record with the highest value will be kept.keyColumnsSpecifies one or more columns forming a primary key or ID. Different versions of the same entity are then distinguished by theversionColumns
Outputs#
main- the only output of the mapping