The `mock` mapping works similarly to the `null` mapping in that it creates an empty output. But instead of requiring you to explicitly specify a schema for the empty output, the `mock` mapping fetches the schema from a different mapping. This makes the mapping most useful in tests. In addition, it is also possible to manually specify records to be returned, which makes this mapping even more convenient for mocking.
The following examples show how a `mock` mapping can be specified:

```yaml
mappings:
  # This will mock `some_mapping`
  some_mapping:
    kind: mock
```
```yaml
mappings:
  some_other_mapping:
    kind: mock
    mapping: some_mapping
    records:
      - [1,2,"some_string",""]
      - [2,null,"cat","black"]
```
```yaml
mappings:
  some_mapping:
    kind: mock
    mapping: some_mapping
    records:
      - Campaign ID: DIR_36919
        LineItem ID: DIR_260390
        Site ID: 23374
        Creative ID: 292668
        Placement ID: 108460
      - Campaign ID: DIR_36919
        LineItem ID: DIR_260390
        Site ID: 23374
        Creative ID: 292668
        Placement ID: 108460
```
The mapping supports the following fields:
- `kind` (mandatory) (type: string): `mock`
- `broadcast` (optional) (type: boolean) (default: false): Hint for broadcasting the result of this mapping for map-side joins.
- `cache` (optional) (type: string) (default: NONE): Cache mode for the results of this mapping. Supported values are the Spark storage levels `NONE`, `DISK_ONLY`, `MEMORY_ONLY`, `MEMORY_ONLY_SER`, `MEMORY_AND_DISK` and `MEMORY_AND_DISK_SER`.
- `mapping` (optional) (type: string): Specifies the name of the mapping to be mocked. If no name is given, a mapping with the same name will be mocked. Note that this only works when the mock is used as an override mapping in test cases; otherwise an infinite loop would be created by the mapping referencing itself (see the sketch after this list).
- `records` (optional) (type: list:array) (default: empty): An optional list of records to be returned.
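As a sketch of the self-mocking override described above: the following assumes a Flowman test specification with an `overrideMappings` section; the test name and the record values are illustrative, not taken from the original examples.

```yaml
tests:
  test_some_mapping:
    # Replaces the real `some_mapping` within this test only
    overrideMappings:
      some_mapping:
        kind: mock
        # No `mapping` field given, so the mock targets the mapping
        # with the same name, which only works inside a test override
        records:
          - [1,2,"some_string",""]
          - [2,null,"cat","black"]
```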
The `mock` mapping provides the same outputs as the mocked mapping.
Mocking mappings is very useful for creating meaningful tests, but you need to take one important fact into account: mocking for testing only works well if Flowman doesn't need to read real data. Access to real data can still occur when Flowman doesn't have all schema information in the specification and therefore falls back to Spark for schema inference. You should always avoid this. The best way to avoid automatic schema inference with Spark is to explicitly specify schema definitions in all relations and/or to explicitly specify all columns with data types in all mappings where this is possible.
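To make that concrete, here is a sketch of a relation with an explicitly specified schema, so Spark never needs to infer one. The relation name, location, format, and the `inline` schema kind are illustrative assumptions rather than a prescription for your project.

```yaml
relations:
  some_relation:
    kind: file
    format: csv
    location: "/data/campaigns/"   # illustrative path
    schema:
      # Explicit schema definition, avoiding Spark schema inference
      kind: inline
      fields:
        - name: Campaign ID
          type: string
        - name: LineItem ID
          type: string
```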