Mock Relation#
A mock
relation works similar to a empty
relation in the sense that it does only return empty data.
The main difference is that a mock
relation picks up the schema from a different relation. It’s main use case is
within test cases where you want to replace physical data sources by empty mocked data sources with a minimum amount
of work.
Example#
relations:
mocked_relation:
kind: mock
relation: real_relation
relations:
some_relation:
kind: mock
relation: some_relation
records:
- [1,2,"some_string",""]
- [2,null,"cat","black"]
relations:
data_raw:
kind: mock
records:
- Campaign ID: DIR_36919
LineItemID ID: DIR_260390
SiteID ID: 23374
CreativeID ID: 292668
PlacementID ID: 108460
- Campaign ID: DIR_36919
LineItemID ID: DIR_260390
SiteID ID: 23374
CreativeID ID: 292668
PlacementID ID: 108460
Fields#
kind
(mandatory) (string):mock
relation
(optional) (string) (default: empty): Specify the base relation to be mocked. If no relation is specified, a relation with the same name will be mocked. Of course this doesn’t work within the same project on project level. But it works well when themock
relation is created inside a test.records
(optional) (type: list:array) (default: empty): An optional list of records to be returned. Note that this list needs to include values for any partition columns of the mocked relation. The partition values need to be appended at the end.
Output Modes#
A mock
relation supports all output modes, each of them simply discarding all records.
Remarks#
Mocking relations is very useful for creating meaningful tests. But you need to take into account one important fact: Mocking for testing only works well if Flowman doesn’t need to read real data. Access to real data might occur when Flowman doesn’t have all schema information in the specification and therefore falls back to Spark doing schema inference. This is something you always should avoid. The best way to avoid automatic schema inference with Spark is to explicitly specify schema definitions in all relations.