Measure Target#
The measure
target will perform some measurements which then are provided as execution metrics. These measurements
are used to assess data quality. The measures to be taken are specified as measure instances.
Since a measure target needs to be explicitly executed, it will increase your overall job execution time. Since
version 0.30.0, Flowman offers an alternative observe
mapping, which offers similar (but
more limited) capabilities and is much cheaper from an execution time point of view.
Example#
targets:
measures:
kind: measure
measures:
record_stats:
kind: sql
query: "
SELECT
COUNT(*) AS record_count,
MIN(air_temperature) AS min_temperature,
MAX(air_temperature) AS max_temperature
FROM measurements"
These metrics then can be published in a job as follows:
jobs:
main:
targets:
- measures
metrics:
# Add some common labels to all metrics
labels:
force: ${force}
phase: ${phase}
status: ${status}
metrics:
# This metric contains the processing time per output
- name: flowman_output_time
selector:
name: target_runtime
labels:
phase: BUILD
category: target
labels:
output: ${name}
# This metric contains the overall processing time
- name: flowman_processing_time
selector:
name: job_runtime
labels:
phase: BUILD
category: job
# The following metrics have been defined in the "measures" target
- name: record_count
selector:
name: record_count
- name: min_temperature
selector:
name: min_temperature
- name: max_temperature
selector:
name: max_temperature
This example will provide three metrics, record_count
, min_temperature
and max_temperature
, which then can be
sent to a metric sink configured in the namespace.
Provided Metrics#
All metrics defined as named columns are exported with the following labels:
- name
- The name of the measure (i.e. record_stats
above)
- category
- Always set to measure
- kind
- Always set to sql
- namespace
- Name of the namespace (typically default
)
- project
- Name of the project
- version
- Version of the project
Supported Execution Phases#
VERIFY
- The evaluation of all measures will only be performed in theVERIFY
phase
Read more about execution phases.
Dirty Condition#
A measure
target is always dirty for the VERIFY
execution phase.