apache_beam.ml.anomaly.univariate.mad module
Trackers for calculating the median absolute deviation (MAD) in a windowed fashion.
- class apache_beam.ml.anomaly.univariate.mad.MadTracker(*args, **kwargs)[source]
Bases: BaseTracker
Tracks the Median Absolute Deviation (MAD) of a stream of values.
This class calculates the MAD, a robust measure of statistical dispersion, in an online setting.
Similar functionality is available in the River library: https://github.com/online-ml/river/blob/main/river/stats/mad.py
Important
This online version of MAD does not exactly match its batch counterpart. In a streaming context, where the true median is initially unknown, an iterative estimation process is used. For each incoming data point, the estimated median is updated first, and then the absolute difference between the data point and this updated median is calculated. To maintain computational efficiency, previously calculated absolute differences are not recalculated when the median changes later, so the reported MAD can differ from a batch computation over the same values.
- Parameters:
median_tracker – An optional MedianTracker instance for tracking the median of the input values. If None, a default MedianTracker is created.
diff_median_tracker – An optional MedianTracker instance for tracking the median of the absolute deviations from the median. If None, a default MedianTracker is created.
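The update order described in the Important note above can be illustrated with a minimal, self-contained sketch. It does not use Apache Beam: `SimpleMedianTracker`, `SimpleMadTracker`, and `batch_mad` are hypothetical names introduced only for illustration, and the stand-in median tracker keeps every value instead of a bounded window. It is meant to show why the online result can drift from the batch result, not how the library implements it.

```python
import math
import statistics


class SimpleMedianTracker:
    """Hypothetical stand-in: stores all values and recomputes the median."""

    def __init__(self):
        self._values = []

    def push(self, x):
        self._values.append(x)

    def get(self):
        # NaN signals "no data yet", mirroring the documented behavior.
        return statistics.median(self._values) if self._values else math.nan


class SimpleMadTracker:
    """Illustrative online MAD built from two median trackers (assumption)."""

    def __init__(self, median_tracker=None, diff_median_tracker=None):
        if median_tracker is None:
            median_tracker = SimpleMedianTracker()
        if diff_median_tracker is None:
            diff_median_tracker = SimpleMedianTracker()
        self._median_tracker = median_tracker
        self._diff_median_tracker = diff_median_tracker

    def push(self, x):
        # 1. Update the running median with the new value.
        self._median_tracker.push(x)
        median = self._median_tracker.get()
        # 2. Record |x - median| once; earlier differences are never revised.
        self._diff_median_tracker.push(abs(x - median))

    def get(self):
        return self._diff_median_tracker.get()

    def get_median(self):
        return self._median_tracker.get()


def batch_mad(values):
    """Batch MAD: every deviation is measured against the final median."""
    m = statistics.median(values)
    return statistics.median(abs(v - m) for v in values)


data = [10, 1, 2, 3, 4]
tracker = SimpleMadTracker()
for value in data:
    tracker.push(value)

print(tracker.get_median())  # 3
print(tracker.get())         # 0.5 (online estimate)
print(batch_mad(data))       # 1   (batch value)
```

For this input the online estimate is 0.5 while the batch MAD is 1, because the first value (10) was compared against a median of 10 at the time it arrived and that difference of 0 was never recomputed.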
- push(x)[source]
Adds a new value to the tracker and updates the MAD.
- Parameters:
x – The value to be added to the tracked stream.
- get()[source]
Retrieves the current MAD value.
- Returns:
The MAD of the values within the defined window. Returns NaN if the window is empty.
- Return type:
- get_median()[source]
Retrieves the current median value.
- Returns:
The median of the values within the defined window. Returns NaN if the window is empty.
- Return type:
- MadTracker__spec_type = 'MadTracker'
- classmethod from_spec(spec: Spec, _run_init: bool = True) Self | type[Self]
Generate a Specifiable subclass object based on a spec.
- Parameters:
spec – the specification of a Specifiable subclass object
_run_init – whether to call __init__ or not for the initial instantiation
- Returns:
the Specifiable subclass object
- Return type:
Self
- run_original_init() None
Execute the original __init__ method with its saved arguments.
For instances of the Specifiable class, initialization is deferred (lazy initialization). This function forces the execution of the original __init__ method using the arguments captured during the object’s initial instantiation.
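The deferred-initialization idea can be sketched in plain Python as follows. This is not the Specifiable implementation: `LazyInitExample`, `_original_init`, and the `window_size` argument are hypothetical names chosen for illustration, and making the second call a no-op is an assumption of this sketch.

```python
class LazyInitExample:
    """Hypothetical class whose real initialization is deferred."""

    def __init__(self, *args, **kwargs):
        # Only capture the arguments; no real setup happens yet.
        self._init_args = args
        self._init_kwargs = kwargs
        self._initialized = False

    def _original_init(self, window_size=10):
        # The "real" constructor logic that was postponed.
        self.window_size = window_size
        self.values = []

    def run_original_init(self):
        # Replay the captured arguments into the original initializer.
        # Assumption of this sketch: repeated calls do nothing.
        if not self._initialized:
            self._original_init(*self._init_args, **self._init_kwargs)
            self._initialized = True


obj = LazyInitExample(window_size=5)
print(hasattr(obj, "window_size"))  # False: setup has not run yet
obj.run_original_init()
print(obj.window_size)              # 5
```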
- classmethod spec_type()
- to_spec() Spec
Generate a spec from a Specifiable subclass object.
- Returns:
The specification of the instance.
- Return type:
- classmethod unspecifiable()