apache_beam.ml.anomaly.univariate.mad module

Trackers for calculating the median absolute deviation in a windowed fashion.

class apache_beam.ml.anomaly.univariate.mad.MadTracker(*args, **kwargs)

Bases: BaseTracker

Tracks the Median Absolute Deviation (MAD) of a stream of values.

This class calculates the MAD, a robust measure of statistical dispersion, in an online setting.

Similar functionality is available in the River library: https://github.com/online-ml/river/blob/main/river/stats/mad.py

Important

This online version of MAD does not exactly match its batch counterpart. In a streaming data context, where the true median is initially unknown, we employ an iterative estimation process. For each incoming data point, we first update the estimated median, and then calculate the absolute difference between the data point and this updated median. To maintain computational efficiency, previously calculated absolute differences are not recalculated with each subsequent median update.
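The update rule described in this note can be sketched in plain Python. This is an illustrative approximation, not the Beam implementation: `RunningMedian`, `OnlineMad`, and `batch_mad` are hypothetical names introduced here, and the helper keeps every value seen so far rather than a bounded window.

```python
import bisect
import math
import statistics


class RunningMedian:
    """Hypothetical helper: exact median over every value pushed so far."""

    def __init__(self):
        self._sorted = []

    def push(self, x):
        # Keep the values sorted so the median is cheap to read.
        bisect.insort(self._sorted, x)

    def get(self):
        if not self._sorted:
            return math.nan
        return statistics.median(self._sorted)


class OnlineMad:
    """Sketch of the online MAD update (assumed structure, not Beam's code)."""

    def __init__(self):
        self._median = RunningMedian()
        self._diff_median = RunningMedian()

    def push(self, x):
        # 1. Update the median estimate with the new value.
        self._median.push(x)
        # 2. Record the deviation from the *updated* median.
        #    Deviations recorded earlier are never recomputed.
        self._diff_median.push(abs(x - self._median.get()))

    def get(self):
        return self._diff_median.get()

    def get_median(self):
        return self._median.get()


def batch_mad(values):
    """Batch MAD: every deviation is measured against the final median."""
    m = statistics.median(values)
    return statistics.median(abs(v - m) for v in values)


values = [10, 1, 2, 3, 4]
tracker = OnlineMad()
for v in values:
    tracker.push(v)

print(tracker.get_median())  # 3
print(tracker.get())         # 0.5 (online estimate)
print(batch_mad(values))     # 1   (batch result)
```

For this input the online estimate (0.5) differs from the batch MAD (1) because the early deviations were measured against median estimates that later shifted.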

Parameters:
  • median_tracker – An optional MedianTracker instance for tracking the median of the input values. If None, a default MedianTracker is created.

  • diff_median_tracker – An optional MedianTracker instance for tracking the median of the absolute deviations from the median. If None, a default MedianTracker is created.

push(x)

Adds a new value to the tracker and updates the MAD.

Parameters:

x – The value to be added to the tracked stream.

get()

Retrieves the current MAD value.

Returns:

The MAD of the values within the defined window. Returns NaN if the window is empty.

Return type:

float

get_median()

Retrieves the current median value.

Returns:

The median of the values within the defined window. Returns NaN if the window is empty.

Return type:

float

MadTracker__spec_type = 'MadTracker'
classmethod from_spec(spec: Spec, _run_init: bool = True) → Self | type[Self]

Generate a Specifiable subclass object based on a spec.

Parameters:
  • spec – the specification of a Specifiable subclass object

  • _run_init – whether to call __init__ or not for the initial instantiation

Returns:

the Specifiable subclass object

Return type:

Self

run_original_init() → None

Execute the original __init__ method with its saved arguments.

For instances of the Specifiable class, initialization is deferred (lazy initialization). This function forces the execution of the original __init__ method using the arguments captured during the object’s initial instantiation.

classmethod spec_type()
to_spec() → Spec

Generate a spec from a Specifiable subclass object.

Returns:

The specification of the instance.

Return type:

Spec

classmethod unspecifiable()