apache_beam.ml.anomaly.aggregations module

class apache_beam.ml.anomaly.aggregations.LabelAggregation(agg_func: Callable[[Iterable[int]], int], agg_model_id: str | None = None, include_source_predictions: bool = False, normal_label: int = 0, outlier_label: int = 1, missing_label: int = -2)[source]

Bases: AggregationFn, _AggModelIdMixin, _SourcePredictionMixin

Aggregates anomaly predictions based on their labels.

This is an abstract base class for AggregationFn`s that combine multiple `AnomalyPrediction objects into a single AnomalyPrediction based on the labels of the input predictions.

Parameters:
  • agg_func (Callable[[Iterable[int]], int]) – A function that aggregates a collection of anomaly labels (integers) into a single label.

  • agg_model_id (Optional[str]) – The model id used in aggregated predictions. Defaults to None.

  • include_source_predictions (bool) – If True, include the input predictions in the source_predictions of the output. Defaults to False.

apply(predictions: Iterable[AnomalyPrediction]) AnomalyPrediction[source]

Applies the label aggregation function to a list of predictions.

Parameters:

predictions (Iterable[AnomalyPrediction]) – A collection of AnomalyPrediction objects to be aggregated.

Returns:

A single AnomalyPrediction object with the

aggregated label. The aggregated label is determined as follows:

  • If there are any non-missing and non-error labels, the agg_func is applied to aggregate them.

  • If all labels are error labels (None), the aggregated label is also None.

  • If there are a mix of missing and error labels, the aggregated label is the missing_label.

Return type:

AnomalyPrediction

class apache_beam.ml.anomaly.aggregations.ScoreAggregation(agg_func: Callable[[Iterable[float]], float], agg_model_id: str | None = None, include_source_predictions: bool = False)[source]

Bases: AggregationFn, _AggModelIdMixin, _SourcePredictionMixin

Aggregates anomaly predictions based on their scores.

This is an abstract base class for AggregationFn`s that combine multiple `AnomalyPrediction objects into a single AnomalyPrediction based on the scores of the input predictions.

Parameters:
  • agg_func (Callable[[Iterable[float]], float]) – A function that aggregates a collection of anomaly scores (floats) into a single score.

  • agg_model_id (Optional[str]) – The model id used in aggregated predictions. Defaults to None.

  • include_source_predictions (bool) – If True, include the input predictions in the source_predictions of the output. Defaults to False.

apply(predictions: Iterable[AnomalyPrediction]) AnomalyPrediction[source]

Applies the score aggregation function to a list of predictions.

Parameters:

predictions (Iterable[AnomalyPrediction]) – A collection of AnomalyPrediction objects to be aggregated.

Returns:

A single AnomalyPrediction object with the

aggregated score. The aggregated score is determined as follows:

  • If there are any non-missing and non-error scores, the agg_func is applied to aggregate them.

  • If all scores are error scores (None), the aggregated score is also None.

  • If there are a mix of missing (NaN) and error scores (None), the aggregated score is NaN.

Return type:

AnomalyPrediction

class apache_beam.ml.anomaly.aggregations.MajorityVote(*args, **kwargs)[source]

Bases: LabelAggregation

Aggregates anomaly labels using majority voting.

This AggregationFn implements a majority voting strategy to combine anomaly labels from multiple AnomalyPrediction objects. It counts the occurrences of normal and outlier labels and selects the label with the higher count as the aggregated label. In case of a tie, a tie-breaker label is used.

Example

If input labels are [normal, outlier, outlier, normal, outlier], and normal_label=0, outlier_label=1, then the aggregated label will be outlier (1) because outliers have a majority (3 vs 2).

Parameters:
  • normal_label (int) – The integer label for normal predictions. Defaults to 0.

  • outlier_label (int) – The integer label for outlier predictions. Defaults to 1.

  • tie_breaker (int) – The label to return if there is a tie in votes. Defaults to 0 (normal_label).

  • **kwargs – Additional keyword arguments to pass to the base LabelAggregation class.

MajorityVote__spec_type = 'MajorityVote'
classmethod from_spec(spec: Spec, _run_init: bool = True) Self | type[Self]

Generate a Specifiable subclass object based on a spec.

Parameters:
  • spec – the specification of a Specifiable subclass object

  • _run_init – whether to call __init__ or not for the initial instantiation

Returns:

the Specifiable subclass object

Return type:

Self

run_original_init() None

Execute the original __init__ method with its saved arguments.

For instances of the Specifiable class, initialization is deferred (lazy initialization). This function forces the execution of the original __init__ method using the arguments captured during the object’s initial instantiation.

classmethod spec_type()
to_spec() Spec

Generate a spec from a Specifiable subclass object.

Returns:

The specification of the instance.

Return type:

Spec

classmethod unspecifiable()
class apache_beam.ml.anomaly.aggregations.AllVote(*args, **kwargs)[source]

Bases: LabelAggregation

Aggregates anomaly labels using an “all vote” (AND) scheme.

This AggregationFn implements an “all vote” strategy. It aggregates anomaly labels such that the result is considered an outlier only if all input AnomalyPrediction objects are labeled as outliers.

Example

If input labels are [outlier, outlier, outlier], and outlier_label=1, then the aggregated label will be outlier (1). If input labels are [outlier, normal, outlier], and outlier_label=1, then the aggregated label will be normal (0).

Parameters:
  • normal_label (int) – The integer label for normal predictions. Defaults to 0.

  • outlier_label (int) – The integer label for outlier predictions. Defaults to 1.

  • **kwargs – Additional keyword arguments to pass to the base LabelAggregation class.

AllVote__spec_type = 'AllVote'
classmethod from_spec(spec: Spec, _run_init: bool = True) Self | type[Self]

Generate a Specifiable subclass object based on a spec.

Parameters:
  • spec – the specification of a Specifiable subclass object

  • _run_init – whether to call __init__ or not for the initial instantiation

Returns:

the Specifiable subclass object

Return type:

Self

run_original_init() None

Execute the original __init__ method with its saved arguments.

For instances of the Specifiable class, initialization is deferred (lazy initialization). This function forces the execution of the original __init__ method using the arguments captured during the object’s initial instantiation.

classmethod spec_type()
to_spec() Spec

Generate a spec from a Specifiable subclass object.

Returns:

The specification of the instance.

Return type:

Spec

classmethod unspecifiable()
class apache_beam.ml.anomaly.aggregations.AnyVote(*args, **kwargs)[source]

Bases: LabelAggregation

Aggregates anomaly labels using an “any vote” (OR) scheme.

This AggregationFn implements an “any vote” strategy. It aggregates anomaly labels such that the result is considered an outlier if at least one of the input AnomalyPrediction objects is labeled as an outlier.

Example

If input labels are [normal, normal, outlier], and outlier_label=1, then the aggregated label will be outlier (1). If input labels are [normal, normal, normal], and outlier_label=1, then the aggregated label will be normal (0).

Parameters:
  • normal_label (int) – The integer label for normal predictions. Defaults to 0.

  • outlier_label (int) – The integer label for outlier predictions. Defaults to 1.

  • **kwargs – Additional keyword arguments to pass to the base LabelAggregation class.

AnyVote__spec_type = 'AnyVote'
classmethod from_spec(spec: Spec, _run_init: bool = True) Self | type[Self]

Generate a Specifiable subclass object based on a spec.

Parameters:
  • spec – the specification of a Specifiable subclass object

  • _run_init – whether to call __init__ or not for the initial instantiation

Returns:

the Specifiable subclass object

Return type:

Self

run_original_init() None

Execute the original __init__ method with its saved arguments.

For instances of the Specifiable class, initialization is deferred (lazy initialization). This function forces the execution of the original __init__ method using the arguments captured during the object’s initial instantiation.

classmethod spec_type()
to_spec() Spec

Generate a spec from a Specifiable subclass object.

Returns:

The specification of the instance.

Return type:

Spec

classmethod unspecifiable()
class apache_beam.ml.anomaly.aggregations.AverageScore(*args, **kwargs)[source]

Bases: ScoreAggregation

Aggregates anomaly scores by calculating their average.

This AggregationFn computes the average of the anomaly scores from a collection of AnomalyPrediction objects.

Parameters:

**kwargs – Additional keyword arguments to pass to the base ScoreAggregation class.

AverageScore__spec_type = 'AverageScore'
classmethod from_spec(spec: Spec, _run_init: bool = True) Self | type[Self]

Generate a Specifiable subclass object based on a spec.

Parameters:
  • spec – the specification of a Specifiable subclass object

  • _run_init – whether to call __init__ or not for the initial instantiation

Returns:

the Specifiable subclass object

Return type:

Self

run_original_init() None

Execute the original __init__ method with its saved arguments.

For instances of the Specifiable class, initialization is deferred (lazy initialization). This function forces the execution of the original __init__ method using the arguments captured during the object’s initial instantiation.

classmethod spec_type()
to_spec() Spec

Generate a spec from a Specifiable subclass object.

Returns:

The specification of the instance.

Return type:

Spec

classmethod unspecifiable()
class apache_beam.ml.anomaly.aggregations.MaxScore(*args, **kwargs)[source]

Bases: ScoreAggregation

Aggregates anomaly scores by selecting the maximum score.

This AggregationFn selects the highest anomaly score from a collection of AnomalyPrediction objects as the aggregated score.

Parameters:

**kwargs – Additional keyword arguments to pass to the base ScoreAggregation class.

MaxScore__spec_type = 'MaxScore'
classmethod from_spec(spec: Spec, _run_init: bool = True) Self | type[Self]

Generate a Specifiable subclass object based on a spec.

Parameters:
  • spec – the specification of a Specifiable subclass object

  • _run_init – whether to call __init__ or not for the initial instantiation

Returns:

the Specifiable subclass object

Return type:

Self

run_original_init() None

Execute the original __init__ method with its saved arguments.

For instances of the Specifiable class, initialization is deferred (lazy initialization). This function forces the execution of the original __init__ method using the arguments captured during the object’s initial instantiation.

classmethod spec_type()
to_spec() Spec

Generate a spec from a Specifiable subclass object.

Returns:

The specification of the instance.

Return type:

Spec

classmethod unspecifiable()