@Internal public class VarianceFn<T extends java.lang.Number> extends Combine.CombineFn<T,org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator,T>
Combine.CombineFn for Variance on Number types.
Calculates population variance and sample variance using incremental formulas described, for example, by Chan, Golub, and LeVeque in "Algorithms for computing the sample variance: analysis and recommendations", The American Statistician, 37 (1983), pp. 242-247.
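If variance is defined like this:
   Input elements: (x[1], ... , x[n])
   Average of all elements in the input: mean(x) = sum(x) / n
   Deviation of the ith element from the current mean: deviation(x, i) = x[i] - mean(x)
   Variance: variance(x) = deviation(x, 1)^2 + ... + deviation(x, n)^2
Note that variance(x) as defined here is the unnormalized sum of squared deviations; dividing it by n gives the population variance, and dividing it by n - 1 gives the sample variance.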
Then the variance of the combined input of 2 samples (x[1], ... , x[n]) and (y[1], ... , y[m]) is calculated using this formula:
variance(concat(x,y)) = variance(x) + variance(y) + increment, where:
   increment = n/(m(m+n)) * (m/n * sum(x) - sum(y))^2
Written in terms of the means, the same increment is n*m/(n+m) * (mean(x) - mean(y))^2.
This also applies to a single-element increment, assuming that the variance of a single-element input is zero.
To implement the above formula, we keep track of the current variance, sum, and count of elements, and then apply the formula whenever a new element arrives or we need to merge the variances of 2 samples.
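The bookkeeping described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not Beam's VarianceAccumulator: the class RunningVariance and all of its members are hypothetical names, it uses double rather than BigDecimal, and it writes the increment in terms of the two means, n*m/(n+m) * (mean(x) - mean(y))^2, which is the pairwise update from Chan et al.

```java
/** Hypothetical accumulator for illustration only; not Beam's VarianceAccumulator. */
final class RunningVariance {
    final long count;      // number of elements seen so far
    final double sum;      // sum of the elements
    final double sumSqDev; // sum of squared deviations from the mean

    RunningVariance(long count, double sum, double sumSqDev) {
        this.count = count;
        this.sum = sum;
        this.sumSqDev = sumSqDev;
    }

    public static RunningVariance empty() {
        return new RunningVariance(0, 0.0, 0.0);
    }

    /** A single element has zero variance by definition. */
    public static RunningVariance of(double value) {
        return new RunningVariance(1, value, 0.0);
    }

    /** Merges two partial results: x = this (n elements), y = other (m elements). */
    public RunningVariance merge(RunningVariance other) {
        if (count == 0) {
            return other;
        }
        if (other.count == 0) {
            return this;
        }
        double n = count;
        double m = other.count;
        double meanDiff = sum / n - other.sum / m;
        // increment = n*m/(n+m) * (mean(x) - mean(y))^2
        double increment = n * m / (n + m) * meanDiff * meanDiff;
        return new RunningVariance(
            count + other.count, sum + other.sum, sumSqDev + other.sumSqDev + increment);
    }

    /** Adding one element is a merge with a single-element sample. */
    public RunningVariance add(double value) {
        return merge(of(value));
    }

    public double populationVariance() {
        return sumSqDev / count;
    }

    public double sampleVariance() {
        return sumSqDev / (count - 1);
    }

    public static void main(String[] args) {
        RunningVariance left = empty().add(1).add(2).add(3);
        RunningVariance right = empty().add(4).add(5);
        RunningVariance all = left.merge(right);
        // For the inputs 1..5 these are approximately 2.0 and 2.5.
        System.out.println(all.populationVariance());
        System.out.println(all.sampleVariance());
    }
}
```

Because element-wise addition and merging share one code path, the result does not depend on how the input is split into partial samples, up to floating-point rounding.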
| Modifier and Type | Method and Description |
|---|---|
| org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator | addInput(org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator currentVariance, T rawInput): Adds the given input value to the given accumulator, returning the new accumulator value. |
| org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator | createAccumulator(): Returns a new, mutable accumulator value, representing the accumulation of zero input values. |
| T | extractOutput(org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator accumulator): Returns the output value that is the result of combining all the input values represented by the given accumulator. |
| java.lang.reflect.TypeVariable<?> | getAccumTVariable(): Returns the TypeVariable of AccumT. |
| Coder<org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator> | getAccumulatorCoder(CoderRegistry registry, Coder<T> inputCoder): Returns the Coder to use for accumulator AccumT values, or null if it is not able to be inferred. |
| Coder<OutputT> | getDefaultOutputCoder(CoderRegistry registry, Coder<InputT> inputCoder): Returns the Coder to use by default for output OutputT values, or null if it is not able to be inferred. |
| java.lang.String | getIncompatibleGlobalWindowErrorMessage(): Returns the error message for not supported default values in Combine.globally(). |
| java.lang.reflect.TypeVariable<?> | getInputTVariable(): Returns the TypeVariable of InputT. |
| java.lang.reflect.TypeVariable<?> | getOutputTVariable(): Returns the TypeVariable of OutputT. |
| org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator | mergeAccumulators(java.lang.Iterable<org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator> variances): Returns an accumulator representing the accumulation of all the input values accumulated in the merging accumulators. |
| static <V extends java.lang.Number> VarianceFn | newPopulation(Schema.TypeName typeName) |
| static <V extends java.lang.Number> VarianceFn | newPopulation(SerializableFunction<java.math.BigDecimal,V> decimalConverter) |
| static <V extends java.lang.Number> VarianceFn | newSample(Schema.TypeName typeName) |
| static <V extends java.lang.Number> VarianceFn | newSample(SerializableFunction<java.math.BigDecimal,V> decimalConverter) |
| void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
Methods inherited from class Combine.CombineFn: apply, compact, defaultValue, getInputType, getOutputType

public static <V extends java.lang.Number> VarianceFn newPopulation(Schema.TypeName typeName)

public static <V extends java.lang.Number> VarianceFn newPopulation(SerializableFunction<java.math.BigDecimal,V> decimalConverter)

public static <V extends java.lang.Number> VarianceFn newSample(Schema.TypeName typeName)

public static <V extends java.lang.Number> VarianceFn newSample(SerializableFunction<java.math.BigDecimal,V> decimalConverter)
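
The factory methods above carry no description on this page. As a rough sketch only, the following assumes (this page does not confirm it) that the population and sample variants differ in dividing the accumulated sum of squared deviations by n versus n - 1, and that decimalConverter maps the resulting BigDecimal to the output type V. The class VarianceOutputSketch and the method extract are hypothetical, and java.util.function.Function stands in for Beam's SerializableFunction.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.function.Function;

/** Hypothetical helper for illustration only; not part of Beam. */
final class VarianceOutputSketch {

    /**
     * Divides the sum of squared deviations by n (population) or n - 1 (sample),
     * then converts the BigDecimal result to the requested Number type.
     * Throws ArithmeticException if the divisor is zero, e.g. a sample of one element.
     */
    public static <V extends Number> V extract(
            BigDecimal sumSqDev,
            long count,
            boolean sample,
            Function<BigDecimal, V> decimalConverter) {
        BigDecimal divisor = BigDecimal.valueOf(sample ? count - 1 : count);
        BigDecimal variance = sumSqDev.divide(divisor, MathContext.DECIMAL128);
        return decimalConverter.apply(variance);
    }

    public static void main(String[] args) {
        // Sum of squared deviations for the inputs 1, 2, 3, 4, 5 is 10.
        BigDecimal sumSqDev = new BigDecimal("10");
        Double population = extract(sumSqDev, 5, false, BigDecimal::doubleValue);
        Integer sample = extract(sumSqDev, 5, true, BigDecimal::intValue);
        System.out.println(population); // prints 2.0
        System.out.println(sample);     // prints 2 (2.5 truncated by intValue)
    }
}
```

The second call shows why the converter matters: an integer output type silently truncates the fractional part of the variance.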

public org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator createAccumulator()
Description copied from class: Combine.CombineFn
Returns a new, mutable accumulator value, representing the accumulation of zero input values.
Specified by: createAccumulator in class Combine.CombineFn<T extends java.lang.Number,org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator,T extends java.lang.Number>

public org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator addInput(org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator currentVariance, T rawInput)
Description copied from class: Combine.CombineFn
Adds the given input value to the given accumulator, returning the new accumulator value.
Specified by: addInput in class Combine.CombineFn<T extends java.lang.Number,org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator,T extends java.lang.Number>
Parameters:
currentVariance - may be modified and returned for efficiency
rawInput - should not be mutated

public org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator mergeAccumulators(java.lang.Iterable<org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator> variances)
Description copied from class: Combine.CombineFn
Returns an accumulator representing the accumulation of all the input values accumulated in the merging accumulators.
Specified by: mergeAccumulators in class Combine.CombineFn<T extends java.lang.Number,org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator,T extends java.lang.Number>
Parameters:
variances - only the first accumulator may be modified and returned for efficiency; the other accumulators should not be mutated, because they may be shared with other code and mutating them could lead to incorrect results or data corruption.

public Coder<org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator> getAccumulatorCoder(CoderRegistry registry, Coder<T> inputCoder)
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the Coder to use for accumulator AccumT values, or null if it is not able to be inferred.
By default, uses the knowledge of the Coder being used for InputT values and the enclosing Pipeline's CoderRegistry to try to infer the Coder for AccumT values.
This is the Coder used to send data through a communication-intensive shuffle step, so a compact and efficient representation may have significant performance benefits.
Specified by: getAccumulatorCoder in interface CombineFnBase.GlobalCombineFn<T extends java.lang.Number,org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator,T extends java.lang.Number>

public T extractOutput(org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator accumulator)
Description copied from class: Combine.CombineFn
Returns the output value that is the result of combining all the input values represented by the given accumulator.
Specified by: extractOutput in class Combine.CombineFn<T extends java.lang.Number,org.apache.beam.sdk.extensions.sql.impl.transform.agg.VarianceAccumulator,T extends java.lang.Number>
Parameters:
accumulator - can be modified for efficiency

public Coder<OutputT> getDefaultOutputCoder(CoderRegistry registry, Coder<InputT> inputCoder) throws CannotProvideCoderException
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the Coder to use by default for output OutputT values, or null if it is not able to be inferred.
By default, uses the knowledge of the Coder being used for input InputT values and the enclosing Pipeline's CoderRegistry to try to infer the Coder for OutputT values.
Specified by: getDefaultOutputCoder in interface CombineFnBase.GlobalCombineFn<InputT,AccumT,OutputT>
Throws:
CannotProvideCoderException

public java.lang.String getIncompatibleGlobalWindowErrorMessage()
Description copied from interface: CombineFnBase.GlobalCombineFn
Returns the error message for not supported default values in Combine.globally().
Specified by: getIncompatibleGlobalWindowErrorMessage in interface CombineFnBase.GlobalCombineFn<InputT,AccumT,OutputT>

public java.lang.reflect.TypeVariable<?> getInputTVariable()
Returns the TypeVariable of InputT.

public java.lang.reflect.TypeVariable<?> getAccumTVariable()
Returns the TypeVariable of AccumT.

public java.lang.reflect.TypeVariable<?> getOutputTVariable()
Returns the TypeVariable of OutputT.

public void populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Parameters:
builder - The builder to populate with display data.
See Also:
HasDisplayData