public abstract static class Group.CombineFieldsByFields<InputT> extends Group.AggregateCombiner<InputT>
PTransform
that does a per-key combine using an aggregation built up by calls to
aggregateField and aggregateFields. The output of this transform will have a schema that is
determined by the output types of all the composed combiners.Modifier and Type | Class and Description |
---|---|
static class |
Group.CombineFieldsByFields.Fanout |
name, resourceHints
Constructor and Description |
---|
CombineFieldsByFields() |
Modifier and Type | Method and Description |
---|---|
<CombineInputT,AccumT,CombineOutputT> |
aggregateField(int inputFieldId,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
Schema.Field outputField)
Build up an aggregation function over the input elements.
|
<CombineInputT,AccumT,CombineOutputT> |
aggregateField(int inputFieldId,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
java.lang.String outputFieldName) |
<CombineInputT,AccumT,CombineOutputT> |
aggregateField(java.lang.String inputFieldName,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
Schema.Field outputField)
Build up an aggregation function over the input elements.
|
<CombineInputT,AccumT,CombineOutputT> |
aggregateField(java.lang.String inputFieldName,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
java.lang.String outputFieldName)
Build up an aggregation function over the input elements.
|
<CombineInputT,AccumT,CombineOutputT> |
aggregateFieldBaseValue(int inputFieldId,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
Schema.Field outputField) |
<CombineInputT,AccumT,CombineOutputT> |
aggregateFieldBaseValue(int inputFieldId,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
java.lang.String outputFieldName) |
<CombineInputT,AccumT,CombineOutputT> |
aggregateFieldBaseValue(java.lang.String inputFieldName,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
Schema.Field outputField) |
<CombineInputT,AccumT,CombineOutputT> |
aggregateFieldBaseValue(java.lang.String inputFieldName,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
java.lang.String outputFieldName) |
<CombineInputT,AccumT,CombineOutputT> |
aggregateFields(FieldAccessDescriptor fieldsToAggregate,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
Schema.Field outputField)
Build up an aggregation function over the input elements.
|
<CombineInputT,AccumT,CombineOutputT> |
aggregateFields(FieldAccessDescriptor fieldsToAggregate,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
java.lang.String outputFieldName)
Build up an aggregation function over the input elements.
|
<CombineInputT,AccumT,CombineOutputT> |
aggregateFields(java.util.List<java.lang.String> inputFieldNames,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
Schema.Field outputField)
Build up an aggregation function over the input elements.
|
<CombineInputT,AccumT,CombineOutputT> |
aggregateFields(java.util.List<java.lang.String> inputFieldNames,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
java.lang.String outputFieldName)
Build up an aggregation function over the input elements.
|
<CombineInputT,AccumT,CombineOutputT> |
aggregateFieldsById(java.util.List<java.lang.Integer> inputFieldIds,
Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn,
Schema.Field outputField)
Build up an aggregation function over the input elements by field id.
|
PCollection<Row> |
expand(PCollection<InputT> input)
Override this method to specify how this
PTransform should be expanded on the given
InputT . |
Group.CombineFieldsByFields<InputT> |
withHotKeyFanout(int n) |
Group.CombineFieldsByFields<InputT> |
withHotKeyFanout(SerializableFunction<Row,java.lang.Integer> f) |
Group.CombineFieldsByFields<InputT> |
withKeyField(java.lang.String keyField)
Set the name of the key field in the resulting schema.
|
Group.CombineFieldsByFields<InputT> |
withPrecombining(boolean value)
Enable precombining.
|
Group.CombineFieldsByFields<InputT> |
witValueField(java.lang.String valueField)
Set the name of the value field in the resulting schema.
|
compose, compose, getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, populateDisplayData, setResourceHints, toString, validate, validate
public Group.CombineFieldsByFields<InputT> withKeyField(java.lang.String keyField)
public Group.CombineFieldsByFields<InputT> witValueField(java.lang.String valueField)
public Group.CombineFieldsByFields<InputT> withPrecombining(boolean value)
This is on by default. In certain cases (e.g. if there are many unique field values and the combiner's intermediate state is larger than the average row size) precombining makes things worse, in which case it can be turned off.
public Group.CombineFieldsByFields<InputT> withHotKeyFanout(int n)
public Group.CombineFieldsByFields<InputT> withHotKeyFanout(SerializableFunction<Row,java.lang.Integer> f)
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateField(java.lang.String inputFieldName, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, java.lang.String outputFieldName)
This method specifies an aggregation over single field of the input. The union of all calls to aggregateField and aggregateFields will determine the output schema.
Field types in the output schema will be inferred from the provided combine function. Sometimes the field type cannot be inferred due to Java's type erasure. In that case, use the overload that allows setting the output field type explicitly.
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateFieldBaseValue(java.lang.String inputFieldName, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, java.lang.String outputFieldName)
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateField(int inputFieldId, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, java.lang.String outputFieldName)
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateFieldBaseValue(int inputFieldId, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, java.lang.String outputFieldName)
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateField(java.lang.String inputFieldName, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, Schema.Field outputField)
This method specifies an aggregation over single field of the input. The union of all calls to aggregateField and aggregateFields will determine the output schema.
aggregateField
in class Group.AggregateCombiner<InputT>
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateFieldBaseValue(java.lang.String inputFieldName, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, Schema.Field outputField)
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateField(int inputFieldId, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, Schema.Field outputField)
Group.AggregateCombiner
This method specifies an aggregation over single field of the input. The union of all calls to aggregateField and aggregateFields will determine the output schema.
aggregateField
in class Group.AggregateCombiner<InputT>
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateFieldBaseValue(int inputFieldId, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, Schema.Field outputField)
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateFields(java.util.List<java.lang.String> inputFieldNames, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, java.lang.String outputFieldName)
This method specifies an aggregation over multiple fields of the input. The union of all calls to aggregateField and aggregateFields will determine the output schema.
Field types in the output schema will be inferred from the provided combine function. Sometimes the field type cannot be inferred due to Java's type erasure. In that case, use the overload that allows setting the output field type explicitly.
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateFields(FieldAccessDescriptor fieldsToAggregate, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, java.lang.String outputFieldName)
This method specifies an aggregation over multiple fields of the input. The union of all calls to aggregateField and aggregateFields will determine the output schema.
Field types in the output schema will be inferred from the provided combine function. Sometimes the field type cannot be inferred due to Java's type erasure. In that case, use the overload that allows setting the output field type explicitly.
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateFields(java.util.List<java.lang.String> inputFieldNames, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, Schema.Field outputField)
This method specifies an aggregation over multiple fields of the input. The union of all calls to aggregateField and aggregateFields will determine the output schema.
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateFieldsById(java.util.List<java.lang.Integer> inputFieldIds, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, Schema.Field outputField)
Group.AggregateCombiner
This method specifies an aggregation over multiple fields of the input. The union of all calls to aggregateField and aggregateFields will determine the output schema.
Field types in the output schema will be inferred from the provided combine function. Sometimes the field type cannot be inferred due to Java's type erasure. In that case, use the overload that allows setting the output field type explicitly.
aggregateFieldsById
in class Group.AggregateCombiner<InputT>
public <CombineInputT,AccumT,CombineOutputT> Group.CombineFieldsByFields<InputT> aggregateFields(FieldAccessDescriptor fieldsToAggregate, Combine.CombineFn<CombineInputT,AccumT,CombineOutputT> fn, Schema.Field outputField)
This method specifies an aggregation over multiple fields of the input. The union of all calls to aggregateField and aggregateFields will determine the output schema.
public PCollection<Row> expand(PCollection<InputT> input)
PTransform
PTransform
should be expanded on the given
InputT
.
NOTE: This method should not be called directly. Instead apply the PTransform
should
be applied to the InputT
using the apply
method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
expand
in class PTransform<PCollection<InputT>,PCollection<Row>>