Class YamlTransform<InputT extends PInput,OutputT extends POutput>

java.lang.Object
org.apache.beam.sdk.transforms.PTransform<InputT,OutputT>
org.apache.beam.sdk.extensions.yaml.YamlTransform<InputT,OutputT>
Type Parameters:
InputT - the type of the input to this PTransform
OutputT - the type of the output to this PTransform
All Implemented Interfaces:
Serializable, HasDisplayData

public class YamlTransform<InputT extends PInput,OutputT extends POutput> extends PTransform<InputT,OutputT>
Allows one to invoke Beam YAML transforms from Java.

This leverages Beam's cross-language transforms. Although Python is required to parse and expand the given transforms, the actual implementation may still be in Java.

  • Method Details

    • of

      public static YamlTransform<PCollection<Row>,PCollection<Row>> of(String yamlDefinition)
      Creates a new YamlTransform mapping a single input PCollection<Row> to a single PCollection<Row> output.

      Use withMultipleInputs(java.lang.String...) or withMultipleOutputs(java.lang.String...) to indicate that this transform has multiple inputs and/or outputs.

      Parameters:
      yamlDefinition - a YAML string defining this transform.
      Returns:
      a PTransform that applies this YAML to its inputs.
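As a minimal sketch of what a yamlDefinition might look like, the class below assembles a single-transform YAML string in plain Java. The helper name filterDefinition and the column name score are hypothetical, and the Filter transform with its language and keep settings is assumed from Beam YAML's built-in transforms rather than stated on this page. The Beam call itself appears only in a comment, since it needs the Beam Java SDK and its YAML extension on the classpath.

```java
// Sketch only: builds a YAML definition string; nothing here touches Beam.
final class FilterDefinitionSketch {

    // Hypothetical helper: returns a Beam YAML Filter definition that keeps
    // rows for which the given Python expression is true.
    // The expression is embedded as-is, so it must not contain double quotes.
    static String filterDefinition(String keepExpression) {
        return String.join("\n",
            "type: Filter",
            "config:",
            "  language: python",
            "  keep: \"" + keepExpression + "\"");
    }

    public static void main(String[] args) {
        String yaml = filterDefinition("score > 0");
        System.out.println(yaml);

        // Assumed usage, with Beam on the classpath and `rows` being an
        // existing PCollection<Row>:
        // PCollection<Row> kept = rows.apply(YamlTransform.of(yaml));
    }
}
```

Keeping the definition in a small helper makes it easy to inspect or log the exact YAML before it is handed to of(String).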
    • source

      public static YamlTransform<PBegin,PCollection<Row>> source(String yamlDefinition)
      Creates a new YamlTransform mapping PBegin to a single PCollection<Row> output.
      Parameters:
      yamlDefinition - a YAML string defining this source.
      Returns:
      a PTransform that applies this YAML as a root transform.
    • sink

      public static YamlTransform<PCollection<Row>,PCollection<Row>> sink(String yamlDefinition)
      Creates a new YamlTransform mapping a single input PCollection<Row> to a single PCollection<Row> output.

      Use withMultipleOutputs(java.lang.String...) to indicate that this sink has multiple (or no) outputs.

      Parameters:
      yamlDefinition - a YAML string defining this sink.
      Returns:
      a PTransform that applies this YAML to its inputs.
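The following sketch shows how definitions for source(String) and sink(String) might be paired in one pipeline. The helper names csvSource and jsonSink and the file paths are hypothetical. ReadFromCsv and WriteToJson, each with a path setting, are assumed from Beam YAML's built-in I/O transforms and are not confirmed by this page. The pipeline wiring is left in comments because it requires the Beam Java SDK.

```java
// Sketch only: builds YAML definitions for a root read and a terminal write.
final class SourceSinkDefinitionSketch {

    // Hypothetical helper: a definition suitable for YamlTransform.source.
    static String csvSource(String path) {
        return String.join("\n",
            "type: ReadFromCsv",
            "config:",
            "  path: \"" + path + "\"");
    }

    // Hypothetical helper: a definition suitable for YamlTransform.sink.
    static String jsonSink(String path) {
        return String.join("\n",
            "type: WriteToJson",
            "config:",
            "  path: \"" + path + "\"");
    }

    public static void main(String[] args) {
        String read = csvSource("/tmp/in.csv");
        String write = jsonSink("/tmp/out");
        System.out.println(read);
        System.out.println(write);

        // Assumed usage, with Beam on the classpath:
        // Pipeline p = Pipeline.create();
        // p.apply(YamlTransform.source(read))
        //  .apply(YamlTransform.sink(write));
        // p.run().waitUntilFinish();
    }
}
```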
    • withMultipleInputs

      public YamlTransform<PCollectionRowTuple,OutputT> withMultipleInputs(String... inputTags)
      Indicates that this YamlTransform expects multiple, named inputs.
      Parameters:
      inputTags - the set of expected input tags to this transform
      Returns:
      a PTransform like this but with a PCollectionRowTuple input type.
    • withMultipleOutputs

      public YamlTransform<InputT,PCollectionRowTuple> withMultipleOutputs(String... outputTags)
      Indicates that this YamlTransform produces multiple, named outputs.
      Parameters:
      outputTags - the set of expected output tags to this transform
      Returns:
      a PTransform like this but with a PCollectionRowTuple output type.
    • expand

      public OutputT expand(InputT input)
      Description copied from class: PTransform
      Override this method to specify how this PTransform should be expanded on the given InputT.

      NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.

      Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).

      Specified by:
      expand in class PTransform<InputT extends PInput,OutputT extends POutput>