Class PythonExternalTransform<InputT extends PInput,OutputT extends POutput>

java.lang.Object
org.apache.beam.sdk.transforms.PTransform<InputT,OutputT>
org.apache.beam.sdk.extensions.python.PythonExternalTransform<InputT,OutputT>
All Implemented Interfaces:
Serializable, HasDisplayData

public class PythonExternalTransform<InputT extends PInput,OutputT extends POutput> extends PTransform<InputT,OutputT>
Wrapper for invoking external Python transforms.
See Also:
  • Method Details

    • from

      public static <InputT extends PInput, OutputT extends POutput> PythonExternalTransform<InputT,OutputT> from(String transformName)
      Instantiates a cross-language wrapper for a Python transform with a given transform name.

      The given fully qualified name will be imported and called to instantiate the transform. Often this is the fully qualified name of a Python PTransform class, in which case the arguments will be passed to its constructor, but any callable will do.

      Two special names, __callable__ and __constructor__ can be used to define a suitable transform inline if none exists.

      When __callable__ is provided, the first argument (or source keyword argument) should be a PythonCallableSource which represents the expand method of the PTransform accepting and returning a PValue (and may also take additional arguments and keyword arguments). For example, one might write

       PythonExternalTransform
           .from("__callable__")
           .withArgs(
               PythonCallable.of("def expand(pcoll, x, y): return pcoll | ..."),
               valueForX,
               valueForY);
       

      When __constructor__ is provided, the first argument (or source keyword argument) should be a PythonCallableSource which will return the desired PTransform when called with the remaining arguments and keyword arguments. Often this will be a PythonCallableSource representing a PTransform class, for example

       PythonExternalTransform
           .from("__constructor__")
           .withArgs(
               PythonCallable.of("class MyPTransform(beam.PTransform): ..."),
               ...valuesForMyPTransformConstructorIfAny);
       
      Type Parameters:
      InputT - Input PCollection type
      OutputT - Output PCollection type
      Parameters:
      transformName - fully qualified transform name.
      Returns:
      A PythonExternalTransform for the given transform name.
    • from

      public static <InputT extends PInput, OutputT extends POutput> PythonExternalTransform<InputT,OutputT> from(String transformName, String expansionService)
      Instantiates a cross-language wrapper for a Python transform with a given transform name.

      See from(String) for the meaning of transformName.

      Type Parameters:
      InputT - Input PCollection type
      OutputT - Output PCollection type
      Parameters:
      transformName - fully qualified transform name.
      expansionService - address and port number for externally launched expansion service
      Returns:
      A PythonExternalTransform for the given transform name.
    • withArgs

      Positional arguments for the Python cross-language transform. If invoked more than once, new arguments will be appended to the previously specified arguments.
      Parameters:
      args - list of arguments.
      Returns:
      updated wrapper for the cross-language transform.
    • withKwarg

      public PythonExternalTransform<InputT,OutputT> withKwarg(String name, Object value)
      Specifies a single keyword argument for the Python cross-language transform. This may be invoked multiple times to add more than one keyword argument.
      Parameters:
      name - argument name.
      value - argument value
      Returns:
      updated wrapper for the cross-language transform.
    • withKwargs

      public PythonExternalTransform<InputT,OutputT> withKwargs(Map<String,Object> kwargs)
      Specifies keyword arguments for the Python cross-language transform. If invoked more than once, new keyword arguments map will be added to the previously prided keyword arguments.
      Returns:
      updated wrapper for the cross-language transform.
    • withKwargs

      public PythonExternalTransform<InputT,OutputT> withKwargs(Row kwargs)
      Specifies keyword arguments as a Row objects.
      Parameters:
      kwargs - keyword arguments as a Row objects. An empty Row represents zero keyword arguments.
      Returns:
      updated wrapper for the cross-language transform.
    • withTypeHint

      public PythonExternalTransform<InputT,OutputT> withTypeHint(Class<?> argType, Schema.FieldType fieldType)
      Specifies the field type of arguments.

      Type hints are especially useful for logical types since type inference does not work well for logical types.

      Parameters:
      argType - A class object for the argument type.
      fieldType - A schema field type for the argument.
      Returns:
      updated wrapper for the cross-language transform.
    • withOutputCoders

      public PythonExternalTransform<InputT,OutputT> withOutputCoders(Map<String,Coder<?>> outputCoders)
      Specifies the keys and Coders of the output PCollections produced by this transform.
      Parameters:
      outputCoders - a mapping from output keys to Coders.
      Returns:
      updated wrapper for the cross-language transform.
    • withOutputCoder

      public PythonExternalTransform<InputT,OutputT> withOutputCoder(Coder<?> outputCoder)
      Specifies the Coder of the output PCollections produced by this transform. Should only be used if this transform produces a single output.
      Parameters:
      outputCoder - output Coder of the transform.
      Returns:
      updated wrapper for the cross-language transform.
    • withExtraPackages

      public PythonExternalTransform<InputT,OutputT> withExtraPackages(List<String> extraPackages)
      Specifies that the given Python packages are required for this transform, which will cause them to be installed in both the construction-time and execution time environment.
      Parameters:
      extraPackages - a list of pip-installable package specifications, such as would be found in a requirements file.
      Returns:
      updated wrapper for the cross-language transform.
    • expand

      public OutputT expand(InputT input)
      Description copied from class: PTransform
      Override this method to specify how this PTransform should be expanded on the given InputT.

      NOTE: This method should not be called directly. Instead apply the PTransform should be applied to the InputT using the apply method.

      Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).

      Specified by:
      expand in class PTransform<InputT extends PInput,OutputT extends POutput>