public static class Regex.Split extends PTransform<PCollection<java.lang.String>,PCollection<java.lang.String>>
Regex.Split takes a PCollection<String> and returns a PCollection<String> in which each input string has been split into individual items. Each item is output as a separate string.
This transform splits the entire input line with a Regex. The split returns an array of items, and each item is output as a separate element of the PCollection<String>.
Depending on the Regex, a split can produce empty ("") strings. The outputEmpty parameter controls whether empty strings are output.
Example of use:
PCollection<String> words = ...;
PCollection<String> values =
words.apply(Regex.split("\\W*"));
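The effect of keeping or dropping empty strings can be sketched without Beam, using only java.util.regex. This is a minimal illustration, assuming the transform behaves like Pattern.split followed by an optional filter on empty items. SplitSketch and splitLine are hypothetical names introduced here and are not part of the Beam API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Beam: splits one input line and
// optionally filters out empty items (assumed behavior of outputEmpty).
final class SplitSketch {
    static List<String> splitLine(Pattern pattern, String line, boolean outputEmpty) {
        List<String> out = new ArrayList<>();
        // Pattern.split keeps interior empty strings and drops trailing ones.
        for (String item : pattern.split(line)) {
            if (outputEmpty || !item.isEmpty()) {
                out.add(item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Pattern comma = Pattern.compile(",");
        System.out.println(splitLine(comma, "a,,b", true));   // prints [a, , b]
        System.out.println(splitLine(comma, "a,,b", false));  // prints [a, b]
    }
}
```

With outputEmpty set to true, the empty item between the two commas is kept. With false, only "a" and "b" are emitted.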
Fields inherited from class PTransform: name, resourceHints
Constructor and Description |
---|
Split(java.util.regex.Pattern pattern, boolean outputEmpty) |
Modifier and Type | Method and Description |
---|---|
PCollection<java.lang.String> | expand(PCollection<java.lang.String> in) Override this method to specify how this PTransform should be expanded on the given InputT. |
Methods inherited from class PTransform: compose, compose, getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, populateDisplayData, setResourceHints, toString, validate, validate
public PCollection<java.lang.String> expand(PCollection<java.lang.String> in)
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded on the given InputT.
NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
Specified by: expand in class PTransform<PCollection<java.lang.String>,PCollection<java.lang.String>>