public static class Regex.Split extends PTransform<PCollection<java.lang.String>,PCollection<java.lang.String>>
Regex.Split takes a PCollection<String> and returns a PCollection<String> in which each input string has been split into individual items. Each item is
then output as a separate string.
This transform uses a Regex to split the entire input line. The split gives back
an array of items. Each item is output as a separate element in the PCollection<String>.
Depending on the Regex, a split item can be the empty string "". The outputEmpty parameter controls whether empty strings are output.
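The per-element behavior described above can be sketched with plain java.util.regex, outside of any pipeline. This is a minimal illustration under stated assumptions, not Beam's implementation: the class SplitSketch and the helper splitElement are hypothetical names, and the comma pattern is chosen only to make the empty item easy to see.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch class; not part of the Beam SDK.
class SplitSketch {
    // Splits one input string with the pattern and collects the items,
    // keeping empty strings only when outputEmpty is true.
    static List<String> splitElement(Pattern pattern, String element, boolean outputEmpty) {
        List<String> out = new ArrayList<>();
        for (String item : pattern.split(element)) {
            if (outputEmpty || !item.isEmpty()) {
                out.add(item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Pattern comma = Pattern.compile(",");
        // Two adjacent delimiters produce an empty item between them.
        System.out.println(splitElement(comma, "a,,b", true));   // [a, , b]
        System.out.println(splitElement(comma, "a,,b", false));  // [a, b]
    }
}
```

Note that java.util.regex.Pattern.split drops trailing empty strings on its own, so in this sketch the flag only affects empty items that appear before the last non-empty one.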
Example of use:
PCollection<String> words = ...;
PCollection<String> values =
words.apply(Regex.split("\\W*"));
| Constructor and Description |
|---|
| Split(java.util.regex.Pattern pattern, boolean outputEmpty) |
| Modifier and Type | Method and Description |
|---|---|
| PCollection<java.lang.String> | expand(PCollection<java.lang.String> in) Applies this PTransform on the given InputT, and returns its Output. |
Methods inherited from class PTransform: getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, populateDisplayData, toString, validate

public PCollection<java.lang.String> expand(PCollection<java.lang.String> in)
Applies this PTransform on the given InputT, and returns its Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
Specified by: expand in class PTransform<PCollection<java.lang.String>,PCollection<java.lang.String>>