Class Regex

java.lang.Object
org.apache.beam.sdk.transforms.Regex

public class Regex extends Object
PTransforms to use Regular Expressions to process elements in a PCollection.

matches(String, int) can be used to see if an entire line matches a Regex. matchesKV(String, int, int) can be used to see if an entire line matches a Regex and output certain groups as a KV.

find(String, int) can be used to see if a portion of a line matches a Regex. findKV(String, int, int) can be used to see if a portion of a line matches a Regex and output certain groups as a KV.

Lines that do not match the Regex will not be output.
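
The difference between whole-line and partial matching can be illustrated outside a pipeline. The sketch below uses plain java.util.regex on single strings, on the assumption that each transform applies a Matcher to every element; the class and method names are hypothetical and are not part of this API.

```java
import java.util.regex.Pattern;

class MatchVsFindSketch {
    // Whole-line check, analogous to what matches(...) is described as doing.
    static boolean wholeLineMatches(String regex, String line) {
        return Pattern.compile(regex).matcher(line).matches();
    }

    // Partial check, analogous to what find(...) is described as doing.
    static boolean portionMatches(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        String regex = "[0-9]+";
        System.out.println(wholeLineMatches(regex, "12345"));       // true
        System.out.println(wholeLineMatches(regex, "order 12345")); // false
        System.out.println(portionMatches(regex, "order 12345"));   // true
        System.out.println(portionMatches(regex, "no digits"));     // false
    }
}
```

A line such as "order 12345" would therefore be dropped by a whole-line transform but kept by a partial one.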

  • Method Details

    • matches

      public static Regex.Matches matches(String regex)
      Returns a Regex.Matches PTransform that checks if the entire line matches the Regex. Returns the entire line (group 0) as a PCollection.
      Parameters:
      regex - The regular expression to run
    • matches

      public static Regex.Matches matches(Pattern pattern)
      Returns a Regex.Matches PTransform that checks if the entire line matches the Regex. Returns the entire line (group 0) as a PCollection.
      Parameters:
      pattern - The regular expression to run
    • matches

      public static Regex.Matches matches(String regex, int group)
      Returns a Regex.Matches PTransform that checks if the entire line matches the Regex. Returns the group as a PCollection.
      Parameters:
      regex - The regular expression to run
      group - The Regex group to return as a PCollection
    • matches

      public static Regex.Matches matches(Pattern pattern, int group)
      Returns a Regex.Matches PTransform that checks if the entire line matches the Regex. Returns the group as a PCollection.
      Parameters:
      pattern - The regular expression to run
      group - The Regex group to return as a PCollection
    • matches

      public static Regex.MatchesName matches(String regex, String groupName)
      Returns a Regex.MatchesName PTransform that checks if the entire line matches the Regex. Returns the group as a PCollection.
      Parameters:
      regex - The regular expression to run
      groupName - The Regex group name to return as a PCollection
    • matches

      public static Regex.MatchesName matches(Pattern pattern, String groupName)
      Returns a Regex.MatchesName PTransform that checks if the entire line matches the Regex. Returns the group as a PCollection.
      Parameters:
      pattern - The regular expression to run
      groupName - The Regex group name to return as a PCollection
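
As a rough illustration of the group and groupName parameters, the following sketch shows what a call such as matches(pattern, 1) or matches(pattern, "host") would presumably select from one input line. It uses java.util.regex directly rather than a pipeline; the helper names and the sample pattern are assumptions made for the example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesGroupSketch {
    // Returns the numbered group if the entire line matches, otherwise empty.
    static Optional<String> byIndex(Pattern p, String line, int group) {
        Matcher m = p.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.ofNullable(m.group(group));
    }

    // Returns the named group if the entire line matches, otherwise empty.
    static Optional<String> byName(Pattern p, String line, String groupName) {
        Matcher m = p.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.ofNullable(m.group(groupName));
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(?<user>\\w+)@(?<host>\\w+\\.com)");
        System.out.println(byIndex(p, "alice@example.com", 0));          // Optional[alice@example.com]
        System.out.println(byIndex(p, "alice@example.com", 1));          // Optional[alice]
        System.out.println(byName(p, "alice@example.com", "host"));      // Optional[example.com]
        System.out.println(byIndex(p, "contact: alice@example.com", 1)); // Optional.empty
    }
}
```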
    • allMatches

      public static Regex.AllMatches allMatches(String regex)
      Returns a Regex.AllMatches PTransform that checks if the entire line matches the Regex. Returns all groups as a List<String> in a PCollection.
      Parameters:
      regex - The regular expression to run
    • allMatches

      public static Regex.AllMatches allMatches(Pattern pattern)
      Returns a Regex.AllMatches PTransform that checks if the entire line matches the Regex. Returns all groups as a List<String> in a PCollection.
      Parameters:
      pattern - The regular expression to run
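
The List<String> produced for a line can be pictured with the sketch below. It assumes that the list holds group 0 followed by each capturing group in order, which this page does not state explicitly; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllMatchesSketch {
    // Collects groups 0..groupCount when the entire line matches, otherwise empty.
    static Optional<List<String>> allGroups(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        List<String> groups = new ArrayList<>();
        for (int i = 0; i <= m.groupCount(); i++) {
            groups.add(m.group(i));
        }
        return Optional.of(groups);
    }

    public static void main(String[] args) {
        System.out.println(allGroups("(\\d+)-(\\d+)", "10-20"));       // Optional[[10-20, 10, 20]]
        System.out.println(allGroups("(\\d+)-(\\d+)", "range 10-20")); // Optional.empty
    }
}
```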
    • matchesKV

      public static Regex.MatchesKV matchesKV(String regex, int keyGroup, int valueGroup)
      Returns a Regex.MatchesKV PTransform that checks if the entire line matches the Regex. Returns the specified groups as the key and value as a PCollection.
      Parameters:
      regex - The regular expression to run
      keyGroup - The Regex group to use as the key
      valueGroup - The Regex group to use as the value
    • matchesKV

      public static Regex.MatchesKV matchesKV(Pattern pattern, int keyGroup, int valueGroup)
      Returns a Regex.MatchesKV PTransform that checks if the entire line matches the Regex. Returns the specified groups as the key and value as a PCollection.
      Parameters:
      pattern - The regular expression to run
      keyGroup - The Regex group to use as the key
      valueGroup - The Regex group to use as the value
    • matchesKV

      public static Regex.MatchesNameKV matchesKV(String regex, String keyGroupName, String valueGroupName)
      Returns a Regex.MatchesNameKV PTransform that checks if the entire line matches the Regex. Returns the specified groups as the key and value as a PCollection.
      Parameters:
      regex - The regular expression to run
      keyGroupName - The Regex group name to use as the key
      valueGroupName - The Regex group name to use as the value
    • matchesKV

      public static Regex.MatchesNameKV matchesKV(Pattern pattern, String keyGroupName, String valueGroupName)
      Returns a Regex.MatchesNameKV PTransform that checks if the entire line matches the Regex. Returns the specified groups as the key and value as a PCollection.
      Parameters:
      pattern - The regular expression to run
      keyGroupName - The Regex group name to use as the key
      valueGroupName - The Regex group name to use as the value
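
To make the key and value parameters concrete, here is a sketch of how one line might be turned into a key/value pair, by group index or by group name. It uses java.util.regex and Map.Entry as a stand-in for KV; all names in it are hypothetical.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesKvSketch {
    // Key and value selected by group index; empty unless the entire line matches.
    static Optional<Map.Entry<String, String>> byIndex(
            Pattern p, String line, int keyGroup, int valueGroup) {
        Matcher m = p.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        Map.Entry<String, String> kv =
                new SimpleImmutableEntry<>(m.group(keyGroup), m.group(valueGroup));
        return Optional.of(kv);
    }

    // Key and value selected by group name; empty unless the entire line matches.
    static Optional<Map.Entry<String, String>> byName(
            Pattern p, String line, String keyGroupName, String valueGroupName) {
        Matcher m = p.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        Map.Entry<String, String> kv =
                new SimpleImmutableEntry<>(m.group(keyGroupName), m.group(valueGroupName));
        return Optional.of(kv);
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(?<key>\\w+)=(?<value>\\w+)");
        System.out.println(byIndex(p, "color=blue", 1, 2));          // Optional[color=blue]
        System.out.println(byName(p, "color=blue", "key", "value")); // Optional[color=blue]
        System.out.println(byIndex(p, "set color=blue", 1, 2));      // Optional.empty
    }
}
```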
    • find

      public static Regex.Find find(String regex)
      Returns a Regex.Find PTransform that checks if a portion of the line matches the Regex. Returns the matched portion (group 0) as a PCollection.
      Parameters:
      regex - The regular expression to run
    • find

      public static Regex.Find find(Pattern pattern)
      Returns a Regex.Find PTransform that checks if a portion of the line matches the Regex. Returns the matched portion (group 0) as a PCollection.
      Parameters:
      pattern - The regular expression to run
    • find

      public static Regex.Find find(String regex, int group)
      Returns a Regex.Find PTransform that checks if a portion of the line matches the Regex. Returns the group as a PCollection.
      Parameters:
      regex - The regular expression to run
      group - The Regex group to return as a PCollection
    • find

      public static Regex.Find find(Pattern pattern, int group)
      Returns a Regex.Find PTransform that checks if a portion of the line matches the Regex. Returns the group as a PCollection.
      Parameters:
      pattern - The regular expression to run
      group - The Regex group to return as a PCollection
    • find

      public static Regex.FindName find(String regex, String groupName)
      Returns a Regex.FindName PTransform that checks if a portion of the line matches the Regex. Returns the group as a PCollection.
      Parameters:
      regex - The regular expression to run
      groupName - The Regex group name to return as a PCollection
    • find

      public static Regex.FindName find(Pattern pattern, String groupName)
      Returns a Regex.FindName PTransform that checks if a portion of the line matches the Regex. Returns the group as a PCollection.
      Parameters:
      pattern - The regular expression to run
      groupName - The Regex group name to return as a PCollection
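
With partial matching, group 0 is the matched portion of the line, not the whole line. The sketch below illustrates this with java.util.regex on a single string, assuming the first match in the line is the one used; the class name, method names, and sample pattern are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindGroupSketch {
    // Returns the numbered group of the first match found in the line, otherwise empty.
    static Optional<String> byIndex(Pattern p, String line, int group) {
        Matcher m = p.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.ofNullable(m.group(group));
    }

    // Returns the named group of the first match found in the line, otherwise empty.
    static Optional<String> byName(Pattern p, String line, String groupName) {
        Matcher m = p.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.ofNullable(m.group(groupName));
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("#(?<id>\\d+)");
        String line = "order #12345 shipped";
        System.out.println(byIndex(p, line, 0));            // Optional[#12345]
        System.out.println(byIndex(p, line, 1));            // Optional[12345]
        System.out.println(byName(p, line, "id"));          // Optional[12345]
        System.out.println(byIndex(p, "no order here", 0)); // Optional.empty
    }
}
```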
    • findAll

      public static Regex.FindAll findAll(String regex)
      Returns a Regex.FindAll PTransform that checks if a portion of the line matches the Regex. Returns all the groups as a List<String> in a PCollection.
      Parameters:
      regex - The regular expression to run
    • findAll

      public static Regex.FindAll findAll(Pattern pattern)
      Returns a Regex.FindAll PTransform that checks if a portion of the line matches the Regex. Returns all the groups as a List<String> in a PCollection.
      Parameters:
      pattern - The regular expression to run
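
A possible reading of "all the groups" for a partial match is sketched below: the groups of the first match found, starting with group 0. That interpretation is an assumption, as are the class and method names; the code uses java.util.regex only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllSketch {
    // Collects groups 0..groupCount of the first match in the line, otherwise empty.
    static Optional<List<String>> firstMatchGroups(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        List<String> groups = new ArrayList<>();
        for (int i = 0; i <= m.groupCount(); i++) {
            groups.add(m.group(i));
        }
        return Optional.of(groups);
    }

    public static void main(String[] args) {
        // Only the first range is reported in this sketch.
        System.out.println(firstMatchGroups("(\\d+)-(\\d+)", "range 10-20 and 30-40")); // Optional[[10-20, 10, 20]]
        System.out.println(firstMatchGroups("(\\d+)-(\\d+)", "no range"));              // Optional.empty
    }
}
```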
    • findKV

      public static Regex.FindKV findKV(String regex, int keyGroup, int valueGroup)
      Returns a Regex.FindKV PTransform that checks if a portion of the line matches the Regex. Returns the specified groups as the key and value as a PCollection.
      Parameters:
      regex - The regular expression to run
      keyGroup - The Regex group to use as the key
      valueGroup - The Regex group to use as the value
    • findKV

      public static Regex.FindKV findKV(Pattern pattern, int keyGroup, int valueGroup)
      Returns a Regex.FindKV PTransform that checks if a portion of the line matches the Regex. Returns the specified groups as the key and value as a PCollection.
      Parameters:
      pattern - The regular expression to run
      keyGroup - The Regex group to use as the key
      valueGroup - The Regex group to use as the value
    • findKV

      public static Regex.FindNameKV findKV(String regex, String keyGroupName, String valueGroupName)
      Returns a Regex.FindNameKV PTransform that checks if a portion of the line matches the Regex. Returns the specified groups as the key and value as a PCollection.
      Parameters:
      regex - The regular expression to run
      keyGroupName - The Regex group name to use as the key
      valueGroupName - The Regex group name to use as the value
    • findKV

      public static Regex.FindNameKV findKV(Pattern pattern, String keyGroupName, String valueGroupName)
      Returns a Regex.FindNameKV PTransform that checks if a portion of the line matches the Regex. Returns the specified groups as the key and value as a PCollection.
      Parameters:
      pattern - The regular expression to run
      keyGroupName - The Regex group name to use as the key
      valueGroupName - The Regex group name to use as the value
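
The partial-match key/value variant can be sketched in the same spirit. The example below assumes that the key and value come from the first match found in the line and uses Map.Entry in place of KV; its names are hypothetical.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindKvSketch {
    // Key and value taken from the first match found in the line, otherwise empty.
    static Optional<Map.Entry<String, String>> firstPair(
            Pattern p, String line, int keyGroup, int valueGroup) {
        Matcher m = p.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        Map.Entry<String, String> kv =
                new SimpleImmutableEntry<>(m.group(keyGroup), m.group(valueGroup));
        return Optional.of(kv);
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\w+)=(\\w+)");
        System.out.println(firstPair(p, "host=web01 status=ok", 1, 2)); // Optional[host=web01]
        System.out.println(firstPair(p, "no pairs here", 1, 2));        // Optional.empty
    }
}
```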
    • replaceAll

      public static Regex.ReplaceAll replaceAll(String regex, String replacement)
      Returns a Regex.ReplaceAll PTransform that checks if a portion of the line matches the Regex and replaces all matches with the replacement String. Returns the resulting lines as a PCollection.
      Parameters:
      regex - The regular expression to run
      replacement - The string to be substituted for each match
    • replaceAll

      public static Regex.ReplaceAll replaceAll(Pattern pattern, String replacement)
      Returns a Regex.ReplaceAll PTransform that checks if a portion of the line matches the Regex and replaces all matches with the replacement String. Returns the resulting lines as a PCollection.
      Parameters:
      pattern - The regular expression to run
      replacement - The string to be substituted for each match
    • replaceFirst

      public static Regex.ReplaceFirst replaceFirst(String regex, String replacement)
      Returns a Regex.ReplaceFirst PTransform that checks if a portion of the line matches the Regex and replaces the first match with the replacement String. Returns the resulting lines as a PCollection.
      Parameters:
      regex - The regular expression to run
      replacement - The string to be substituted for the first match
    • replaceFirst

      public static Regex.ReplaceFirst replaceFirst(Pattern pattern, String replacement)
      Returns a Regex.ReplaceFirst PTransform that checks if a portion of the line matches the Regex and replaces the first match with the replacement String. Returns the resulting lines as a PCollection.
      Parameters:
      pattern - The regular expression to run
      replacement - The string to be substituted for the first match
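
The contrast between replacing every match and replacing only the first can be shown on a single string. This sketch relies on java.util.regex.Matcher, on the assumption that the transforms behave the same way per element; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class ReplaceSketch {
    // Replaces every match of the pattern in the line.
    static String replaceAll(Pattern p, String line, String replacement) {
        return p.matcher(line).replaceAll(replacement);
    }

    // Replaces only the first match of the pattern in the line.
    static String replaceFirst(Pattern p, String line, String replacement) {
        return p.matcher(line).replaceFirst(replacement);
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("[0-9]+");
        System.out.println(replaceAll(digits, "a1b22c333", "#"));   // a#b#c#
        System.out.println(replaceFirst(digits, "a1b22c333", "#")); // a#b22c333
        // In java.util.regex, a string without a match comes back unchanged.
        System.out.println(replaceAll(digits, "abc", "#"));         // abc
    }
}
```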
    • split

      public static Regex.Split split(String regex)
      Returns a Regex.Split PTransform that splits a string on the regular expression and then outputs each item. It will not output empty items. Returns the items as a PCollection.
      Parameters:
      regex - The regular expression to run
    • split

      public static Regex.Split split(Pattern pattern)
      Returns a Regex.Split PTransform that splits a string on the regular expression and then outputs each item. It will not output empty items. Returns the items as a PCollection.
      Parameters:
      pattern - The regular expression to run
    • split

      public static Regex.Split split(String regex, boolean outputEmpty)
      Returns a Regex.Split PTransform that splits a string on the regular expression and then outputs each item. Returns the items as a PCollection.
      Parameters:
      regex - The regular expression to run
      outputEmpty - Whether empty items should be output: true to output them, false to drop them.
    • split

      public static Regex.Split split(Pattern pattern, boolean outputEmpty)
      Returns a Regex.Split PTransform that splits a string on the regular expression and then outputs each item. Returns the items as a PCollection.
      Parameters:
      pattern - The regular expression to run
      outputEmpty - Whether empty items should be output: true to output them, false to drop them.
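
The effect of outputEmpty can be sketched with Pattern.split on one string. The helper below is hypothetical and assumes the transform filters the split result item by item; note that Pattern.split itself already discards trailing empty strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SplitSketch {
    // Splits the line on the pattern and keeps empty items only when outputEmpty is true.
    static List<String> split(Pattern p, String line, boolean outputEmpty) {
        List<String> items = new ArrayList<>();
        for (String item : p.split(line)) {
            if (outputEmpty || !item.isEmpty()) {
                items.add(item);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        Pattern comma = Pattern.compile(",");
        System.out.println(split(comma, "a,,b", false)); // [a, b]
        System.out.println(split(comma, "a,,b", true));  // [a, , b]
        // Trailing empty strings are dropped by Pattern.split before any filtering.
        System.out.println(split(comma, "a,b,,", true)); // [a, b]
    }
}
```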