java.lang.Object
org.apache.beam.sdk.schemas.transforms.Filter

public class Filter extends Object
A PTransform for filtering a collection of schema types.

Separate predicates can be registered for different schema fields; an element passes the filter only if every registered predicate returns true. The output type is the same as the input type.

For example, consider the following schema type:


 public class Location {
   public double latitude;
   public double longitude;
 }
 
In order to examine only locations in south Manhattan, you would write:

 PCollection<Location> locations = readLocations();
 locations.apply(Filter.<Location>create()
    .whereFieldName("latitude", (Double lat) -> lat < 40.720 && lat > 40.699)
    .whereFieldName("longitude", (Double lon) -> lon < -73.969 && lon > -74.747));
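
To make the "all predicates must return true" rule concrete, the following is a minimal plain-Java sketch of that behavior. It does not use Beam and is not how Filter is implemented: the class FilterSemanticsSketch, its methods, and the map-based stand-in for a schema row are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of the filtering rule; this is NOT Beam code.
// All class and method names here are hypothetical.
class FilterSemanticsSketch {

  // Stand-in for a schema row: field name -> field value.
  static Map<String, Double> location(double latitude, double longitude) {
    Map<String, Double> row = new LinkedHashMap<>();
    row.put("latitude", latitude);
    row.put("longitude", longitude);
    return row;
  }

  // The same bounds as the two whereFieldName predicates above.
  static Map<String, Predicate<Double>> southManhattan() {
    Map<String, Predicate<Double>> predicates = new LinkedHashMap<>();
    predicates.put("latitude", lat -> lat < 40.720 && lat > 40.699);
    predicates.put("longitude", lon -> lon < -73.969 && lon > -74.747);
    return predicates;
  }

  // A row passes only if every per-field predicate accepts that field's value.
  static boolean passes(
      Map<String, Double> row, Map<String, Predicate<Double>> predicates) {
    for (Map.Entry<String, Predicate<Double>> entry : predicates.entrySet()) {
      if (!entry.getValue().test(row.get(entry.getKey()))) {
        return false;
      }
    }
    return true;
  }

  // Keeps the rows that pass; the element type of the output is unchanged.
  static List<Map<String, Double>> filter(
      List<Map<String, Double>> rows, Map<String, Predicate<Double>> predicates) {
    List<Map<String, Double>> kept = new ArrayList<>();
    for (Map<String, Double> row : rows) {
      if (passes(row, predicates)) {
        kept.add(row);
      }
    }
    return kept;
  }
}
```

In this sketch, a location at (40.710, -74.010) satisfies both predicates and is kept, while a location at (40.800, -74.010) is dropped because its latitude predicate fails even though its longitude predicate succeeds.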
 
Predicates that require examining multiple fields at once are also supported. For example, consider the following class representing a user account:

 class UserAccount {
   public double spendOnBooks;
   public double spendOnMovies;
       ...
 }
 
Say you want to examine only users whose total spend is above $100. You could write:

 PCollection<UserAccount> users = readUsers();
 users.apply(Filter.<UserAccount>create()
    .whereFieldNames(Lists.newArrayList("spendOnBooks", "spendOnMovies"),
        row -> row.getDouble("spendOnBooks") + row.getDouble("spendOnMovies") > 100.00));
 
  • Constructor Details

    • Filter

      public Filter()
  • Method Details