java.lang.Object
org.apache.beam.sdk.schemas.transforms.Join

public class Join extends Object
A transform that performs equijoins across two schema PCollections.

This transform allows joins between two input PCollections simply by specifying the fields to join on. The resulting PCollection<Row> will have two fields named "lhs" and "rhs" respectively, each with the schema of the corresponding input PCollection.

For example, the following joins two PCollections on the "user" and "country" fields, which both the left-hand and the right-hand PCollections define under the same names.

PCollection<Row> joined = pCollection1.apply(Join.innerJoin(pCollection2).using("user", "country"));


If the right-hand PCollection contains fields with different names to join against, you can specify them as follows:

PCollection<Row> joined = pCollection1.apply(Join.innerJoin(pCollection2)
       .on(FieldsEqual.left("user", "country").right("otherUser", "otherCountry")));
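
The field lists above appear to be paired by position: the first left-hand field is compared with the first right-hand field, and so on. As a rough illustration of that comparison, here is a plain-Java model. It is not Beam code: rows are modeled as maps rather than Row objects, and `KeyPairingSketch`, `key`, and `matches` are hypothetical names introduced only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative model only; rows are plain maps rather than Beam Row objects.
class KeyPairingSketch {
  // Builds a join key from the values of the named fields, in the order given.
  static List<Object> key(Map<String, Object> row, List<String> fields) {
    List<Object> values = new ArrayList<>();
    for (String field : fields) {
      values.add(row.get(field));
    }
    return values;
  }

  // Two rows match when their keys are equal position by position.
  static boolean matches(
      Map<String, Object> lhs, List<String> lhsFields,
      Map<String, Object> rhs, List<String> rhsFields) {
    return key(lhs, lhsFields).equals(key(rhs, rhsFields));
  }

  public static void main(String[] args) {
    Map<String, Object> lhs = Map.of("user", "ann", "country", "US");
    Map<String, Object> rhs = Map.of("otherUser", "ann", "otherCountry", "US");
    System.out.println(
        matches(lhs, List.of("user", "country"), rhs, List.of("otherUser", "otherCountry")));
  }
}
```

In this model, `using("user", "country")` corresponds to passing the same field list for both sides.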
 

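Full outer, left outer, and right outer joins are also supported through fullOuterJoin, leftOuterJoin, and rightOuterJoin. Inner and left outer joins additionally have broadcast variants, innerBroadcastJoin and leftOuterBroadcastJoin.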
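
To make the difference between the join types concrete, the following plain-Java sketch models them with a simple nested loop. It is an illustration of join semantics, not Beam's implementation: each row is reduced to a `{key, payload}` pair, the class and enum names are hypothetical, and the sketch assumes that the side with no match is represented as null in the output pair.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Illustrative model only; each row is a String[] of {joinKey, payload}.
class JoinTypeSketch {
  enum Type { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

  // Returns {lhsPayload, rhsPayload} pairs; null marks a side with no match.
  static List<String[]> join(List<String[]> left, List<String[]> right, Type type) {
    List<String[]> out = new ArrayList<>();
    boolean[] rightMatched = new boolean[right.size()];
    for (String[] l : left) {
      boolean leftMatched = false;
      for (int i = 0; i < right.size(); i++) {
        String[] r = right.get(i);
        if (Objects.equals(l[0], r[0])) {
          out.add(new String[] {l[1], r[1]});
          leftMatched = true;
          rightMatched[i] = true;
        }
      }
      // Left outer and full outer joins keep left rows that matched nothing.
      if (!leftMatched && (type == Type.LEFT_OUTER || type == Type.FULL_OUTER)) {
        out.add(new String[] {l[1], null});
      }
    }
    // Right outer and full outer joins keep right rows that matched nothing.
    if (type == Type.RIGHT_OUTER || type == Type.FULL_OUTER) {
      for (int i = 0; i < right.size(); i++) {
        if (!rightMatched[i]) {
          out.add(new String[] {null, right.get(i)[1]});
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String[]> left =
        List.of(new String[] {"ann", "L-ann"}, new String[] {"bob", "L-bob"});
    List<String[]> right =
        List.of(new String[] {"ann", "R-ann"}, new String[] {"cat", "R-cat"});
    for (Type type : Type.values()) {
      System.out.println(type + ": " + join(left, right, type).size() + " rows");
    }
  }
}
```

With one shared key ("ann") and one unmatched row on each side, the four join types differ only in which unmatched rows they keep.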

  • Field Details

  • Constructor Details

    • Join

      public Join()
  • Method Details

    • innerJoin

      public static <LhsT, RhsT> Join.Impl<LhsT,RhsT> innerJoin(PCollection<RhsT> rhs)
      Perform an inner join. Only pairs of left-hand and right-hand rows with equal join keys are emitted.
    • fullOuterJoin

      public static <LhsT, RhsT> Join.Impl<LhsT,RhsT> fullOuterJoin(PCollection<RhsT> rhs)
      Perform a full outer join. Matching pairs are emitted, along with rows from either side that have no match.
    • leftOuterJoin

      public static <LhsT, RhsT> Join.Impl<LhsT,RhsT> leftOuterJoin(PCollection<RhsT> rhs)
      Perform a left outer join. Every left-hand row is emitted, whether or not it has a match on the right.
    • rightOuterJoin

      public static <LhsT, RhsT> Join.Impl<LhsT,RhsT> rightOuterJoin(PCollection<RhsT> rhs)
      Perform a right outer join. Every right-hand row is emitted, whether or not it has a match on the left.
    • innerBroadcastJoin

      public static <LhsT, RhsT> Join.Impl<LhsT,RhsT> innerBroadcastJoin(PCollection<RhsT> rhs)
      Perform an inner join, broadcasting the right side.
    • leftOuterBroadcastJoin

      public static <LhsT, RhsT> Join.Impl<LhsT,RhsT> leftOuterBroadcastJoin(PCollection<RhsT> rhs)
      Perform a left outer join, broadcasting the right side.
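
The broadcast variants are typically worth considering when the right-hand PCollection is small. The plain-Java sketch below models the idea under the assumption that the broadcast side is held as an in-memory lookup table and each left-hand element is processed on its own. It is not Beam code, and `BroadcastJoinSketch`, `index`, and `joinOne` are hypothetical names. It also hints at one plausible reason only inner and left outer broadcast joins are offered: when left elements are handled in isolation, nothing in this scheme can tell which right-hand rows never matched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; each row is a String[] of {joinKey, payload}.
class BroadcastJoinSketch {
  // Builds the lookup table once from the (small) right side: key -> payloads.
  static Map<String, List<String>> index(List<String[]> right) {
    Map<String, List<String>> table = new HashMap<>();
    for (String[] r : right) {
      table.computeIfAbsent(r[0], k -> new ArrayList<>()).add(r[1]);
    }
    return table;
  }

  // Joins a single left row against the table, independently of other left rows.
  static List<String[]> joinOne(
      String[] l, Map<String, List<String>> table, boolean leftOuter) {
    List<String[]> out = new ArrayList<>();
    List<String> matches = table.get(l[0]);
    if (matches != null) {
      for (String payload : matches) {
        out.add(new String[] {l[1], payload});
      }
    } else if (leftOuter) {
      // A left outer join keeps the unmatched left row, with null on the right.
      out.add(new String[] {l[1], null});
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, List<String>> table =
        index(List.of(new String[] {"ann", "R-ann"}, new String[] {"ann", "R-ann2"}));
    System.out.println(joinOne(new String[] {"ann", "L-ann"}, table, false).size());
    System.out.println(joinOne(new String[] {"bob", "L-bob"}, table, true).size());
  }
}
```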