@Experimental(value=SCHEMAS) public class Join extends java.lang.Object
A transform that performs equi-joins across two schema PCollections.
This transform allows joins between two input PCollections simply by specifying the fields to join on. The resulting PCollection<Row> will have two fields named "lhs" and "rhs" respectively, each with the schema of the corresponding input PCollection.
For example, the following demonstrates joining two PCollections using a natural join on the "user" and "country" fields, where both the left-hand and the right-hand PCollections have fields with these names.
PCollection<Row> joined = pCollection1.apply(Join.innerJoin(pCollection2).using("user", "country"));
If the right-hand PCollection contains fields with different names to join against, you can specify them as follows:
PCollection<Row> joined = pCollection1.apply(Join.innerJoin(pCollection2)
.on(FieldsEqual.left("user", "country").right("otherUser", "otherCountry")));
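To make the shape of the result concrete, here is a minimal plain-Java sketch of an inner equi-join. It is not Beam code and does not reflect how Beam implements the transform: rows are modeled as `Map<String, String>`, and the names `InnerJoinSketch`, `Joined`, `innerJoin`, and `keysEqual` are hypothetical, chosen only for this illustration. The two key lists play the role of `left(...)` and `right(...)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Plain-Java model of an inner equi-join. Illustrative only; this is not Beam code. */
public class InnerJoinSketch {

    /** One output element; its two fields mirror the "lhs" and "rhs" fields of the result. */
    public static final class Joined {
        public final Map<String, String> lhs;
        public final Map<String, String> rhs;

        public Joined(Map<String, String> lhs, Map<String, String> rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
        }
    }

    /** Pairs each left row with each right row whose key fields are equal, position by position. */
    public static List<Joined> innerJoin(
            List<Map<String, String>> left,
            List<Map<String, String>> right,
            List<String> leftFields,
            List<String> rightFields) {
        if (leftFields.size() != rightFields.size()) {
            throw new IllegalArgumentException("Both sides need the same number of key fields");
        }
        List<Joined> out = new ArrayList<>();
        for (Map<String, String> l : left) {
            for (Map<String, String> r : right) {
                if (keysEqual(l, r, leftFields, rightFields)) {
                    out.add(new Joined(l, r));
                }
            }
        }
        return out;
    }

    /** True when the i-th left key field equals the i-th right key field for every i. */
    private static boolean keysEqual(
            Map<String, String> l,
            Map<String, String> r,
            List<String> leftFields,
            List<String> rightFields) {
        for (int i = 0; i < leftFields.size(); i++) {
            if (!Objects.equals(l.get(leftFields.get(i)), r.get(rightFields.get(i)))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Map<String, String>> users = List.of(
                Map.of("user", "ann", "country", "US"),
                Map.of("user", "bob", "country", "DE"));
        List<Map<String, String>> others = List.of(
                Map.of("otherUser", "ann", "otherCountry", "US"),
                Map.of("otherUser", "bob", "otherCountry", "FR"));

        List<Joined> joined = innerJoin(
                users, others,
                List.of("user", "country"),
                List.of("otherUser", "otherCountry"));

        for (Joined j : joined) {
            System.out.println(j.lhs.get("user") + " joined with " + j.rhs.get("otherUser"));
        }
    }
}
```

In this sample data only the "ann"/"US" rows pair up, because the "bob" rows disagree on the country key. Passing the same field list for both sides models the `using("user", "country")` form.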
Full outer joins, left outer joins, and right outer joins are also supported.
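The difference between the four join types can be sketched in plain Java as follows. This is an illustration of conventional join semantics, not Beam code: rows are modeled as `Map<String, String>`, the join uses a single key field for brevity, and the unmatched side of an outer-join result is represented as `null`, which is an assumption of this sketch. The names `OuterJoinSketch`, `JoinType`, `Pair`, and `join` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Plain-Java model of inner and outer equi-joins. Illustrative only; this is not Beam code. */
public class OuterJoinSketch {

    public enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    /** One output element; lhs or rhs is null when that side had no matching row. */
    public static final class Pair {
        public final Map<String, String> lhs;
        public final Map<String, String> rhs;

        public Pair(Map<String, String> lhs, Map<String, String> rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
        }
    }

    /** Joins two row lists on one key field, keeping unmatched rows as the join type requires. */
    public static List<Pair> join(
            List<Map<String, String>> left,
            List<Map<String, String>> right,
            String field,
            JoinType type) {
        List<Pair> out = new ArrayList<>();
        boolean[] rightMatched = new boolean[right.size()];

        for (Map<String, String> l : left) {
            boolean leftMatched = false;
            for (int i = 0; i < right.size(); i++) {
                Map<String, String> r = right.get(i);
                if (Objects.equals(l.get(field), r.get(field))) {
                    out.add(new Pair(l, r));
                    leftMatched = true;
                    rightMatched[i] = true;
                }
            }
            // Left and full outer joins keep left rows that found no partner.
            if (!leftMatched && (type == JoinType.LEFT_OUTER || type == JoinType.FULL_OUTER)) {
                out.add(new Pair(l, null));
            }
        }

        // Right and full outer joins keep right rows that found no partner.
        if (type == JoinType.RIGHT_OUTER || type == JoinType.FULL_OUTER) {
            for (int i = 0; i < right.size(); i++) {
                if (!rightMatched[i]) {
                    out.add(new Pair(null, right.get(i)));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, String>> left = List.of(Map.of("user", "ann"), Map.of("user", "bob"));
        List<Map<String, String>> right = List.of(Map.of("user", "ann"), Map.of("user", "cat"));

        for (JoinType type : JoinType.values()) {
            System.out.println(type + ": " + join(left, right, "user", type).size() + " rows");
        }
    }
}
```

With this sample data the inner join keeps only the "ann" pair, the left outer join adds "bob" with an empty right side, the right outer join adds "cat" with an empty left side, and the full outer join contains all three.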
Modifier and Type | Class and Description |
---|---|
static class | Join.FieldsEqual: Predicate object to specify fields to compare when doing an equi-join. |
static class | Join.Impl<LhsT,RhsT>: Implementation class. |
Modifier and Type | Field and Description |
---|---|
static java.lang.String | LHS_TAG |
static java.lang.String | RHS_TAG |
Constructor and Description |
---|
Join() |
Modifier and Type | Method and Description |
---|---|
static <LhsT,RhsT> Join.Impl<LhsT,RhsT> | fullOuterJoin(PCollection<RhsT> rhs): Perform a full outer join. |
static <LhsT,RhsT> Join.Impl<LhsT,RhsT> | innerJoin(PCollection<RhsT> rhs): Perform an inner join. |
static <LhsT,RhsT> Join.Impl<LhsT,RhsT> | leftOuterJoin(PCollection<RhsT> rhs): Perform a left outer join. |
static <LhsT,RhsT> Join.Impl<LhsT,RhsT> | rightOuterJoin(PCollection<RhsT> rhs): Perform a right outer join. |
public static final java.lang.String LHS_TAG
public static final java.lang.String RHS_TAG
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> innerJoin(PCollection<RhsT> rhs)
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> fullOuterJoin(PCollection<RhsT> rhs)
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> leftOuterJoin(PCollection<RhsT> rhs)
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> rightOuterJoin(PCollection<RhsT> rhs)