@Experimental(value=SCHEMAS) public class Select extends java.lang.Object
A PTransform for selecting a subset of fields from a schema type.
This transform allows projecting out a subset of fields from a schema type. The output of
this transform is of type Row, though that can be converted into any other type with a
matching schema using the Convert transform.
For example, consider the following POJO type:
@DefaultSchema(JavaFieldSchema.class)
public class UserEvent {
  public String userId;
  public String eventId;
  public int eventType;
  public Location location;
}
@DefaultSchema(JavaFieldSchema.class)
public class Location {
  public double latitude;
  public double longitude;
}
To select just the userId and eventId fields from each element, you would write
the following:
PCollection<UserEvent> events = readUserEvents();
PCollection<Row> rows = events.apply(Select.fieldNames("userId", "eventId"));
It's possible to select a nested field as well. For example, to get just the location
information from each element:
PCollection<UserEvent> events = readUserEvents();
PCollection<Location> rows = events.apply(Select.fieldNames("location"))
    .apply(Convert.to(Location.class));
Modifier and Type | Class and Description |
---|---|
static class | Select.Fields<T> |
static class | Select.Flattened<T> A PTransform representing a flattened schema. |
Constructor and Description |
---|
Select() |
Modifier and Type | Method and Description |
---|---|
static <T> Select.Fields<T> | create() |
static <T> Select.Fields<T> | fieldAccess(FieldAccessDescriptor fieldAccessDescriptor) Select a set of fields described in a FieldAccessDescriptor. |
static <T> Select.Fields<T> | fieldIds(java.lang.Integer... ids) Select a set of top-level field ids from the row. |
static <T> Select.Fields<T> | fieldNames(java.lang.String... names) Select a set of top-level field names from the row. |
static <T> Select.Flattened<T> | flattenedSchema() Selects every leaf-level field. |
public static <T> Select.Fields<T> create()
public static <T> Select.Fields<T> fieldIds(java.lang.Integer... ids)
Select a set of top-level field ids from the row.
public static <T> Select.Fields<T> fieldNames(java.lang.String... names)
Select a set of top-level field names from the row.
public static <T> Select.Fields<T> fieldAccess(FieldAccessDescriptor fieldAccessDescriptor)
Select a set of fields described in a FieldAccessDescriptor. This allows for nested fields to be selected as well.
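To make the idea of nested field selection concrete, the following is a minimal plain-Java sketch of the semantics. It does not use Beam: rows are modelled as nested maps, and the class FieldPathSketch, its select method, and the dotted path syntax ("location.latitude") are hypothetical names chosen for illustration. It also assumes that a selected nested field appears in the output under its innermost name, which is not stated above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual model only: rows are nested maps, not Beam Row objects.
public class FieldPathSketch {

    /**
     * Resolves each dotted path against a nested map and stores the value
     * under the last path segment (an assumption about output naming).
     */
    public static Map<String, Object> select(Map<String, Object> row, List<String> paths) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String path : paths) {
            String[] segments = path.split("\\.");
            Object current = row;
            for (String segment : segments) {
                // Every segment must be looked up in a (nested) row.
                if (!(current instanceof Map)) {
                    throw new IllegalArgumentException("Not a nested row at: " + segment);
                }
                Map<?, ?> nested = (Map<?, ?>) current;
                if (!nested.containsKey(segment)) {
                    throw new IllegalArgumentException("Unknown field: " + path);
                }
                current = nested.get(segment);
            }
            out.put(segments[segments.length - 1], current);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("latitude", 47.6);
        location.put("longitude", -122.3);

        Map<String, Object> event = new LinkedHashMap<>();
        event.put("userId", "u1");
        event.put("eventId", "e1");
        event.put("eventType", 3);
        event.put("location", location);

        // prints {userId=u1, latitude=47.6}
        System.out.println(select(event, List.of("userId", "location.latitude")));
    }
}
```

The sketch fails fast on unknown paths, which mirrors the general expectation that a selection must be valid against the input schema; how Beam itself reports such errors is not covered here.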
public static <T> Select.Flattened<T> flattenedSchema()
Selects every leaf-level field. Field naming in the flattened output can be adjusted with Select.Flattened.keepMostNestedFieldName() and Select.Flattened.withFieldNameAs(java.lang.String, java.lang.String).
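As a rough illustration of what selecting every leaf-level field means, here is a plain-Java sketch that flattens a nested map into a single level. It does not use Beam. The class FlattenSketch and its methods are hypothetical, and joining nested names with an underscore is an assumption about the default naming, not something stated above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Conceptual model only: rows are nested maps, not Beam Row objects.
public class FlattenSketch {

    /** Returns a single-level map containing every leaf value of the nested row. */
    public static Map<String, Object> flatten(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        flattenInto("", row, out);
        return out;
    }

    private static void flattenInto(String prefix, Map<?, ?> row, Map<String, Object> out) {
        for (Map.Entry<?, ?> entry : row.entrySet()) {
            // Assumed naming rule: parent and child names joined with "_".
            String name = prefix.isEmpty()
                ? entry.getKey().toString()
                : prefix + "_" + entry.getKey();
            if (entry.getValue() instanceof Map) {
                // Nested row: recurse so that only leaf values reach the output.
                flattenInto(name, (Map<?, ?>) entry.getValue(), out);
            } else {
                out.put(name, entry.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("latitude", 47.6);
        location.put("longitude", -122.3);

        Map<String, Object> event = new LinkedHashMap<>();
        event.put("userId", "u1");
        event.put("eventId", "e1");
        event.put("eventType", 3);
        event.put("location", location);

        // prints {userId=u1, eventId=e1, eventType=3, location_latitude=47.6, location_longitude=-122.3}
        System.out.println(flatten(event));
    }
}
```

Under the same assumptions, keeping only the innermost name would amount to dropping the prefix, which can produce collisions when two nested rows share a leaf name; that is presumably why a per-field rename option also exists.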