@Experimental(value=SCHEMAS) public class Select extends java.lang.Object
PTransform for selecting a subset of fields from a schema type.
This transform projects out a subset of fields from a schema type. The output of
this transform is of type Row, though it can be converted into any other type with a
matching schema using the Convert transform.
For example, consider the following POJO type:
@DefaultSchema(JavaFieldSchema.class)
public class UserEvent {
public String userId;
public String eventId;
public int eventType;
public Location location;
}
@DefaultSchema(JavaFieldSchema.class)
public class Location {
public double latitude;
public double longitude;
}
To select just the userId and eventId fields from each element, you would write
the following:
PCollection<UserEvent> events = readUserEvents();
PCollection<Row> rows = events.apply(Select.fieldNames("userId", "eventId"));
It's possible to select a nested field as well. For example, if you want just the location
information from each element:
PCollection<UserEvent> events = readUserEvents();
PCollection<Location> rows = events.apply(Select.fieldNames("location"))
.apply(Convert.to(Location.class));
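As a rough illustration of what selecting top-level fields by name means, the following plain-Java sketch models a row as an ordered map and projects out the requested fields. It is not Beam code: the class `ProjectionSketch`, the method `selectFields`, and the choice to throw on an unknown field name are all hypothetical and exist only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model of top-level field projection. Hypothetical; not part of Beam. */
final class ProjectionSketch {

  /** Returns a new row holding only the named fields, in the order requested. */
  static Map<String, Object> selectFields(Map<String, Object> row, String... names) {
    Map<String, Object> out = new LinkedHashMap<>();
    for (String name : names) {
      // Assumption for this sketch: an unknown field name is treated as an error.
      if (!row.containsKey(name)) {
        throw new IllegalArgumentException("Unknown field: " + name);
      }
      out.put(name, row.get(name));
    }
    return out;
  }

  public static void main(String[] args) {
    // Stand-in for one UserEvent element, with the location field left out for brevity.
    Map<String, Object> event = new LinkedHashMap<>();
    event.put("userId", "u1");
    event.put("eventId", "e7");
    event.put("eventType", 3);

    // Keeps userId and eventId, drops eventType.
    System.out.println(selectFields(event, "userId", "eventId"));
  }
}
```

The sketch copies values into a new map rather than mutating the input, which mirrors the way a transform produces a new output collection and leaves its input untouched.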
| Modifier and Type | Class and Description |
|---|---|
| static class | Select.Fields<T> |
| static class | Select.Flattened<T> A PTransform representing a flattened schema. |
| Constructor and Description |
|---|
| Select() |
| Modifier and Type | Method and Description |
|---|---|
| static <T> Select.Fields<T> | create() |
| static <T> Select.Fields<T> | fieldAccess(FieldAccessDescriptor fieldAccessDescriptor) Select a set of fields described in a FieldAccessDescriptor. |
| static <T> Select.Fields<T> | fieldIds(java.lang.Integer... ids) Select a set of top-level field ids from the row. |
| static <T> Select.Fields<T> | fieldNames(java.lang.String... names) Select a set of top-level field names from the row. |
| static <T> Select.Flattened<T> | flattenedSchema() Selects every leaf-level field. |
public static <T> Select.Fields<T> create()
public static <T> Select.Fields<T> fieldIds(java.lang.Integer... ids)
public static <T> Select.Fields<T> fieldNames(java.lang.String... names)
public static <T> Select.Fields<T> fieldAccess(FieldAccessDescriptor fieldAccessDescriptor)
Select a set of fields described in a FieldAccessDescriptor.
This allows for nested fields to be selected as well.
public static <T> Select.Flattened<T> flattenedSchema()
Selects every leaf-level field. Field naming in the flattened result can be adjusted with Select.Flattened.keepMostNestedFieldName() and Select.Flattened.withFieldNameAs(java.lang.String, java.lang.String).
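To make "every leaf-level field" concrete, here is a plain-Java sketch that flattens a nested row, modeled as nested maps, into a single level. It is not Beam's implementation. The class `FlattenSketch`, the method `flatten`, and in particular the `_` separator used to join nested names are assumptions made for illustration, since this page does not state how flattened fields are named by default.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model of flattening a nested row to its leaf fields. Hypothetical; not part of Beam. */
final class FlattenSketch {

  /** Flattens nested maps into one level, joining nested names with '_' (an assumed separator). */
  static Map<String, Object> flatten(Map<String, Object> row) {
    Map<String, Object> out = new LinkedHashMap<>();
    flattenInto("", row, out);
    return out;
  }

  private static void flattenInto(String prefix, Map<String, Object> row, Map<String, Object> out) {
    for (Map.Entry<String, Object> e : row.entrySet()) {
      String name = prefix.isEmpty() ? e.getKey() : prefix + "_" + e.getKey();
      Object value = e.getValue();
      if (value instanceof Map) {
        // A nested row: recurse so that only leaf values reach the output.
        @SuppressWarnings("unchecked")
        Map<String, Object> nested = (Map<String, Object>) value;
        flattenInto(name, nested, out);
      } else {
        out.put(name, value);
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Object> location = new LinkedHashMap<>();
    location.put("latitude", 47.6);
    location.put("longitude", -122.3);

    Map<String, Object> event = new LinkedHashMap<>();
    event.put("userId", "u1");
    event.put("location", location);

    // Prints the flattened row with keys userId, location_latitude, location_longitude.
    System.out.println(flatten(event));
  }
}
```

In this model, keeping only the most nested field name would amount to dropping the prefix when building `name`, which works only if leaf names are unique across the whole row.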