Class Select
java.lang.Object
org.apache.beam.sdk.schemas.transforms.Select
A PTransform for selecting a subset of fields from a schema type.
This transform projects out a subset of fields from a schema type. The output of
this transform is of type Row, though that can be converted into any other type with
matching schema using the Convert transform.
For example, consider the following POJO type:
@DefaultSchema(JavaFieldSchema.class)
public class UserEvent {
  public String userId;
  public String eventId;
  public int eventType;
  public Location location;
}
@DefaultSchema(JavaFieldSchema.class)
public class Location {
  public double latitude;
  public double longitude;
}
To select just the userId and eventId fields from each element, you would write
the following:
PCollection<UserEvent> events = readUserEvents();
PCollection<Row> rows = events.apply(Select.fieldNames("userId", "eventId"));
It's possible to select a nested field as well. For example, if you want just the location
information from each element:
PCollection<UserEvent> events = readUserEvents();
PCollection<Location> rows = events.apply(Select.fieldNames("location"))
    .apply(Convert.to(Location.class));
Nested Class Summary
Nested Classes
Modifier and Type    Class                  Description
static class         Select.Fields<T>
static class         Select.Flattened<T>    A PTransform representing a flattened schema.
Constructor Summary
Constructors
Select()
Method Summary
Modifier and Type, Method, Description
static <T> Select.Fields<T> create()
static <T> Select.Fields<T> fieldAccess(FieldAccessDescriptor fieldAccessDescriptor)
  Select a set of fields described in a FieldAccessDescriptor.
static <T> Select.Fields<T> fieldIds
  Select a set of top-level field ids from the row.
static <T> Select.Fields<T> fieldNames(String... names)
  Select a set of top-level field names from the row.
static <T> Select.Flattened<T> flattenedSchema
  Selects every leaf-level field.
Constructor Details
Select
public Select()
Method Details
create
fieldIds
Select a set of top-level field ids from the row.
fieldNames
Select a set of top-level field names from the row.
fieldAccess
Select a set of fields described in a FieldAccessDescriptor. This allows for nested fields to be selected as well.
flattenedSchema
Selects every leaf-level field. This results in a nested schema being flattened into a single top-level schema. By default, nested field names are concatenated with _ characters, though this can be overridden using Select.Flattened.keepMostNestedFieldName() and Select.Flattened.withFieldNameAs(java.lang.String, java.lang.String).
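The default naming rule for flattened fields can be modelled in plain Java, without Beam on the classpath. The sketch below is only an illustration of the "concatenate nested names with _" behaviour described above. FlattenNamesSketch, flatten, flattenInto, and the use of nested maps as a stand-in for a Row are all hypothetical and are not part of the Beam API; Beam itself operates on Schema and Row objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone model of the default flattened-field naming rule.
// Nested maps play the role of nested rows; this is not Beam code.
public class FlattenNamesSketch {

  // Copies every leaf value of `row` into `out`, joining nested names with '_'.
  static void flattenInto(String prefix, Map<String, Object> row, Map<String, Object> out) {
    for (Map.Entry<String, Object> entry : row.entrySet()) {
      String name = prefix.isEmpty() ? entry.getKey() : prefix + "_" + entry.getKey();
      if (entry.getValue() instanceof Map) {
        // A nested "row": recurse, carrying the accumulated name as the prefix.
        @SuppressWarnings("unchecked")
        Map<String, Object> nested = (Map<String, Object>) entry.getValue();
        flattenInto(name, nested, out);
      } else {
        // A leaf field: it becomes a top-level field in the flattened result.
        out.put(name, entry.getValue());
      }
    }
  }

  // Returns a single-level map whose keys are the flattened field names.
  static Map<String, Object> flatten(Map<String, Object> row) {
    Map<String, Object> out = new LinkedHashMap<>();
    flattenInto("", row, out);
    return out;
  }

  public static void main(String[] args) {
    Map<String, Object> location = new LinkedHashMap<>();
    location.put("latitude", 47.6);
    location.put("longitude", -122.3);

    Map<String, Object> event = new LinkedHashMap<>();
    event.put("userId", "u1");
    event.put("eventId", "e1");
    event.put("eventType", 3);
    event.put("location", location);

    // Prints: [userId, eventId, eventType, location_latitude, location_longitude]
    System.out.println(flatten(event).keySet());
  }
}
```

In this model the nested location fields surface as location_latitude and location_longitude, which is the kind of name the default rule produces. The overrides mentioned above, Select.Flattened.keepMostNestedFieldName() and Select.Flattened.withFieldNameAs(String, String), change that naming and are not modelled here.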