Class Select
java.lang.Object
org.apache.beam.sdk.schemas.transforms.Select
A PTransform for selecting a subset of fields from a schema type.
This transform allows projecting out a subset of fields from a schema type. The output of
this transform is of type Row, though that can be converted into any other type with a
matching schema using the Convert transform.
For example, consider the following POJO type:
@DefaultSchema(JavaFieldSchema.class)
public class UserEvent {
  public String userId;
  public String eventId;
  public int eventType;
  public Location location;
}

@DefaultSchema(JavaFieldSchema.class)
public class Location {
  public double latitude;
  public double longitude;
}
To select just the userId and eventId fields from each element, you would write
the following:
PCollection<UserEvent> events = readUserEvents();
PCollection<Row> rows = events.apply(Select.fieldNames("userId", "eventId"));
It's possible to select a nested field as well. For example, if you want just the location
information from each element:
PCollection<UserEvent> events = readUserEvents();
PCollection<Location> rows = events.apply(Select.fieldNames("location"))
    .apply(Convert.to(Location.class));
Nested Class Summary
Nested Classes
Modifier and Type    Class                 Description
static class         Select.Fields<T>
static class         Select.Flattened<T>   A PTransform representing a flattened schema.
Constructor Summary
Constructors
Constructor    Description
Select()
Method Summary
Modifier and Type                 Method                                                     Description
static <T> Select.Fields<T>       create()
static <T> Select.Fields<T>       fieldAccess(FieldAccessDescriptor fieldAccessDescriptor)   Select a set of fields described in a FieldAccessDescriptor.
static <T> Select.Fields<T>       fieldIds(Integer... ids)                                   Select a set of top-level field ids from the row.
static <T> Select.Fields<T>       fieldNames(String... names)                                Select a set of top-level field names from the row.
static <T> Select.Flattened<T>    flattenedSchema()                                          Selects every leaf-level field.
Constructor Details
Select
public Select()
Method Details
create
fieldIds
Select a set of top-level field ids from the row.
fieldNames
Select a set of top-level field names from the row.
fieldAccess
Select a set of fields described in a FieldAccessDescriptor. This allows for nested fields to be selected as well.
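To make the idea of selecting a nested field concrete, the following is a minimal plain-Java sketch, not Beam code. It assumes a dotted-path notation such as location.latitude for naming a nested field, and it models a record as nested maps. The class NestedFieldPath and its methods are hypothetical names introduced only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: a record is a map, and a nested record is a map inside a map.
public class NestedFieldPath {

    // Walks a dotted path such as "location.latitude" through nested maps
    // and returns the value found at the end of the path.
    public static Object resolve(Map<String, Object> record, String path) {
        Object current = record;
        for (String part : path.split("\\.")) {
            if (!(current instanceof Map)) {
                throw new IllegalArgumentException("Not a nested record at: " + part);
            }
            Map<?, ?> map = (Map<?, ?>) current;
            if (!map.containsKey(part)) {
                throw new IllegalArgumentException("No such field: " + part);
            }
            current = map.get(part);
        }
        return current;
    }

    // Builds a sample record shaped like the UserEvent type (values are made up).
    public static Map<String, Object> sampleEvent() {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("latitude", 47.6);
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("userId", "user-1");
        event.put("location", location);
        return event;
    }

    public static void main(String[] args) {
        Map<String, Object> event = sampleEvent();
        System.out.println(resolve(event, "userId"));
        System.out.println(resolve(event, "location.latitude"));
    }
}
```

The sketch only shows how a path reaches into a nested record; it says nothing about how the transform names or types the fields of its output row.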
flattenedSchema
Selects every leaf-level field. This results in a nested schema being flattened into a single top-level schema. By default nested field names will be concatenated with _ characters, though this can be overridden using Select.Flattened.keepMostNestedFieldName() and Select.Flattened.withFieldNameAs(java.lang.String, java.lang.String).
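The default naming rule described above can be illustrated with a small plain-Java model. This is a sketch of the rule only, not Beam's implementation. It represents a schema as nested maps from field name to type name, and the class FlattenNames and its methods are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of flattening: every leaf field becomes a top-level field,
// and nested names are joined with '_' characters.
public class FlattenNames {

    // Returns the flattened leaf field names in declaration order.
    public static List<String> flatten(Map<String, Object> schema) {
        List<String> out = new ArrayList<>();
        collect("", schema, out);
        return out;
    }

    private static void collect(String prefix, Map<String, Object> schema, List<String> out) {
        for (Map.Entry<String, Object> field : schema.entrySet()) {
            String name = prefix.isEmpty() ? field.getKey() : prefix + "_" + field.getKey();
            if (field.getValue() instanceof Map) {
                // A nested schema: recurse, carrying the joined name as the new prefix.
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) field.getValue();
                collect(name, nested, out);
            } else {
                // A leaf field: emit its fully joined name.
                out.add(name);
            }
        }
    }

    // A schema shaped like the UserEvent example type.
    public static Map<String, Object> userEventSchema() {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("latitude", "double");
        location.put("longitude", "double");
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("userId", "String");
        event.put("eventId", "String");
        event.put("eventType", "int");
        event.put("location", location);
        return event;
    }

    public static void main(String[] args) {
        // prints [userId, eventId, eventType, location_latitude, location_longitude]
        System.out.println(flatten(userEventSchema()));
    }
}
```

Under this model, the nested location fields surface as location_latitude and location_longitude, which matches the underscore concatenation the description gives as the default.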