Class DynamicDestinations<T,DestinationT>
- All Implemented Interfaces:
Serializable
- Direct Known Subclasses:
PortableBigQueryDestinations, StorageApiDynamicDestinationsTableRow
For example, consider a PCollection of events, each containing a user-id field. You want to write each user's events to a separate table with a separate schema per user. Since the user-id field is a string, you will represent the destination as a string.
events.apply(BigQueryIO.<UserEvent>write()
    .to(new DynamicDestinations<UserEvent, String>() {
      public String getDestination(ValueInSingleWindow<UserEvent> element) {
        return element.getValue().getUserId();
      }
      public TableDestination getTable(String user) {
        return new TableDestination(tableForUser(user), "Table for user " + user);
      }
      public TableSchema getSchema(String user) {
        return tableSchemaForUser(user);
      }
    })
    .withFormatFunction(new SerializableFunction<UserEvent, TableRow>() {
      public TableRow apply(UserEvent event) {
        return convertUserEventToTableRow(event);
      }
    }));
An instance of DynamicDestinations can also use side inputs using sideInput(PCollectionView). The side inputs must be present in getSideInputs(). Side inputs are accessed in the global window, so they must be globally windowed.
DestinationT is expected to provide proper hash and equality members. Ideally it will be a compact type with an efficient coder, as these objects may be used as a key in a GroupByKey.
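The hash and equality requirement on DestinationT can be illustrated with a small value class. This is a plain-Java sketch that does not use any Beam API: UserTableKey, GroupingDemo, and the dataset/userId fields are hypothetical names, and a HashMap merely stands in for a grouping step to show why two keys built from the same values must compare equal.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical destination key: small, immutable, with value-based equals/hashCode. */
final class UserTableKey {
  private final String dataset;
  private final String userId;

  UserTableKey(String dataset, String userId) {
    this.dataset = Objects.requireNonNull(dataset);
    this.userId = Objects.requireNonNull(userId);
  }

  String dataset() { return dataset; }

  String userId() { return userId; }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof UserTableKey)) return false;
    UserTableKey other = (UserTableKey) o;
    return dataset.equals(other.dataset) && userId.equals(other.userId);
  }

  @Override
  public int hashCode() {
    return Objects.hash(dataset, userId);
  }

  @Override
  public String toString() {
    return dataset + "." + userId;
  }
}

final class GroupingDemo {
  /** Counts elements per key with a HashMap, which relies on equals/hashCode. */
  static Map<UserTableKey, Integer> countPerKey(List<UserTableKey> keys) {
    Map<UserTableKey, Integer> counts = new HashMap<>();
    for (UserTableKey key : keys) {
      counts.merge(key, 1, Integer::sum);
    }
    return counts;
  }
}
```

Without the equals and hashCode overrides, two keys created for the same user would land in separate groups. A plain String destination, as in the class example, already satisfies the requirement.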
Constructor Summary
Constructors
DynamicDestinations()
-
Method Summary
Modifier and Type | Method | Description
abstract DestinationT | getDestination(@Nullable ValueInSingleWindow<T> element) | Returns an object that represents at a high level which table is being written to.
@Nullable Coder<DestinationT> | getDestinationCoder() | Returns the coder for DestinationT.
abstract @Nullable TableSchema | getSchema(DestinationT destination) | Returns the table schema for the destination.
List<PCollectionView<?>> | getSideInputs() | Specifies that this object needs access to one or more side inputs.
abstract TableDestination | getTable(DestinationT destination) | Returns a TableDestination object for the destination.
@Nullable TableConstraints | getTableConstraints(DestinationT destination) | Returns TableConstraints (including primary and foreign key) to be used when creating the table.
protected final <SideInputT> SideInputT | sideInput(PCollectionView<SideInputT> view) | Returns the value of a given side input.
-
Constructor Details
-
DynamicDestinations
public DynamicDestinations()
-
-
Method Details
-
getSideInputs
Specifies that this object needs access to one or more side inputs. These side inputs must be globally windowed, as they will be accessed from the global window.
-
sideInput
Returns the value of a given side input. The view must be present in getSideInputs().
-
getDestination
Returns an object that represents at a high level which table is being written to. May not return null. The method must return a distinct object for each destination table, across all BigQueryIO write transforms in the same pipeline. See https://github.com/apache/beam/issues/32335 for details.
-
getDestinationCoder
Returns the coder for DestinationT. If this is not overridden, then BigQueryIO will look in the coder registry for a suitable coder. This must be a deterministic coder, as DestinationT will be used as a key type in a GroupByKey.
-
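The determinism requirement on the coder returned by getDestinationCoder means that equal destination objects must always encode to identical bytes. The following plain-Java sketch illustrates the idea only; it does not use Beam's Coder API, and KeyEncodingDemo, its two methods, and the toy `key=value;` format are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

final class KeyEncodingDemo {
  /** Encodes entries in the map's own iteration order, so equal maps may encode differently. */
  static byte[] encodeInIterationOrder(Map<String, String> key) {
    return join(key).getBytes(StandardCharsets.UTF_8);
  }

  /** Encodes entries sorted by key, so equal maps always produce identical bytes. */
  static byte[] encodeSorted(Map<String, String> key) {
    return join(new TreeMap<>(key)).getBytes(StandardCharsets.UTF_8);
  }

  /** Toy text format for illustration; it does not escape '=' or ';' inside values. */
  private static String join(Map<String, String> entries) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : entries.entrySet()) {
      sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
    }
    return sb.toString();
  }
}
```

Two LinkedHashMap instances holding the same entries in different insertion order are equal according to equals, yet encodeInIterationOrder gives them different bytes, which is the kind of mismatch that would make a grouping key unreliable. encodeSorted removes the dependence on iteration order.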
getTable
Returns a TableDestination object for the destination. May not return null. The return value must be unique to each destination: the same TableDestination may not be returned for different destinations.
-
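The uniqueness rule for getTable can be checked in a unit test before a pipeline runs. The sketch below is a plain-Java helper under stated assumptions: TableMappingCheck and findCollisions are hypothetical names, and the table is represented by a table-spec string rather than a real TableDestination. It reports every table spec that more than one distinct destination maps to.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class TableMappingCheck {
  /**
   * Returns every table spec that more than one distinct destination maps to,
   * together with the destinations that collide on it.
   */
  static <D> Map<String, List<D>> findCollisions(
      Collection<D> destinations, Function<D, String> tableSpecFn) {
    Map<String, List<D>> byTable = new LinkedHashMap<>();
    for (D destination : destinations) {
      List<D> mapped =
          byTable.computeIfAbsent(tableSpecFn.apply(destination), k -> new ArrayList<>());
      if (!mapped.contains(destination)) {
        mapped.add(destination);
      }
    }
    // Keep only the table specs shared by two or more distinct destinations.
    byTable.values().removeIf(list -> list.size() < 2);
    return byTable;
  }
}
```

For example, a mapping that lower-cases user ids would send the destinations "Alice" and "alice" to the same table, and findCollisions would report that table spec with both destinations. An empty result means the sampled destinations map to distinct tables.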
getSchema
Returns the table schema for the destination.
-
getTableConstraints
Returns TableConstraints (including primary and foreign key) to be used when creating the table. Note: this is not currently supported when using FILE_LOADS.
-