Class BigQueryStorageTableSource<T>
java.lang.Object
org.apache.beam.sdk.io.Source<T>
org.apache.beam.sdk.io.BoundedSource<T>
org.apache.beam.sdk.io.gcp.bigquery.BigQueryStorageTableSource<T>
- All Implemented Interfaces:
Serializable
,HasDisplayData
A
Source
representing reading from a table.- See Also:
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.beam.sdk.io.BoundedSource
BoundedSource.BoundedReader<T>
Nested classes/interfaces inherited from class org.apache.beam.sdk.io.Source
Source.Reader<T>
-
Field Summary
FieldsModifier and TypeFieldDescriptionprotected final BigQueryServices
protected final @Nullable com.google.cloud.bigquery.storage.v1.DataFormat
protected final SerializableFunction
<SchemaAndRecord, T> protected final @Nullable ValueProvider
<String> protected final @Nullable ValueProvider
<List<String>> -
Method Summary
Modifier and TypeMethodDescriptionstatic <T> BigQueryStorageTableSource
<T> create
(ValueProvider<TableReference> tableRefProvider, com.google.cloud.bigquery.storage.v1.DataFormat format, @Nullable ValueProvider<List<String>> selectedFields, @Nullable ValueProvider<String> rowRestriction, SerializableFunction<SchemaAndRecord, T> parseFn, Coder<T> outputCoder, BigQueryServices bqServices, boolean projectionPushdownApplied) static <T> BigQueryStorageTableSource
<T> create
(ValueProvider<TableReference> tableRefProvider, @Nullable ValueProvider<List<String>> selectedFields, @Nullable ValueProvider<String> rowRestriction, SerializableFunction<SchemaAndRecord, T> parseFn, Coder<T> outputCoder, BigQueryServices bqServices) createReader
(PipelineOptions options) Returns a newBoundedSource.BoundedReader
that reads from this source.long
getEstimatedSizeBytes
(PipelineOptions options) An estimate of the total size (in bytes) of the data that would be read from this source.Returns theCoder
to use for the data read from this source.protected Table
getTargetTable
(BigQueryOptions options) Returns the table to read from at split time.protected String
getTargetTableId
(BigQueryOptions options) void
populateDisplayData
(DisplayData.Builder builder) Register display data for the given transform or component.split
(long desiredBundleSizeBytes, PipelineOptions options) Splits the source into bundles of approximatelydesiredBundleSizeBytes
.Methods inherited from class org.apache.beam.sdk.io.Source
getDefaultOutputCoder, validate
-
Field Details
-
format
-
selectedFieldsProvider
-
rowRestrictionProvider
-
parseFn
-
outputCoder
-
bqServices
-
-
Method Details
-
create
public static <T> BigQueryStorageTableSource<T> create(ValueProvider<TableReference> tableRefProvider, com.google.cloud.bigquery.storage.v1.DataFormat format, @Nullable ValueProvider<List<String>> selectedFields, @Nullable ValueProvider<String> rowRestriction, SerializableFunction<SchemaAndRecord, T> parseFn, Coder<T> outputCoder, BigQueryServices bqServices, boolean projectionPushdownApplied) -
create
public static <T> BigQueryStorageTableSource<T> create(ValueProvider<TableReference> tableRefProvider, @Nullable ValueProvider<List<String>> selectedFields, @Nullable ValueProvider<String> rowRestriction, SerializableFunction<SchemaAndRecord, T> parseFn, Coder<T> outputCoder, BigQueryServices bqServices) -
populateDisplayData
Description copied from class:Source
Register display data for the given transform or component.populateDisplayData(DisplayData.Builder)
is invoked by Pipeline runners to collect display data viaDisplayData.from(HasDisplayData)
. Implementations may callsuper.populateDisplayData(builder)
in order to register display data in the current namespace, but should otherwise usesubcomponent.populateDisplayData(builder)
to use the namespace of the subcomponent.By default, does not register any display data. Implementors may override this method to provide their own display data.
- Specified by:
populateDisplayData
in interfaceHasDisplayData
- Overrides:
populateDisplayData
in classSource<T>
- Parameters:
builder
- The builder to populate with display data.- See Also:
-
getEstimatedSizeBytes
Description copied from class:BoundedSource
An estimate of the total size (in bytes) of the data that would be read from this source. This estimate is in terms of external storage size, before any decompression or other processing done by the reader.If there is no way to estimate the size of the source implementations MAY return 0L.
- Specified by:
getEstimatedSizeBytes
in classBoundedSource<T>
- Throws:
Exception
-
getTargetTableId
- Throws:
Exception
-
getTargetTable
Returns the table to read from at split time. This is currently never an anonymous table, but it can be a named table which was created to hold the results of a query.- Throws:
Exception
-
getOutputCoder
Description copied from class:Source
Returns theCoder
to use for the data read from this source.- Overrides:
getOutputCoder
in classSource<T>
-
split
public List<org.apache.beam.sdk.io.gcp.bigquery.BigQueryStorageStreamSource<T>> split(long desiredBundleSizeBytes, PipelineOptions options) throws Exception Description copied from class:BoundedSource
Splits the source into bundles of approximatelydesiredBundleSizeBytes
.- Specified by:
split
in classBoundedSource<T>
- Throws:
Exception
-
createReader
Description copied from class:BoundedSource
Returns a newBoundedSource.BoundedReader
that reads from this source.- Specified by:
createReader
in classBoundedSource<T>
- Throws:
IOException
-