@Experimental(value=SOURCE_SINK) public class BigQueryStorageStreamSource<T> extends OffsetBasedSource<T>
Source
representing a single stream in a read session.Modifier and Type | Class and Description |
---|---|
static class |
BigQueryStorageStreamSource.BigQueryStorageStreamReader<T>
A
Source.Reader which reads records from a stream. |
OffsetBasedSource.OffsetBasedReader<T>
BoundedSource.BoundedReader<T>
Source.Reader<T>
Modifier and Type | Method and Description |
---|---|
static <T> BigQueryStorageStreamSource<T> |
create(com.google.cloud.bigquery.storage.v1beta1.Storage.ReadSession readSession,
com.google.cloud.bigquery.storage.v1beta1.Storage.Stream stream,
TableSchema tableSchema,
SerializableFunction<SchemaAndRecord,T> parseFn,
Coder<T> outputCoder,
BigQueryServices bqServices) |
BigQueryStorageStreamSource.BigQueryStorageStreamReader<T> |
createReader(PipelineOptions options)
Returns a new
BoundedSource.BoundedReader that reads from this source. |
OffsetBasedSource<T> |
createSourceForSubrange(long start,
long end)
Returns an
OffsetBasedSource for a subrange of the current source. |
long |
getEstimatedSizeBytes(PipelineOptions options)
An estimate of the total size (in bytes) of the data that would be read from this source.
|
long |
getMaxEndOffset(PipelineOptions options)
Returns the actual ending offset of the current source.
|
Coder<T> |
getOutputCoder()
Returns the
Coder to use for the data read from this source. |
void |
populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.
|
java.util.List<? extends OffsetBasedSource<T>> |
split(long desiredBundleSizeBytes,
PipelineOptions options)
Splits the source into bundles of approximately
desiredBundleSizeBytes . |
getBytesPerOffset, getEndOffset, getMinBundleSize, getStartOffset, toString, validate
getDefaultOutputCoder
public static <T> BigQueryStorageStreamSource<T> create(com.google.cloud.bigquery.storage.v1beta1.Storage.ReadSession readSession, com.google.cloud.bigquery.storage.v1beta1.Storage.Stream stream, TableSchema tableSchema, SerializableFunction<SchemaAndRecord,T> parseFn, Coder<T> outputCoder, BigQueryServices bqServices)
public Coder<T> getOutputCoder()
Source
Coder
to use for the data read from this source.getOutputCoder
in class Source<T>
public void populateDisplayData(DisplayData.Builder builder)
Source
populateDisplayData(DisplayData.Builder)
is invoked by Pipeline runners to collect
display data via DisplayData.from(HasDisplayData)
. Implementations may call super.populateDisplayData(builder)
in order to register display data in the current namespace,
but should otherwise use subcomponent.populateDisplayData(builder)
to use the namespace
of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
populateDisplayData
in interface HasDisplayData
populateDisplayData
in class OffsetBasedSource<T>
builder
- The builder to populate with display data.HasDisplayData
public long getEstimatedSizeBytes(PipelineOptions options)
BoundedSource
If there is no way to estimate the size of the source implementations MAY return 0L.
getEstimatedSizeBytes
in class OffsetBasedSource<T>
public java.util.List<? extends OffsetBasedSource<T>> split(long desiredBundleSizeBytes, PipelineOptions options)
BoundedSource
desiredBundleSizeBytes
.split
in class OffsetBasedSource<T>
public long getMaxEndOffset(PipelineOptions options)
OffsetBasedSource
[startOffset, endOffset)
such that the range
used is [startOffset, min(endOffset, maxEndOffset))
.
As an example in which OffsetBasedSource
is used to implement a file source, suppose
that this source was constructed with an endOffset
of Long.MAX_VALUE
to
indicate that a file should be read to the end. Then this function should determine the actual,
exact size of the file in bytes and return it.
getMaxEndOffset
in class OffsetBasedSource<T>
public OffsetBasedSource<T> createSourceForSubrange(long start, long end)
OffsetBasedSource
OffsetBasedSource
for a subrange of the current source. The subrange [start, end)
must be within the range [startOffset, endOffset)
of the current source,
i.e. startOffset <= start < end <= endOffset
.createSourceForSubrange
in class OffsetBasedSource<T>
public BigQueryStorageStreamSource.BigQueryStorageStreamReader<T> createReader(PipelineOptions options) throws java.io.IOException
BoundedSource
BoundedSource.BoundedReader
that reads from this source.createReader
in class BoundedSource<T>
java.io.IOException