Class UnboundedSourceImpl
java.lang.Object
org.apache.beam.sdk.io.Source<com.google.cloud.pubsublite.proto.SequencedMessage>
org.apache.beam.sdk.io.UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl>
org.apache.beam.sdk.io.gcp.pubsublite.internal.UnboundedSourceImpl
- All Implemented Interfaces:
Serializable
,HasDisplayData
public class UnboundedSourceImpl
extends UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl>
- See Also:
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.beam.sdk.io.UnboundedSource
UnboundedSource.CheckpointMark, UnboundedSource.UnboundedReader<OutputT>
Nested classes/interfaces inherited from class org.apache.beam.sdk.io.Source
Source.Reader<T>
-
Method Summary
Modifier and TypeMethodDescriptionUnboundedSource.UnboundedReader
<com.google.cloud.pubsublite.proto.SequencedMessage> createReader
(PipelineOptions options, @Nullable CheckpointMarkImpl checkpointMark) Create a newUnboundedSource.UnboundedReader
to read from this source, resuming from the given checkpoint if present.Returns aCoder
for encoding and decoding the checkpoints for this source.Coder
<com.google.cloud.pubsublite.proto.SequencedMessage> Returns theCoder
to use for the data read from this source.List
<? extends UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage, CheckpointMarkImpl>> split
(int desiredNumSplits, PipelineOptions options) Returns a list ofUnboundedSource
objects representing the instances of this source that should be used when executing the workflow.Methods inherited from class org.apache.beam.sdk.io.UnboundedSource
offsetBasedDeduplicationSupported, requiresDeduping
Methods inherited from class org.apache.beam.sdk.io.Source
getDefaultOutputCoder, populateDisplayData, validate
-
Method Details
-
split
public List<? extends UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl>> split(int desiredNumSplits, PipelineOptions options) throws Exception Description copied from class:UnboundedSource
Returns a list ofUnboundedSource
objects representing the instances of this source that should be used when executing the workflow. Each split should return a separate partition of the input data.For example, for a source reading from a growing directory of files, each split could correspond to a prefix of file names.
Some sources are not splittable, such as reading from a single TCP stream. In that case, only a single split should be returned.
Some data sources automatically partition their data among readers. For these types of inputs,
n
identical replicas of the top-level source can be returned.The size of the returned list should be as close to
desiredNumSplits
as possible, but does not have to match exactly. A low number of splits will limit the amount of parallelism in the source.- Specified by:
split
in classUnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,
CheckpointMarkImpl> - Throws:
Exception
-
createReader
public UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage> createReader(PipelineOptions options, @Nullable CheckpointMarkImpl checkpointMark) throws IOException Description copied from class:UnboundedSource
Create a newUnboundedSource.UnboundedReader
to read from this source, resuming from the given checkpoint if present.- Specified by:
createReader
in classUnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,
CheckpointMarkImpl> - Throws:
IOException
-
getCheckpointMarkCoder
Description copied from class:UnboundedSource
Returns aCoder
for encoding and decoding the checkpoints for this source.- Specified by:
getCheckpointMarkCoder
in classUnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,
CheckpointMarkImpl>
-
getOutputCoder
Description copied from class:Source
Returns theCoder
to use for the data read from this source.- Overrides:
getOutputCoder
in classSource<com.google.cloud.pubsublite.proto.SequencedMessage>
-