public class UnboundedSourceImpl extends UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl>
UnboundedSource.CheckpointMark, UnboundedSource.UnboundedReader<OutputT>Source.Reader<T>| Modifier and Type | Method and Description |
|---|---|
UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage> |
createReader(PipelineOptions options,
@Nullable CheckpointMarkImpl checkpointMark)
Create a new
UnboundedSource.UnboundedReader to read from this source, resuming from the given
checkpoint if present. |
Coder<CheckpointMarkImpl> |
getCheckpointMarkCoder()
Returns a
Coder for encoding and decoding the checkpoints for this source. |
Coder<com.google.cloud.pubsublite.proto.SequencedMessage> |
getOutputCoder()
Returns the
Coder to use for the data read from this source. |
java.util.List<? extends UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl>> |
split(int desiredNumSplits,
PipelineOptions options)
Returns a list of
UnboundedSource objects representing the instances of this source
that should be used when executing the workflow. |
requiresDedupinggetDefaultOutputCoder, populateDisplayData, validatepublic java.util.List<? extends UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl>> split(int desiredNumSplits, PipelineOptions options) throws java.lang.Exception
UnboundedSourceUnboundedSource objects representing the instances of this source
that should be used when executing the workflow. Each split should return a separate partition
of the input data.
For example, for a source reading from a growing directory of files, each split could correspond to a prefix of file names.
Some sources are not splittable, such as reading from a single TCP stream. In that case, only a single split should be returned.
Some data sources automatically partition their data among readers. For these types of
inputs, n identical replicas of the top-level source can be returned.
The size of the returned list should be as close to desiredNumSplits as possible,
but does not have to match exactly. A low number of splits will limit the amount of parallelism
in the source.
split in class UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl>java.lang.Exceptionpublic UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage> createReader(PipelineOptions options, @Nullable CheckpointMarkImpl checkpointMark) throws java.io.IOException
UnboundedSourceUnboundedSource.UnboundedReader to read from this source, resuming from the given
checkpoint if present.createReader in class UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl>java.io.IOExceptionpublic Coder<CheckpointMarkImpl> getCheckpointMarkCoder()
UnboundedSourceCoder for encoding and decoding the checkpoints for this source.getCheckpointMarkCoder in class UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl>public Coder<com.google.cloud.pubsublite.proto.SequencedMessage> getOutputCoder()
SourceCoder to use for the data read from this source.getOutputCoder in class Source<com.google.cloud.pubsublite.proto.SequencedMessage>