public class UnboundedReaderImpl extends UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage>
BACKLOG_UNKNOWN| Modifier and Type | Method and Description |
|---|---|
boolean |
advance()
Advances the reader to the next valid record.
|
void |
close()
Closes the reader.
|
CheckpointMarkImpl |
getCheckpointMark()
Returns a
UnboundedSource.CheckpointMark representing the progress of this UnboundedReader. |
com.google.cloud.pubsublite.proto.SequencedMessage |
getCurrent()
Returns the value of the data item that was read by the last
Source.Reader.start() or Source.Reader.advance() call. |
UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl> |
getCurrentSource()
Returns the
UnboundedSource that created this reader. |
Instant |
getCurrentTimestamp()
Returns the timestamp associated with the current data item.
|
long |
getSplitBacklogBytes()
Returns the size of the backlog of unread data in the underlying data source represented by
this split of this source.
|
Instant |
getWatermark()
Returns a timestamp before or at the timestamps of all future elements read by this reader.
|
boolean |
start()
Initializes the reader and advances the reader to the first record.
|
getCurrentRecordId, getTotalBacklogBytespublic com.google.cloud.pubsublite.proto.SequencedMessage getCurrent()
throws java.util.NoSuchElementException
Source.ReaderSource.Reader.start() or Source.Reader.advance() call. The returned value must be effectively immutable and remain valid
indefinitely.
Multiple calls to this method without an intervening call to Source.Reader.advance() should
return the same result.
getCurrent in class Source.Reader<com.google.cloud.pubsublite.proto.SequencedMessage>java.util.NoSuchElementException - if Source.Reader.start() was never called, or if the last
Source.Reader.start() or Source.Reader.advance() returned false.public Instant getCurrentTimestamp() throws java.util.NoSuchElementException
Source.ReaderIf the source does not support timestamps, this should return BoundedWindow.TIMESTAMP_MIN_VALUE.
Multiple calls to this method without an intervening call to Source.Reader.advance() should
return the same result.
getCurrentTimestamp in class Source.Reader<com.google.cloud.pubsublite.proto.SequencedMessage>java.util.NoSuchElementException - if the reader is at the beginning of the input and Source.Reader.start() or Source.Reader.advance() wasn't called, or if the last Source.Reader.start() or Source.Reader.advance() returned false.public void close()
throws java.io.IOException
Source.Readerclose in interface java.lang.AutoCloseableclose in class Source.Reader<com.google.cloud.pubsublite.proto.SequencedMessage>java.io.IOExceptionpublic boolean start()
throws java.io.IOException
UnboundedSource.UnboundedReaderThis method will be called exactly once. The invocation will occur prior to calling UnboundedSource.UnboundedReader.advance() or Source.Reader.getCurrent(). This method may perform expensive operations that are
needed to initialize the reader.
Returns true if a record was read, false if there is no more input
currently available. Future calls to UnboundedSource.UnboundedReader.advance() may return true once more data
is available. Regardless of the return value of start, start will not be
called again on the same UnboundedReader object; it will only be called again when a
new reader object is constructed for the same source, e.g. on recovery.
start in class UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage>true if a record was read, false if there is no more input available.java.io.IOExceptionpublic boolean advance()
throws java.io.IOException
UnboundedSource.UnboundedReaderReturns true if a record was read, false if there is no more input
available. Future calls to UnboundedSource.UnboundedReader.advance() may return true once more data is
available.
advance in class UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage>true if a record was read, false if there is no more input available.java.io.IOExceptionpublic Instant getWatermark()
UnboundedSource.UnboundedReaderThis can be approximate. If records are read that violate this guarantee, they will be
considered late, which will affect how they will be processed. See Window for more information on late data and how to
handle it.
However, this value should be as late as possible. Downstream windows may not be able to close until this watermark passes their end.
For example, a source may know that the records it reads will be in timestamp order. In this case, the watermark can be the timestamp of the last record read. For a source that does not have natural timestamps, timestamps can be set to the time of reading, in which case the watermark is the current clock time.
See Window and Trigger for more information on timestamps and
watermarks.
May be called after UnboundedSource.UnboundedReader.advance() or UnboundedSource.UnboundedReader.start() has returned false, but not before
UnboundedSource.UnboundedReader.start() has been called.
getWatermark in class UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage>public CheckpointMarkImpl getCheckpointMark()
UnboundedSource.UnboundedReaderUnboundedSource.CheckpointMark representing the progress of this UnboundedReader.
If this UnboundedReader does not support checkpoints, it may return a
CheckpointMark which does nothing, like:
public UnboundedSource.CheckpointMark getCheckpointMark() {
return new UnboundedSource.CheckpointMark() {
public void finalizeCheckpoint() throws IOException {
// nothing to do
}
};
}
All elements read between the last time this method was called (or since this reader was
created, if this method has not been called on this reader) until this method is called will
be processed together as a bundle. (An element is considered 'read' if it could be returned
by a call to Source.Reader.getCurrent().)
Once the result of processing those elements and the returned checkpoint have been durably
committed, UnboundedSource.CheckpointMark.finalizeCheckpoint() will be called at most once at some
later point on the returned UnboundedSource.CheckpointMark object. Checkpoint finalization is
best-effort, and checkpoints may not be finalized. If duplicate elements may be produced if
checkpoints are not finalized in a timely manner, UnboundedSource.requiresDeduping()
should be overridden to return true, and UnboundedSource.UnboundedReader.getCurrentRecordId() should
be overridden to return unique record IDs.
A checkpoint will be committed to durable storage only if all all previous checkpoints produced by the same reader have also been committed.
The returned object should not be modified.
May not be called before UnboundedSource.UnboundedReader.start() has been called.
getCheckpointMark in class UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage>public UnboundedSource<com.google.cloud.pubsublite.proto.SequencedMessage,CheckpointMarkImpl> getCurrentSource()
UnboundedSource.UnboundedReaderUnboundedSource that created this reader. This will not change over the
life of the reader.getCurrentSource in class UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage>public long getSplitBacklogBytes()
UnboundedSource.UnboundedReaderOne of this or UnboundedSource.UnboundedReader.getTotalBacklogBytes() should be overridden in order to allow the
runner to scale the amount of resources allocated to the pipeline.
getSplitBacklogBytes in class UnboundedSource.UnboundedReader<com.google.cloud.pubsublite.proto.SequencedMessage>