public abstract static class Source.Reader<T>
extends java.lang.Object
implements java.lang.AutoCloseable
This interface is deliberately distinct from Iterator
because
the current model tends to be easier to program and more efficient in practice
for iterating over sources such as files, databases etc. (rather than pure collections).
Reading data from the Source.Reader
must obey the following access pattern:
start()
start()
returned true, any number of calls to getCurrent
*
methodsadvance()
. This may be called regardless
of what the previous start()
/advance()
returned.
advance()
returned true, any number of calls to getCurrent
*
methodsFor example, if the reader is reading a fixed set of data:
try { for (boolean available = reader.start(); available; available = reader.advance()) { T item = reader.getCurrent(); Instant timestamp = reader.getCurrentTimestamp(); ... } } finally { reader.close(); }
If the set of data being read is continually growing:
try { boolean available = reader.start(); while (true) { if (available) { T item = reader.getCurrent(); Instant timestamp = reader.getCurrentTimestamp(); ... resetExponentialBackoff(); } else { exponentialBackoff(); } available = reader.advance(); } } finally { reader.close(); }
Note: this interface is a work-in-progress and may change.
All Reader
functions except getCurrentSource()
do not need to be thread-safe;
they may only be accessed by a single thread at once. However, getCurrentSource()
needs
to be thread-safe, and other functions should assume that its returned value can change
asynchronously.
Constructor and Description |
---|
Reader() |
Modifier and Type | Method and Description |
---|---|
abstract boolean |
advance()
Advances the reader to the next valid record.
|
abstract void |
close()
Closes the reader.
|
abstract T |
getCurrent()
|
abstract Source<T> |
getCurrentSource()
Returns a
Source describing the same input that this Reader currently reads
(including items already read). |
abstract Instant |
getCurrentTimestamp()
Returns the timestamp associated with the current data item.
|
abstract boolean |
start()
Initializes the reader and advances the reader to the first record.
|
public abstract boolean start() throws java.io.IOException
This method should be called exactly once. The invocation should occur prior to calling
advance()
or getCurrent()
. This method may perform expensive operations that
are needed to initialize the reader.
true
if a record was read, false
if there is no more input available.java.io.IOException
public abstract boolean advance() throws java.io.IOException
It is an error to call this without having called start()
first.
true
if a record was read, false
if there is no more input available.java.io.IOException
public abstract T getCurrent() throws java.util.NoSuchElementException
public abstract Instant getCurrentTimestamp() throws java.util.NoSuchElementException
If the source does not support timestamps, this should return
BoundedWindow.TIMESTAMP_MIN_VALUE
.
Multiple calls to this method without an intervening call to advance()
should
return the same result.
public abstract void close() throws java.io.IOException
close
in interface java.lang.AutoCloseable
java.io.IOException
public abstract Source<T> getCurrentSource()
Source
describing the same input that this Reader
currently reads
(including items already read).
Usually, an implementation will simply return the immutable Source
object from
which the current Source.Reader
was constructed, or delegate to the base class.
However, when using or implementing this method on a BoundedSource.BoundedReader
,
special considerations apply, see documentation for
BoundedSource.BoundedReader.getCurrentSource()
.