@Experimental(value=SOURCE_SINK) protected abstract static class BlockBasedSource.BlockBasedReader<T> extends FileBasedSource.FileBasedReader<T>
Reader
that reads records from a BlockBasedSource
. If the source is a
subrange of a file, the blocks that will be read by this reader are those such that the first
byte of the block is within the range [start, end)
.SPLIT_POINTS_UNKNOWN
Modifier | Constructor and Description |
---|---|
protected |
BlockBasedReader(BlockBasedSource<T> source) |
Modifier and Type | Method and Description |
---|---|
T |
getCurrent()
Returns the value of the data item that was read by the last
Source.Reader.start() or
Source.Reader.advance() call. |
abstract BlockBasedSource.Block<T> |
getCurrentBlock()
Returns the current block (the block that was read by the last successful call to
readNextBlock() ). |
abstract long |
getCurrentBlockOffset()
Returns the largest offset such that starting to read from that offset includes the current
block.
|
abstract long |
getCurrentBlockSize()
Returns the size of the current block in bytes as it is represented in the underlying file,
if possible.
|
protected long |
getCurrentOffset()
Returns the starting offset of the
current record ,
which has been read by the last successful Source.Reader.start() or
Source.Reader.advance() call. |
java.lang.Double |
getFractionConsumed()
Returns a value in [0, 1] representing approximately what fraction of the
current source this reader has read so far, or null if such
an estimate is not available. |
protected boolean |
isAtSplitPoint()
Returns true if the reader is at a split point.
|
abstract boolean |
readNextBlock()
Read the next block from the input.
|
protected boolean |
readNextRecord()
Reads the next record from the
current block if
possible. |
advanceImpl, allowsDynamicSplitting, close, getCurrentSource, startImpl, startReading
advance, getSplitPointsConsumed, getSplitPointsRemaining, isDone, isStarted, splitAtFraction, start
getCurrentTimestamp
protected BlockBasedReader(BlockBasedSource<T> source)
public abstract boolean readNextBlock() throws java.io.IOException
java.io.IOException
@Nullable public abstract BlockBasedSource.Block<T> getCurrentBlock()
readNextBlock()
). May return null initially, or if no block has been
successfully read.public abstract long getCurrentBlockSize()
0
if the size of the current block is unknown.
The size returned by this method must be such that for two successive blocks A and B,
offset(A) + size(A) <= offset(B)
. If this is not satisfied, the progress reported
by the BlockBasedReader
will be non-monotonic and will interfere with the quality
(but not correctness) of dynamic work rebalancing.
This method and BlockBasedSource.Block.getFractionOfBlockConsumed()
are used to provide an estimate
of progress within a block (getCurrentBlock().getFractionOfBlockConsumed() *
getCurrentBlockSize()
). It is acceptable for the result of this computation to be 0
,
but progress estimation will be inaccurate.
public abstract long getCurrentBlockOffset()
public final T getCurrent() throws java.util.NoSuchElementException
Source.Reader
Source.Reader.start()
or
Source.Reader.advance()
call. The returned value must be effectively immutable and remain valid
indefinitely.
Multiple calls to this method without an intervening call to Source.Reader.advance()
should
return the same result.
getCurrent
in class Source.Reader<T>
java.util.NoSuchElementException
- if Source.Reader.start()
was never called, or if
the last Source.Reader.start()
or Source.Reader.advance()
returned false
.protected boolean isAtSplitPoint()
BlockBasedReader
is at a split
point if the current record is the first record in a block. In other words, split points
are block boundaries.isAtSplitPoint
in class OffsetBasedSource.OffsetBasedReader<T>
protected final boolean readNextRecord() throws java.io.IOException
current block
if
possible. Will call readNextBlock()
to advance to the next block if not.
The first record read from a block is treated as a split point.
readNextRecord
in class FileBasedSource.FileBasedReader<T>
true
if a record was successfully read, false
if the end of the
channel was reached before successfully reading a new record.java.io.IOException
@Nullable public java.lang.Double getFractionConsumed()
BoundedSource.BoundedReader
current source
this reader has read so far, or null
if such
an estimate is not available.
It is recommended that this method should satisfy the following properties:
Source.Reader.start()
call.
Source.Reader.start()
or Source.Reader.advance()
call that returns false.
By default, returns null to indicate that this cannot be estimated.
BoundedSource.BoundedReader.splitAtFraction(double)
is implemented, this method can be called concurrently to other
methods (including itself), and it is therefore critical for it to be implemented
in a thread-safe way.getFractionConsumed
in class OffsetBasedSource.OffsetBasedReader<T>
protected long getCurrentOffset()
OffsetBasedSource.OffsetBasedReader
current record
,
which has been read by the last successful Source.Reader.start()
or
Source.Reader.advance()
call.
If no such call has been made yet, the return value is unspecified.
See RangeTracker
for description of offset semantics.
getCurrentOffset
in class OffsetBasedSource.OffsetBasedReader<T>