public class ByteKeyRangeTracker extends RestrictionTracker<ByteKeyRange,ByteKey> implements RestrictionTracker.HasProgress
A RestrictionTracker for claiming ByteKeys in a ByteKeyRange in a monotonically increasing fashion. The range is a semi-open bounded interval [startKey, endKey) where the limits are both represented by ByteKey.EMPTY.
Note that one can complete a range by claiming ByteKey.EMPTY once one runs out of keys.
RestrictionTracker.HasProgress, RestrictionTracker.IsBounded, RestrictionTracker.Progress, RestrictionTracker.TruncateResult<RestrictionT>
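|Modifier and Type|Method and Description|
|---|---|
|void|checkDone(): Checks whether the restriction has been fully processed.|
|ByteKeyRange|currentRestriction(): Returns a restriction accurately describing the full range of work the current DoFn.ProcessElement call will do, including already completed work.|
|RestrictionTracker.Progress|getProgress(): A representation for the amount of known completed and known remaining work.|
|RestrictionTracker.IsBounded|isBounded(): Return the boundedness of the current restriction.|
|boolean|tryClaim(ByteKey key): Attempts to claim the given key.|
|SplitResult<ByteKeyRange>|trySplit(double fractionOfRemainder): Splits current restriction based on fractionOfRemainder.|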
public static ByteKeyRangeTracker of(ByteKeyRange range)
public ByteKeyRange currentRestriction()
Returns a restriction accurately describing the full range of work the current DoFn.ProcessElement call will do, including already completed work.
The restriction returned by this method may be updated dynamically due to concurrent invocation of other methods of the RestrictionTracker, for example trySplit(double).
This method is required to be implemented.
public SplitResult<ByteKeyRange> trySplit(double fractionOfRemainder)
Splits current restriction based on fractionOfRemainder.
If splitting the current restriction is possible, the current restriction is split into a primary and residual restriction pair. This invocation updates RestrictionTracker.currentRestriction() to be the primary restriction, effectively making the current DoFn.ProcessElement execution responsible for performing the work that the primary restriction represents. The residual restriction will be executed in a separate DoFn.ProcessElement invocation (likely in a different process). The work performed by executing the primary and residual restrictions as separate DoFn.ProcessElement invocations MUST be equivalent to the work performed as if this split never occurred.
fractionOfRemainder should be used in a best effort manner to choose a primary and residual restriction based upon the fraction of the remaining work that the current DoFn.ProcessElement invocation is responsible for. For example, if a DoFn.ProcessElement was reading a file with a restriction representing the offset range [100, 200) and has processed up to offset 130 with a fractionOfRemainder of 0.7, the primary and residual restrictions returned would be [100, 179) and [179, 200), since currentOffset + fractionOfRemainder * remainingWork = 130 + 0.7 * 70 = 179.
fractionOfRemainder = 0 means a checkpoint is required.
The API is recommended to be implemented for a batch pipeline, both to improve parallel processing performance and because it is very important for pipeline scaling and end-to-end pipeline execution.
The API is required to be implemented for a streaming pipeline.
Parameters: fractionOfRemainder - A hint as to the fraction of work the primary restriction should represent based upon the current known remaining amount of work.
Returns: a SplitResult if a split was possible, otherwise null. If fractionOfRemainder == 0, a null result MUST imply that the restriction tracker is done and there is no more work left to do.
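The split arithmetic from the offset-range example can be sketched in plain Java. This is only an illustration of the formula currentOffset + fractionOfRemainder * remainingWork, not the Beam implementation: the class `OffsetSplitSketch` and its method `splitPoint` are hypothetical names, and rounding to the nearest offset is an assumption of this sketch.

```java
// Hypothetical helper, not part of the Beam API. It models only the split
// arithmetic from the offset-range example, using integer offsets.
final class OffsetSplitSketch {

  /**
   * Returns the offset at which a restriction ending at {@code end} (exclusive)
   * would be cut into a primary [start, split) and a residual [split, end).
   */
  static long splitPoint(long currentOffset, long end, double fractionOfRemainder) {
    long remainingWork = end - currentOffset;
    if (remainingWork <= 0) {
      return end; // nothing remains, so the primary keeps the whole range
    }
    // Assumption of this sketch: round to the nearest whole offset.
    return currentOffset + Math.round(fractionOfRemainder * remainingWork);
  }

  public static void main(String[] args) {
    long split = splitPoint(130, 200, 0.7);
    System.out.println("primary  = [100, " + split + ")");
    System.out.println("residual = [" + split + ", 200)");

    // With fractionOfRemainder = 0 the primary ends at the current offset,
    // which corresponds to the checkpoint case.
    System.out.println("checkpoint split = " + splitPoint(130, 200, 0.0));
  }
}
```

In this model a fractionOfRemainder of 0 leaves all remaining work in the residual, which matches the statement that 0 means a checkpoint is required.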
public boolean tryClaim(ByteKey key)
Attempts to claim the given key.
The key must be larger than the last attempted key. Since this restriction tracker represents a range over a semi-open bounded interval [start, end), the last key that was attempted may have failed but still have consumed the interval [lastAttemptedKey, end), since this range tracker processes keys in a monotonically increasing order. Note that passing in ByteKey.EMPTY claims all keys to the end of range and can only be claimed once.
Returns: true if the key was successfully claimed, false if it is outside the current ByteKeyRange of this tracker.
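The claiming rules for tryClaim can be illustrated with a small self-contained model. This is a simplified sketch, not the Beam class: `ClaimSketch` is a hypothetical name, keys are plain byte arrays, an empty array stands in for ByteKey.EMPTY, and the choices to return false when the empty key is claimed and to reject any claim after it are assumptions of this sketch.

```java
// Simplified, hypothetical model of the claiming rules; not the Beam implementation.
final class ClaimSketch {
  private final byte[] start;
  private final byte[] end;      // empty array: no upper limit
  private byte[] lastAttempted;  // null until the first attempt

  ClaimSketch(byte[] start, byte[] end) {
    this.start = start;
    this.end = end;
  }

  /** Lexicographic comparison that treats each byte as an unsigned value. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  boolean tryClaim(byte[] key) {
    if (lastAttempted != null && lastAttempted.length == 0) {
      throw new IllegalArgumentException("the empty key was already claimed");
    }
    if (key.length == 0) {
      // The empty key consumes everything up to the end of the range.
      lastAttempted = key;
      return false; // assumption of this sketch: no key is left to process
    }
    if (lastAttempted != null && compare(key, lastAttempted) <= 0) {
      throw new IllegalArgumentException("keys must be claimed in increasing order");
    }
    if (compare(key, start) < 0) {
      throw new IllegalArgumentException("key lies before the start of the range");
    }
    // Even a failed attempt is recorded, so it consumes [key, end).
    lastAttempted = key;
    return end.length == 0 || compare(key, end) < 0;
  }

  public static void main(String[] args) {
    ClaimSketch tracker = new ClaimSketch(new byte[] {'a'}, new byte[] {'d'});
    System.out.println(tracker.tryClaim(new byte[] {'a'}));
    System.out.println(tracker.tryClaim(new byte[] {'c'}));
    System.out.println(tracker.tryClaim(new byte[] {'d'})); // at the end key, outside [a, d)
  }
}
```

Because every attempt is recorded, a failed claim at or past the end key still consumes the rest of the interval, which is why a later claim of a smaller key is rejected in this model.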
public void checkDone() throws java.lang.IllegalStateException
Checks whether the restriction has been fully processed.
Called by the SDK harness after DoFn.ProcessElement returns.
Must throw an exception with an informative error message if there is still any unclaimed work remaining in the restriction.
This method is required to be implemented in order to prevent data loss during SDK processing.
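The contract of checkDone can be sketched with a minimal tracker over integer offsets. The class `CheckDoneSketch` is a hypothetical name, and using offsets instead of byte keys is a simplification made for brevity. The sketch only shows the idea of failing loudly when part of the restriction was never attempted.

```java
// Hypothetical, simplified tracker over integer offsets [from, to);
// it illustrates the checkDone contract only and is not the Beam implementation.
final class CheckDoneSketch {
  private final long from;
  private final long to;      // exclusive
  private Long lastAttempted; // null until the first attempt

  CheckDoneSketch(long from, long to) {
    this.from = from;
    this.to = to;
  }

  /** Records the attempt and reports whether the offset lies inside the range. */
  boolean tryClaim(long offset) {
    lastAttempted = offset;
    return offset >= from && offset < to;
  }

  /** Throws if any part of the restriction was never attempted. */
  void checkDone() {
    if (from == to) {
      return; // an empty restriction has no work to claim
    }
    if (lastAttempted == null) {
      throw new IllegalStateException(
          "No offset was attempted in [" + from + ", " + to + ")");
    }
    if (lastAttempted < to - 1) {
      throw new IllegalStateException(
          "Last attempted offset was " + lastAttempted + "; work in ["
              + (lastAttempted + 1) + ", " + to + ") was not claimed");
    }
  }

  public static void main(String[] args) {
    CheckDoneSketch tracker = new CheckDoneSketch(0, 3);
    tracker.tryClaim(0);
    try {
      tracker.checkDone();
    } catch (IllegalStateException e) {
      System.out.println("not done: " + e.getMessage());
    }
    tracker.tryClaim(1);
    tracker.tryClaim(2);
    tracker.checkDone(); // passes: the last offset of the range was attempted
    System.out.println("done");
  }
}
```

The error message names the unclaimed interval, in line with the requirement for an informative message.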
public RestrictionTracker.IsBounded isBounded()
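Return the boundedness of the current restriction. If the current restriction represents a finite amount of work, it should return RestrictionTracker.IsBounded.BOUNDED. Otherwise, it should return RestrictionTracker.IsBounded.UNBOUNDED.
It is valid to return RestrictionTracker.IsBounded.BOUNDED after returning RestrictionTracker.IsBounded.UNBOUNDED once the end of a restriction is discovered. It is not valid to return RestrictionTracker.IsBounded.UNBOUNDED after returning RestrictionTracker.IsBounded.BOUNDED.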
This method is required to be implemented.
public java.lang.String toString()
public RestrictionTracker.Progress getProgress()
A representation for the amount of known completed and known remaining work.
It is up to each restriction tracker to convert between its natural representation of completed and remaining work and the double representation. For example, a tracker may report the number of messages or number of message bytes that have been processed and the number of messages or number of message bytes that are outstanding.
The work completed and work remaining must be of the same scale, whether that be number of messages or number of bytes, and should never represent two distinct unit types.
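The same-scale requirement can be shown with a small sketch in which both values are derived from message counts. `ProgressSketch` and its methods are hypothetical names that only mirror the idea of a completed/remaining pair of doubles. Treating a tracker with no work at all as fully complete is an assumption of this sketch.

```java
// Hypothetical value class, not the Beam API. It holds completed and remaining
// work as doubles that share one unit (here: number of messages).
final class ProgressSketch {
  final double workCompleted;
  final double workRemaining;

  private ProgressSketch(double workCompleted, double workRemaining) {
    this.workCompleted = workCompleted;
    this.workRemaining = workRemaining;
  }

  /** Both arguments are message counts, so the two values share one scale. */
  static ProgressSketch fromMessageCounts(long processed, long outstanding) {
    if (processed < 0 || outstanding < 0) {
      throw new IllegalArgumentException("counts must be non-negative");
    }
    return new ProgressSketch((double) processed, (double) outstanding);
  }

  /** Fraction of known work that is complete. */
  double fractionCompleted() {
    double total = workCompleted + workRemaining;
    // Assumption of this sketch: no known work counts as fully complete.
    return total == 0.0 ? 1.0 : workCompleted / total;
  }

  public static void main(String[] args) {
    ProgressSketch progress = ProgressSketch.fromMessageCounts(25, 75);
    System.out.println("completed = " + progress.workCompleted);
    System.out.println("remaining = " + progress.workRemaining);
    System.out.println("fraction  = " + progress.fractionCompleted());
  }
}
```

Mixing units, for example messages completed against bytes remaining, would make the derived fraction meaningless, which is why both values must use the same scale.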