public class ProcessNewPartitionsAction extends java.lang.Object
|Constructor and Description|
|Modifier and Type||Method and Description|
Process a single new partition.
public boolean processNewPartition(NewPartition newPartition, DoFn.OutputReceiver<PartitionRecord> receiver)
When a parent partition splits, it receives two or more new partitions. It will write a new row, with the new row ranges as row key, for each new partition. These new partitions can be immediately streamed.
The complicated scenario is merges. Two or more parent partitions will merge into one new partition. Each parent partition receives the same new partition (row range) but each parent partition will have a different continuation token. The parent partitions will all write to the same row key form by the new row range. Each parent will record its continuation token, and watermark. Parent partitions may not receive the message to stop at the same time. So when we try to process the new partition, we need to ensure that all the parent partitions have stopped and recorded their metadata table. We do so by verifying that the row ranges of the parents covers a contiguous block of row range that is same as the new row range.
For example, partition1, A-B, and partition2, B-C, merges into partition3, A-C.
Note that, the algorithm to verify if a merge is valid, also correctly verifies if a split is valid. A split is immediately valid as long as the row exists because there's only one parent that needs to write to that row.
newPartition- new partition to be processed
receiver- to output new partitions