ProcessNewPartitionsAction (Apache Beam 2.66.0)

java.lang.Object
- org.apache.beam.sdk.io.gcp.bigtable.changestreams.action.ProcessNewPartitionsAction

public class ProcessNewPartitionsAction
extends java.lang.Object

Constructor Summary

Constructors
Constructor and Description
`ProcessNewPartitionsAction(ChangeStreamMetrics metrics, MetadataTableDao metadataTableDao, Instant endTime)`

Method Summary

Modifier and Type	Method and Description
`boolean`	`processNewPartition(NewPartition newPartition, DoFn.OutputReceiver<PartitionRecord> receiver)` Process a single new partition.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - ProcessNewPartitionsAction
```
public ProcessNewPartitionsAction(ChangeStreamMetrics metrics,
                                  MetadataTableDao metadataTableDao,
                                  @Nullable
                                  Instant endTime)
```
- Method Detail
  - processNewPartition
```
public boolean processNewPartition(NewPartition newPartition,
                                   DoFn.OutputReceiver<PartitionRecord> receiver)
```
    Processes a single new partition. New partitions resulting from splits and merges must be output so that they can be streamed. Whether the new partition comes from a split or a merge, the same verification process ensures that it can actually be streamed.
    When a parent partition splits, it receives two or more new partitions. It writes a new row for each new partition, using the new row range as the row key. These new partitions can be streamed immediately.
    Merges are the more complicated scenario. Two or more parent partitions merge into one new partition. Each parent partition receives the same new partition (row range), but each has a different continuation token. The parent partitions all write to the same row key, formed from the new row range, and each parent records its continuation token and watermark. Parent partitions may not receive the message to stop at the same time, so when we try to process the new partition we need to ensure that all the parent partitions have stopped and recorded their metadata in the metadata table. We do so by verifying that the row ranges of the parents cover a contiguous block of row range that is the same as the new row range.
    For example, partition1 (A-B) and partition2 (B-C) merge into partition3 (A-C):
    1. p1 writes to row A-C in the metadata table.
    2. processNewPartition processes A-C and sees that only A-B has been recorded. A-B does not cover A-C, so it does nothing.
    3. p2 writes to row A-C in the metadata table.
    4. processNewPartition processes A-C again, sees that A-B and B-C have been recorded, and outputs the new partition A-C to be streamed.
    Note that the algorithm that verifies whether a merge is valid also correctly verifies whether a split is valid. A split is valid as soon as the row exists, because only one parent needs to write to that row.
    Parameters:
    
    newPartition - new partition to be processed
    
    receiver - to output new partitions
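
    The coverage check described above can be illustrated with a minimal, self-contained sketch. It is not Beam's implementation: the `Range` class, the `coversExactly` helper, and the use of plain `String` keys are assumptions made for illustration, and real Bigtable row ranges (byte strings, with empty keys meaning unbounded) need extra handling that is omitted here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; none of these names are Beam types.
class MergeCoverageSketch {

  /** Hypothetical half-open row range [start, end) over String keys. */
  static final class Range {
    final String start;
    final String end;

    Range(String start, String end) {
      this.start = start;
      this.end = end;
    }
  }

  /**
   * Returns true when the recorded ranges, ordered by start key, chain
   * together with no gap or overlap and span exactly the new partition.
   */
  static boolean coversExactly(List<Range> recorded, Range newPartition) {
    if (recorded.isEmpty()) {
      return false; // nothing has been written to the row yet
    }
    List<Range> sorted = new ArrayList<>(recorded);
    sorted.sort(Comparator.comparing((Range r) -> r.start));

    String cursor = newPartition.start;
    for (Range r : sorted) {
      if (!r.start.equals(cursor)) {
        return false; // gap or overlap before this range
      }
      cursor = r.end;
    }
    return cursor.equals(newPartition.end);
  }

  public static void main(String[] args) {
    Range ab = new Range("A", "B");
    Range bc = new Range("B", "C");
    Range ac = new Range("A", "C");

    // Step 2 of the example: only A-B is recorded, so A-C is not covered.
    System.out.println(coversExactly(List.of(ab), ac)); // prints "false"

    // Step 4 of the example: A-B and B-C are recorded, so A-C is covered.
    System.out.println(coversExactly(List.of(bc, ab), ac)); // prints "true"
  }
}
```

    In this sketch a single recorded range equal to the new range also passes the check, which mirrors the note that the same verification accepts a split as soon as its row exists.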
