@Internal public class PartitionReconciler extends java.lang.Object
Example of race condition:
To reconcile this, we identify partitions that haven't been streamed for at least 5 minutes. Such a gap is probably an indication that some CloseStream merge messages raced with one another.
Constructor and Description |
---|
PartitionReconciler(MetadataTableDao metadataTableDao, ChangeStreamMetrics metrics) |
Modifier and Type | Method and Description |
---|---|
void | addIncompleteNewPartitions(NewPartition newPartition) Capture NewPartition row that cannot merge on its own. |
void | addMissingPartitions(java.util.List<com.google.cloud.bigtable.data.v2.models.Range.ByteStringRange> missingPartitions) Capture partitions that are not currently being streamed. |
java.util.List<PartitionRecord> | getPartitionsToReconcile(Instant lowWatermark, Instant startTime) For missing partitions, try to organize the mismatched parent tokens in a way to fill the missing partitions. |
public PartitionReconciler(MetadataTableDao metadataTableDao, ChangeStreamMetrics metrics)
public void addMissingPartitions(java.util.List<com.google.cloud.bigtable.data.v2.models.Range.ByteStringRange> missingPartitions)
Combine existing missing partitions and current (newly added) missing partitions. If a partition has been missing for longer than the allotted time, it will be reconciled.
It is possible that a missing partition's boundary changes frequently, so that it takes a long time to realize a partition is truly missing. For example, if [C, D) is missing but there are a lot of splits and merges around [C, D), we may sometimes see that [B, D) is missing and at other times that [C, E) is missing, due to split and merge activity of [B, C) and [D, E), while [C, D) is the partition that is truly missing. The moving boundaries reset the timer, leading to slower reconciliation of the missing partition.
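The timer-reset behavior described above can be illustrated with a minimal sketch. This is a simplified model rather than the Beam implementation: the class `MissingPartitionTimers`, the method `isReadyToReconcile`, and the use of strings such as `"C-D"` to stand for the range [C, D) are all hypothetical, and the current time is passed in explicitly so the behavior is easy to follow.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of missing-partition timers; not the Beam implementation. */
final class MissingPartitionTimers {
  // Key: a range written as "start-end", e.g. "C-D" for [C, D).
  // Value: the time the range was first reported missing.
  private Map<String, Instant> firstSeenMissing = new HashMap<>();

  /** Combine the current report with earlier ones. */
  void addMissingPartitions(List<String> currentlyMissing, Instant now) {
    Map<String, Instant> updated = new HashMap<>();
    for (String range : currentlyMissing) {
      // A range that was already missing keeps its original timestamp;
      // a newly reported range starts its timer at `now`.
      updated.put(range, firstSeenMissing.getOrDefault(range, now));
    }
    // Ranges absent from the current report are dropped, so a boundary
    // that moves (e.g. "C-D" reported as "B-D") resets the timer.
    firstSeenMissing = updated;
  }

  /** True if the range has been missing for at least `threshold`. */
  boolean isReadyToReconcile(String range, Instant now, Duration threshold) {
    Instant since = firstSeenMissing.get(range);
    return since != null && Duration.between(since, now).compareTo(threshold) >= 0;
  }
}
```

Under this model, if `"C-D"` is reported at minute 0, `"B-D"` at minute 2, and `"C-D"` again at minute 4, the timer for `"C-D"` starts over at minute 4, so with a 5-minute threshold it only becomes eligible for reconciliation at minute 9.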
missingPartitions
- partitions not being streamed.
public void addIncompleteNewPartitions(NewPartition newPartition)
newPartition
- new partition waiting to be created.
public java.util.List<PartitionRecord> getPartitionsToReconcile(Instant lowWatermark, Instant startTime)
If there are parent tokens that, when combined, form a missing partition, they can be output as a merge of the missing partition.
If there are no parent tokens for a missing partition, it will need to be reconciled with an adjusted low watermark. This is a catch-all solution. We don't expect to ever get into this situation. Missing partitions should all be mismatched merges that can be reconciled by organizing them correctly.
lowWatermark
- watermark that all reconciled partitions should have
startTime
- used to help compute the optimal reconcile point
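The idea of combining parent tokens to fill a missing partition, as described for getPartitionsToReconcile, can be sketched as a search for parent ranges that chain end to start across the missing range. This is an illustrative sketch only, not the Beam implementation: `ParentRangeCover`, `KeyRange`, and `cover` are hypothetical names, keys are plain non-empty strings instead of `ByteStringRange` values, and the search is a simple greedy walk that takes the first parent starting at the current position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of covering a missing range with parent ranges. */
final class ParentRangeCover {
  /** Half-open key range [start, end); a stand-in for ByteStringRange. */
  static final class KeyRange {
    final String start;
    final String end;

    KeyRange(String start, String end) {
      this.start = start;
      this.end = end;
    }
  }

  /**
   * Returns parents that chain end-to-start from missing.start to
   * missing.end, or empty if the parents leave a gap.
   */
  static Optional<List<KeyRange>> cover(KeyRange missing, List<KeyRange> parents) {
    List<KeyRange> chain = new ArrayList<>();
    String cursor = missing.start;
    while (!cursor.equals(missing.end)) {
      KeyRange next = null;
      for (KeyRange p : parents) {
        boolean startsAtCursor = p.start.equals(cursor);
        boolean advances = p.end.compareTo(cursor) > 0;
        boolean staysInside = p.end.compareTo(missing.end) <= 0;
        if (startsAtCursor && advances && staysInside) {
          next = p; // greedy: take the first parent that fits
          break;
        }
      }
      if (next == null) {
        // No parent continues the chain; this is the catch-all case.
        return Optional.empty();
      }
      chain.add(next);
      cursor = next.end;
    }
    return Optional.of(chain);
  }
}
```

With a missing range [A, D) and parents [C, D), [A, B), [B, C), the sketch returns the three parents in key order. With only [A, B) and [C, D) it returns empty, which corresponds to the case where the partition would have to be reconciled with an adjusted low watermark.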