Class RowMutationInformation

java.lang.Object
org.apache.beam.sdk.io.gcp.bigquery.RowMutationInformation

public abstract class RowMutationInformation extends Object
This class indicates how to apply a row update to BigQuery. A sequence number must always be supplied to order the updates. Incorrect sequence numbers will result in unexpected state in the BigQuery table.
  • Constructor Details

    • RowMutationInformation

      public RowMutationInformation()
  • Method Details

    • getMutationType

      public abstract RowMutationInformation.MutationType getMutationType()
    • getSequenceNumber

      @Deprecated public abstract @Nullable Long getSequenceNumber()
      Deprecated.
      The sequence number used to drive the order of applied row mutations. Deprecated: use getChangeSequenceNumber() instead, because the BigQuery API supports a string value for this purpose.
    • getChangeSequenceNumber

      public abstract String getChangeSequenceNumber()
      The value supplied to the BigQuery _CHANGE_SEQUENCE_NUMBER pseudo-column. See of(MutationType, String) for more details.
    • of

      @Deprecated public static RowMutationInformation of(RowMutationInformation.MutationType mutationType, long sequenceNumber)
      Deprecated.
      Instantiate RowMutationInformation with RowMutationInformation.MutationType and the sequenceNumber. Deprecated: this method instantiates RowMutationInformation via of(MutationType, String), forwarding the sequenceNumber value converted with Long.toHexString(long). sequenceNumber values < 0 will throw an error.
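
      The forwarding described above can be pictured roughly as follows. This is a minimal sketch, not Beam's actual implementation: the class name LegacySequenceNumber, the method name toChangeSequenceNumber, and the choice of IllegalArgumentException are all assumptions made for illustration. The source only states that negative values throw an error and that Long.toHexString(long) performs the conversion.

```java
final class LegacySequenceNumber {
    // Hypothetical helper: converts a legacy long sequence number into the
    // hexadecimal string form expected for the change sequence number.
    static String toChangeSequenceNumber(long sequenceNumber) {
        if (sequenceNumber < 0) {
            // Assumed exception type; the documentation only says an error is thrown.
            throw new IllegalArgumentException(
                "sequenceNumber must be >= 0, got " + sequenceNumber);
        }
        return Long.toHexString(sequenceNumber);
    }

    public static void main(String[] args) {
        System.out.println(toChangeSequenceNumber(123L)); // prints "7b"
    }
}
```

      Rejecting negative input matters because Long.toHexString(-1L) yields "ffffffffffffffff", which would sort as the largest possible value rather than as a small one.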
    • of

      public static RowMutationInformation of(RowMutationInformation.MutationType mutationType, String changeSequenceNumber)
      Instantiate RowMutationInformation with RowMutationInformation.MutationType and the changeSequenceNumber. The changeSequenceNumber sets the BigQuery API _CHANGE_SEQUENCE_NUMBER pseudo-column, which enables custom, user-supplied ordering of row mutations.

      Requirements for the changeSequenceNumber:

      • a fixed-format String of hexadecimal characters
      • do not use hexadecimals encoded from negative numbers
      • the string is separated into sections by a forward slash: /
      • up to four sections are allowed
      • each section is limited to 16 hexadecimal characters: 0-9, A-F, or a-f
      • the supported range is 0/0/0/0 through FFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFF
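
      The format requirements above can be checked before a value is passed to of(MutationType, String). The following is a minimal sketch under stated assumptions: the class ChangeSequenceNumberCheck and its isValid method are hypothetical and not part of Beam, and the sketch assumes every section must contain at least one hexadecimal character.

```java
import java.util.regex.Pattern;

final class ChangeSequenceNumberCheck {
    // One to four sections of 1 to 16 hexadecimal characters, separated by '/'.
    // Requiring at least one character per section is an assumption.
    private static final Pattern FORMAT =
        Pattern.compile("[0-9A-Fa-f]{1,16}(/[0-9A-Fa-f]{1,16}){0,3}");

    // Hypothetical helper: returns true when the value matches the documented format.
    static boolean isValid(String value) {
        return value != null && FORMAT.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("FFF/ABC"));           // two short sections
        System.out.println(isValid("0/0/0/0/0"));         // five sections
        System.out.println(isValid("12345678901234567")); // 17 characters in one section
    }
}
```

      A check like this cannot tell whether a value was encoded from a negative number, so that requirement still has to be enforced where the value is produced.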

      Below are some changeSequenceNumber scenarios:

      Record #1        Record #2    BigQuery API compares as
      ---------------  -----------  ------------------------
      "B"              "ABC"        Record #2 is considered the latest record: "ABC" > "B" (i.e. 2748 > 11)
      "FFF/B"          "FFF/ABC"    Record #2 is considered the latest record: "FFF/ABC" > "FFF/B" (i.e. 4095/2748 > 4095/11)
      "BA/FFFFFFFF"    "ABC"        Record #2 is considered the latest record: "ABC" > "BA/FFFFFFFF" (i.e. 2748 > 186/4294967295)
      "FFF/ABC"        "ABC"        Record #1 is considered the latest record: "FFF/ABC" > "ABC" (i.e. 4095/2748 > 2748)
      "FFF"            "FFF"        The change sequence numbers are identical; BigQuery falls back to system ingestion time, so the more recently ingested record takes precedence.
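
      The scenarios above are consistent with a section-by-section numeric comparison, read left to right. The sketch below reproduces that ordering for illustration only; it is not BigQuery's implementation. The class ChangeSequenceOrder and its compare method are hypothetical, and treating a missing trailing section as 0 is an assumption that the scenarios above neither confirm nor rule out.

```java
final class ChangeSequenceOrder {
    // Hypothetical helper: returns a negative value if a orders before b,
    // a positive value if a orders after b, and 0 if they are equal.
    static int compare(String a, String b) {
        String[] left = a.split("/");
        String[] right = b.split("/");
        int sections = Math.max(left.length, right.length);
        for (int i = 0; i < sections; i++) {
            // Assumption: a missing section is treated as 0.
            long l = i < left.length ? Long.parseUnsignedLong(left[i], 16) : 0L;
            long r = i < right.length ? Long.parseUnsignedLong(right[i], 16) : 0L;
            // Unsigned comparison so that FFFFFFFFFFFFFFFF sorts as the maximum.
            int c = Long.compareUnsigned(l, r);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(compare("B", "ABC") < 0);           // Record #2 is latest
        System.out.println(compare("FFF/ABC", "ABC") > 0);     // Record #1 is latest
        System.out.println(compare("FFF", "FFF") == 0);        // tie
    }
}
```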

      Below are some code examples.

      • RowMutationInformation.of(UPSERT, "FFF/ABC")
      • Using Apache Commons Hex.encodeHexString(byte[]) (Java 17+ users can use HexFormat): RowMutationInformation.of(UPSERT, Hex.encodeHexString("2024-04-30 11:19:44 UTC".getBytes(StandardCharsets.UTF_8)))
      • Using Long.toHexString(long): RowMutationInformation.of(DELETE, Long.toHexString(123L))
      See https://cloud.google.com/bigquery/docs/change-data-capture#manage_custom_ordering for more details.
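
      As a further illustration, a multi-section value can be assembled from non-negative long values with Long.toHexString(long), which never produces more than 16 hexadecimal characters. The sketch below is an assumption-laden example, not a Beam utility: the class SequenceNumberBuilder, the method fromMillisAndCounter, and the idea of pairing an epoch-millisecond timestamp with a tie-breaking counter are all hypothetical.

```java
final class SequenceNumberBuilder {
    // Hypothetical helper: builds a two-section change sequence number of the
    // form <epochMillis in hex>/<counter in hex>.
    static String fromMillisAndCounter(long epochMillis, long counter) {
        if (epochMillis < 0 || counter < 0) {
            // Hexadecimals encoded from negative numbers must not be used.
            throw new IllegalArgumentException("values must be >= 0");
        }
        return Long.toHexString(epochMillis) + "/" + Long.toHexString(counter);
    }

    public static void main(String[] args) {
        // The counter breaks ties between mutations that share a timestamp.
        System.out.println(fromMillisAndCounter(255L, 10L)); // prints "ff/a"
    }
}
```

      The resulting string could then be passed as the changeSequenceNumber argument, for example RowMutationInformation.of(UPSERT, SequenceNumberBuilder.fromMillisAndCounter(millis, counter)), where millis and counter are supplied by the caller.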