Class IcebergIO.WriteRows

All Implemented Interfaces:
Serializable, HasDisplayData
Enclosing class:
IcebergIO

public abstract static class IcebergIO.WriteRows extends PTransform<PCollection<Row>,IcebergWriteResult>
  • Constructor Details

    • WriteRows

      public WriteRows()
  • Method Details

    • to

      public IcebergIO.WriteRows to(org.apache.iceberg.catalog.TableIdentifier identifier)
    • to

      public IcebergIO.WriteRows to(DynamicDestinations destinations)
    • withTriggeringFrequency

      public IcebergIO.WriteRows withTriggeringFrequency(Duration triggeringFrequency)
      Sets the frequency at which data is written to files and a new Snapshot is produced.

      Roughly every triggeringFrequency duration, records are written to data files and appended to the respective table. Each append operation creates a new table snapshot.

      Generally speaking, increasing this duration will result in fewer, larger data files and fewer snapshots.

      This is only applicable when writing an unbounded PCollection (i.e. a streaming pipeline).
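
As a rough illustration of the trade-off described above, the sketch below estimates how many snapshots a single table would accumulate per day for a given triggering frequency. It is plain arithmetic, not part of IcebergIO: the class and method names are hypothetical, it assumes data arrives in every interval and that each interval yields exactly one append, and it uses java.time.Duration only for the day-length constant.

```java
import java.time.Duration;

public class SnapshotEstimate {
    /**
     * Approximate appends (and therefore snapshots) per table per day,
     * assuming one append per triggering interval and data in every interval.
     */
    static long snapshotsPerDay(long triggeringFrequencySeconds) {
        if (triggeringFrequencySeconds <= 0) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        return Duration.ofDays(1).getSeconds() / triggeringFrequencySeconds;
    }

    public static void main(String[] args) {
        // Compare a short, a medium, and a long triggering frequency.
        for (long seconds : new long[] {30, 300, 3600}) {
            System.out.println(seconds + "s -> ~" + snapshotsPerDay(seconds) + " snapshots/day");
        }
    }
}
```

Under these assumptions, moving from a 30-second to a 5-minute frequency cuts the estimate from about 2880 to about 288 snapshots per day, which is the "fewer snapshots" effect the description refers to.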

    • withDirectWriteByteLimit

      public IcebergIO.WriteRows withDirectWriteByteLimit(Integer directWriteByteLimit)
    • withDistributionMode

      public IcebergIO.WriteRows withDistributionMode(org.apache.iceberg.DistributionMode mode)
      Defines distribution of write data. Supported distributions:
      1. DistributionMode.NONE: don't shuffle rows (default)
      2. DistributionMode.HASH: shuffle rows by partition key before writing data
      DistributionMode.RANGE is not supported yet.
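
The difference between the two supported modes can be pictured with a small, self-contained model. This is a simplified sketch and not IcebergIO's implementation: the class name, the round-robin arrival pattern, and the rule that each worker writes one data file per partition it sees are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class DistributionSketch {
    /**
     * Counts distinct (worker, partition) pairs, which stand in for data files
     * in this model. Without shuffling, rows stay on the worker they arrived at
     * (round-robin here); with hashing, rows are routed by partition key.
     */
    static int countFiles(String[] partitionKeys, int workers, boolean hashByPartition) {
        Set<String> files = new HashSet<>();
        for (int i = 0; i < partitionKeys.length; i++) {
            String key = partitionKeys[i];
            int worker = hashByPartition
                ? Math.floorMod(key.hashCode(), workers) // same key always lands on the same worker
                : i % workers;                           // arrival order decides the worker
            files.add(worker + "/" + key);
        }
        return files.size();
    }

    public static void main(String[] args) {
        String[] keys = {"2024-01-01", "2024-01-02", "2024-01-01",
                         "2024-01-02", "2024-01-01", "2024-01-02"};
        System.out.println("no shuffle: " + countFiles(keys, 3, false) + " files");
        System.out.println("hash:       " + countFiles(keys, 3, true) + " files");
    }
}
```

In this model, hashing by partition key means each partition is written by exactly one worker, so the file count equals the number of distinct partitions, at the cost of a shuffle before the write.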
    • withAutosharding

      public IcebergIO.WriteRows withAutosharding()
    • expand

      public IcebergWriteResult expand(PCollection<Row> input)
      Description copied from class: PTransform
      Override this method to specify how this PTransform should be expanded on the given InputT.

      NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.

      Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).

      Specified by:
      expand in class PTransform<PCollection<Row>,IcebergWriteResult>