Class AddFiles

All Implemented Interfaces:
Serializable, HasDisplayData

public class AddFiles extends PTransform<PCollection<String>,PCollectionRowTuple>
A transform that takes in a stream of file paths, converts them to Iceberg DataFiles with partition metadata and metrics, then commits them to an Iceberg Table.
  • Constructor Details

  • Method Details

    • expand

      public PCollectionRowTuple expand(PCollection<String> input)
      Description copied from class: PTransform
      Override this method to specify how this PTransform should be expanded on the given InputT.

      NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.

      Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).

      Specified by:
      expand in class PTransform<PCollection<String>,PCollectionRowTuple>
    • getFileMetrics

      public static org.apache.iceberg.Metrics getFileMetrics(org.apache.iceberg.io.InputFile file, org.apache.iceberg.FileFormat format, org.apache.iceberg.MetricsConfig config, org.apache.iceberg.mapping.NameMapping mapping) throws IOException
      Throws:
      IOException
    • inferFormat

      public static org.apache.iceberg.FileFormat inferFormat(String path)
      Tries to infer the file format from the given path. Defaults to Parquet if no other format is recognized.
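
The fallback behavior described above can be illustrated with a minimal, dependency-free sketch. It assumes the format is recognized from the file extension, which this page does not confirm; the actual logic in AddFiles may differ. The class FormatInferenceSketch, the enum SketchFormat (a hypothetical stand-in for org.apache.iceberg.FileFormat), and the specific extensions checked are all illustrative assumptions.

```java
import java.util.Locale;

public class FormatInferenceSketch {

    /** Hypothetical stand-in for org.apache.iceberg.FileFormat; not the real enum. */
    public enum SketchFormat { PARQUET, ORC, AVRO }

    /**
     * Assumed behavior: match on the (case-insensitive) file extension and
     * fall back to PARQUET when nothing else is recognized.
     */
    public static SketchFormat inferFormat(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".orc")) {
            return SketchFormat.ORC;
        }
        if (lower.endsWith(".avro")) {
            return SketchFormat.AVRO;
        }
        // No recognized extension: default to Parquet, as the description states.
        return SketchFormat.PARQUET;
    }

    public static void main(String[] args) {
        System.out.println(inferFormat("gs://bucket/data/part-0001.orc"));   // prints ORC
        System.out.println(inferFormat("gs://bucket/data/part-0002.AVRO"));  // prints AVRO
        System.out.println(inferFormat("gs://bucket/data/part-0003"));       // prints PARQUET
    }
}
```

With a default in place, a path without a recognizable extension is never rejected at this step. If such a file is not actually Parquet, the mismatch would presumably surface later, for example when the file is read to compute metrics.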