Class SnowflakeIO.Write<T>

java.lang.Object
org.apache.beam.sdk.transforms.PTransform<PCollection<T>,PDone>
org.apache.beam.sdk.io.snowflake.SnowflakeIO.Write<T>
All Implemented Interfaces:
Serializable, HasDisplayData
Enclosing class:
SnowflakeIO

public abstract static class SnowflakeIO.Write<T> extends PTransform<PCollection<T>,PDone>
Implementation of SnowflakeIO.write().
See Also:
  • Constructor Details

    • Write

      public Write()
  • Method Details

    • withDataSourceConfiguration

      public SnowflakeIO.Write<T> withDataSourceConfiguration(SnowflakeIO.DataSourceConfiguration config)
      Setting information about Snowflake server.
      Parameters:
      config - An instance of SnowflakeIO.DataSourceConfiguration.
    • withDataSourceProviderFn

      public SnowflakeIO.Write<T> withDataSourceProviderFn(SerializableFunction<Void,DataSource> dataSourceProviderFn)
      Setting function that will provide SnowflakeIO.DataSourceConfiguration in runtime.
      Parameters:
      dataSourceProviderFn - a SerializableFunction.
    • to

      public SnowflakeIO.Write<T> to(String table)
      A table name to be written in Snowflake.
      Parameters:
      table - String with the name of the table.
    • to

      public SnowflakeIO.Write<T> to(ValueProvider<String> table)
    • withStagingBucketName

      public SnowflakeIO.Write<T> withStagingBucketName(String stagingBucketName)
      Name of the cloud bucket (GCS by now) to use as tmp location of CSVs during COPY statement.
      Parameters:
      stagingBucketName - String with the name of the bucket.
    • withStagingBucketName

      public SnowflakeIO.Write<T> withStagingBucketName(ValueProvider<String> stagingBucketName)
    • withStorageIntegrationName

      public SnowflakeIO.Write<T> withStorageIntegrationName(String integrationName)
      Name of the Storage Integration in Snowflake to be used. See https://docs.snowflake.com/en/sql-reference/sql/create-storage-integration.html for reference.
      Parameters:
      integrationName - String with the name of the Storage Integration.
    • withStorageIntegrationName

      public SnowflakeIO.Write<T> withStorageIntegrationName(ValueProvider<String> integrationName)
    • withQueryTransformation

      public SnowflakeIO.Write<T> withQueryTransformation(String query)
      A query to be executed in Snowflake.
      Parameters:
      query - String with query.
    • withQueryTransformation

      public SnowflakeIO.Write<T> withQueryTransformation(ValueProvider<String> query)
    • withFileNameTemplate

      public SnowflakeIO.Write<T> withFileNameTemplate(String fileNameTemplate)
      A template name for files saved to GCP.
      Parameters:
      fileNameTemplate - String with template name for files.
    • withUserDataMapper

      public SnowflakeIO.Write<T> withUserDataMapper(SnowflakeIO.UserDataMapper<T> userDataMapper)
      User-defined function mapping user data into CSV lines.
      Parameters:
      userDataMapper - an instance of SnowflakeIO.UserDataMapper.
    • withFlushTimeLimit

      public SnowflakeIO.Write<T> withFlushTimeLimit(Duration triggeringFrequency)
      Sets duration how often staged files will be created and then how often ingested by Snowflake during streaming.
      Parameters:
      triggeringFrequency - time for triggering frequency in Duration type.
      Returns:
    • withSnowPipe

      public SnowflakeIO.Write<T> withSnowPipe(String snowPipe)
      Sets name of SnowPipe which can be created in Snowflake dashboard or cli:
      
       CREATE snowPipeName AS COPY INTO your_table from @yourstage;
       

      The stage in COPY statement should be pointing to the cloud integration with the valid bucket url, ex. for GCS:

      
       CREATE STAGE yourstage
       URL = 'gcs://yourbucket/path/'
       STORAGE_INTEGRATION = your_integration;
       
      
       CREATE STORAGE INTEGRATION your_integration
         TYPE = EXTERNAL_STAGE
         STORAGE_PROVIDER = GCS
         ENABLED = TRUE
         STORAGE_ALLOWED_LOCATIONS = ('gcs://yourbucket/path/')
       
      Parameters:
      snowPipe - name of created SnowPipe in Snowflake dashboard.
      Returns:
    • withSnowPipe

      public SnowflakeIO.Write<T> withSnowPipe(ValueProvider<String> snowPipe)
      Same as withSnowPipe(String, but with a ValueProvider.
      Parameters:
      snowPipe - name of created SnowPipe in Snowflake dashboard.
      Returns:
    • withShardsNumber

      public SnowflakeIO.Write<T> withShardsNumber(Integer shardsNumber)
      Number of shards that are created per window.
      Parameters:
      shardsNumber - defined number of shards or 1 by default.
      Returns:
    • withFlushRowLimit

      public SnowflakeIO.Write<T> withFlushRowLimit(Integer rowsCount)
      Sets number of row limit that will be saved to the staged file and then loaded to Snowflake. If the number of rows will be lower than the limit it will be loaded with current number of rows after certain time specified by setting withFlushTimeLimit(Duration triggeringFrequency)
      Parameters:
      rowsCount - Number of rows that will be in one file staged for loading. Default: 10000.
      Returns:
    • withWriteDisposition

      public SnowflakeIO.Write<T> withWriteDisposition(WriteDisposition writeDisposition)
      A disposition to be used during writing to table phase.
      Parameters:
      writeDisposition - an instance of WriteDisposition.
    • withCreateDisposition

      public SnowflakeIO.Write<T> withCreateDisposition(CreateDisposition createDisposition)
      A disposition to be used during table preparation.
      Parameters:
      createDisposition - - an instance of CreateDisposition.
    • withTableSchema

      public SnowflakeIO.Write<T> withTableSchema(SnowflakeTableSchema tableSchema)
      Table schema to be used during creating table.
      Parameters:
      tableSchema - - an instance of SnowflakeTableSchema.
    • withSnowflakeServices

      public SnowflakeIO.Write<T> withSnowflakeServices(SnowflakeServices snowflakeServices)
      A snowflake service SnowflakeServices implementation which is supposed to be used.
      Parameters:
      snowflakeServices - an instance of SnowflakeServices.
    • withQuotationMark

      public SnowflakeIO.Write<T> withQuotationMark(String quotationMark)
      Sets Snowflake-specific quotations around strings.
      Parameters:
      quotationMark - with possible single quote ', double quote " or nothing. Default value is single quotation '.
      Returns:
    • withDebugMode

      public SnowflakeIO.Write<T> withDebugMode(StreamingLogLevel debugLevel)
      The option to verbose info (or only errors) of loaded files while streaming. It is not set by default because it may influence performance. For details: insert report REST API.
      Parameters:
      debugLevel - error or info debug level from enum StreamingLogLevel
      Returns:
    • expand

      public PDone expand(PCollection<T> input)
      Description copied from class: PTransform
      Override this method to specify how this PTransform should be expanded on the given InputT.

      NOTE: This method should not be called directly. Instead apply the PTransform should be applied to the InputT using the apply method.

      Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).

      Specified by:
      expand in class PTransform<PCollection<T>,PDone>
    • streamToTable

      protected ParDo.SingleOutput<List<String>,Void> streamToTable(SnowflakeServices snowflakeServices, ValueProvider<String> stagingBucketDir)