Class DataGeneratorTableProvider

java.lang.Object
org.apache.beam.sdk.extensions.sql.meta.provider.InMemoryMetaTableProvider
org.apache.beam.sdk.extensions.sql.meta.provider.datagen.DataGeneratorTableProvider
All Implemented Interfaces:
TableProvider

@AutoService(TableProvider.class)
public class DataGeneratorTableProvider extends InMemoryMetaTableProvider
The service entry point for the 'datagen' table type.

This provider creates SQL-configurable test data sources. Tables of this type are defined with a CREATE EXTERNAL TABLE statement that specifies TYPE 'datagen'.

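The provider supports generating both bounded data (for batch pipelines), using the "number-of-rows" property, and unbounded data (for streaming pipelines), using the "rows-per-second" property. For example, the following statement defines an unbounded table that generates 100 rows per second: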


 CREATE EXTERNAL TABLE user_clicks (
   event_id BIGINT,
   user_id VARCHAR,
   click_timestamp TIMESTAMP,
   score DOUBLE
 )
 TYPE 'datagen'
 TBLPROPERTIES '{
   "rows-per-second": "100",

   "fields.event_id.kind": "sequence",
   "fields.event_id.start": "1",
   "fields.event_id.end": "1000000",

   "fields.user_id.kind": "random",
   "fields.user_id.length": "12",

   "fields.click_timestamp.kind": "random",
   "fields.click_timestamp.max-past": "60000",

   "fields.score.kind": "random",
   "fields.score.min": "0.0",
   "fields.score.max": "1.0",
   "fields.score.null-rate": "0.1"
 }'
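
A bounded counterpart of the table above might look like the following sketch. It swaps "rows-per-second" for the "number-of-rows" property mentioned earlier. The table name user_clicks_batch and the row count of 1000 are illustrative, and the sketch assumes that the field-level properties behave the same way for bounded tables as they do for unbounded ones.

```sql
-- Hypothetical bounded table: the name and all values are illustrative.
CREATE EXTERNAL TABLE user_clicks_batch (
  event_id BIGINT,
  score DOUBLE
)
TYPE 'datagen'
TBLPROPERTIES '{
  "number-of-rows": "1000",

  "fields.event_id.kind": "sequence",
  "fields.event_id.start": "1",
  "fields.event_id.end": "1000",

  "fields.score.kind": "random",
  "fields.score.min": "0.0",
  "fields.score.max": "1.0"
}'
```

A table declared this way is intended for batch pipelines, so a query over it should read a fixed number of rows rather than a continuous stream.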
 
  • Constructor Details

    • DataGeneratorTableProvider

      public DataGeneratorTableProvider()
  • Method Details

    • getTableType

      public String getTableType()
      Description copied from interface: TableProvider
      Gets the table type this provider handles.
    • buildBeamSqlTable

      public BeamSqlTable buildBeamSqlTable(Table table)
      Instantiates the DataGeneratorTable when a CREATE EXTERNAL TABLE statement with TYPE 'datagen' is executed.