java.lang.Object

org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider

All Implemented Interfaces: SchemaIOProvider

@Internal @AutoService(SchemaIOProvider.class) public class BigQuerySchemaIOProvider extends Object implements SchemaIOProvider

An implementation of SchemaIOProvider for reading and writing to BigQuery with BigQueryIO. For a description of configuration options and other defaults, see configurationSchema().

Constructor Summary

| Constructor | Description |
| --- | --- |
| BigQuerySchemaIOProvider() | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| Schema | configurationSchema() | Returns the expected schema of the configuration object. |
| org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider.BigQuerySchemaIO | from(String location, Row configuration, @Nullable Schema dataSchema) | Produces a SchemaIO given a String representing the data's location, the schema of the data that resides there, and some IO-specific configuration object. |
| String | identifier() | Returns an id that uniquely represents this IO. |
| PCollection.IsBounded | isBounded() | Indicates whether the PCollections produced by this transform will contain a bounded or unbounded number of elements. |
| boolean | requiresDataSchema() | Indicates whether this transform requires a specified data schema. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- BigQuerySchemaIOProvider
  
  public BigQuerySchemaIOProvider()

Method Details
- identifier
  
  public String identifier()
  
  Returns an id that uniquely represents this IO.
  
  Specified by:
  
  identifier in interface SchemaIOProvider
- configurationSchema
  
  public Schema configurationSchema()
  
  Returns the expected schema of the configuration object. Note that this is distinct from the schema of the data source itself. The fields are as follows:
  
  - table: Nullable String - Used for reads and writes. Specifies the table to read from or write to, in the format described in BigQueryHelpers.parseTableSpec(java.lang.String). Used as an input to BigQueryIO.TypedRead.from(String) or BigQueryIO.Write.to(String).
  - query: Nullable String - Used for reads. Specifies a query to read results from, using the BigQuery Standard SQL dialect. Used as an input to BigQueryIO.TypedRead.fromQuery(String).
  - queryLocation: Nullable String - Used for reads. Specifies the BigQuery geographic location where the query job will be executed. Used as an input to BigQueryIO.TypedRead.withQueryLocation(String).
  - createDisposition: Nullable String - Used for writes. Specifies whether a table should be created if it does not exist. Valid inputs are "Never" and "IfNeeded", corresponding to values of BigQueryIO.Write.CreateDisposition. Used as an input to BigQueryIO.Write.withCreateDisposition(BigQueryIO.Write.CreateDisposition).
  
  Relevant default values for these transforms that are not configurable fields are as follows:
  
  - ReadMethod - The input to BigQueryIO.TypedRead.withMethod(BigQueryIO.TypedRead.Method). Defaults to EXPORT, since that is the only method that currently offers Beam Schema support.
  - WriteMethod - The input to BigQueryIO.Write.withMethod(BigQueryIO.Write.Method). Currently defaults to STORAGE_WRITE_API.
  
  Specified by:
  
  configurationSchema in interface SchemaIOProvider
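
The field list above can be illustrated with a small, dependency-free sketch. It does not use the Beam API: `BigQueryConfigSketch`, its `config` helper, and the plain `Map` standing in for the configuration `Row` are all hypothetical names introduced for illustration. The nested enum only mirrors the constant names of BigQueryIO.Write.CreateDisposition, and the exact-match mapping from "Never" and "IfNeeded" is an assumption about how the strings might be resolved, not the provider's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the configuration Row; not part of Beam. */
final class BigQueryConfigSketch {

  /** Mirrors the constant names of BigQueryIO.Write.CreateDisposition. */
  enum CreateDisposition { CREATE_NEVER, CREATE_IF_NEEDED }

  /** The four field names listed in the configuration schema. */
  static final String[] FIELDS = {"table", "query", "queryLocation", "createDisposition"};

  /** Builds a config map; every field that is not supplied stays null (all fields are nullable). */
  static Map<String, String> config(String... keyValues) {
    if (keyValues.length % 2 != 0) {
      throw new IllegalArgumentException("Expected alternating field names and values");
    }
    Map<String, String> config = new LinkedHashMap<>();
    for (String field : FIELDS) {
      config.put(field, null);
    }
    for (int i = 0; i < keyValues.length; i += 2) {
      if (!config.containsKey(keyValues[i])) {
        throw new IllegalArgumentException("Unknown field: " + keyValues[i]);
      }
      config.put(keyValues[i], keyValues[i + 1]);
    }
    return config;
  }

  /** Assumed mapping of the documented string inputs; returns null when the field is unset. */
  static CreateDisposition parseCreateDisposition(String value) {
    if (value == null) {
      return null;
    }
    switch (value) {
      case "Never":
        return CreateDisposition.CREATE_NEVER;
      case "IfNeeded":
        return CreateDisposition.CREATE_IF_NEEDED;
      default:
        throw new IllegalArgumentException("Unsupported createDisposition: " + value);
    }
  }

  public static void main(String[] args) {
    // Placeholder table spec in the project:dataset.table form.
    Map<String, String> writeConfig =
        config("table", "my-project:my_dataset.my_table", "createDisposition", "IfNeeded");
    System.out.println(writeConfig);
    System.out.println(parseCreateDisposition(writeConfig.get("createDisposition")));
  }
}
```

In real pipeline code the configuration would be a Beam `Row` conforming to `configurationSchema()` rather than a map; the sketch only shows which fields exist and that each one may be left unset.
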
- from
  
  public org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider.BigQuerySchemaIO from(String location, Row configuration, @Nullable Schema dataSchema)
  
  Produces a SchemaIO given a String representing the data's location, the schema of the data that resides there, and some IO-specific configuration object.
  
  For BigQueryIO, only the configuration object is used; the location and data schema arguments have no effect. The table and dataset are specified through the appropriate fields in the configuration object, and the data schema is generated automatically from either the input PCollection (for writes) or the schema of the BigQuery table (for reads).
  
  Specified by:
  
  from in interface SchemaIOProvider
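
The behaviour described for `from` can be modelled with a short sketch that ignores its location and data-schema arguments and reads everything from the configuration. This is a plain-Java illustration, not Beam code: `FromSketch` and `describeRead` are hypothetical names, the returned string merely names the BigQueryIO.TypedRead methods listed for each field, and both the precedence of `table` over `query` and the error raised when neither is set are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of how from(location, configuration, dataSchema) treats its arguments. */
final class FromSketch {

  /** Describes the read that the configuration selects; location and dataSchema are unused. */
  static String describeRead(String location, Map<String, String> configuration, Object dataSchema) {
    String table = configuration.get("table");
    String query = configuration.get("query");
    if (table != null) {
      // Assumption: a table, when present, takes precedence over a query.
      return "from(" + table + ")";
    }
    if (query != null) {
      String queryLocation = configuration.get("queryLocation");
      String read = "fromQuery(" + query + ")";
      return queryLocation == null ? read : read + ".withQueryLocation(" + queryLocation + ")";
    }
    // Assumption: a read needs at least one of the two sources.
    throw new IllegalArgumentException("Either a table or a query must be specified");
  }

  public static void main(String[] args) {
    Map<String, String> configuration = new HashMap<>();
    configuration.put("query", "SELECT 1");
    configuration.put("queryLocation", "EU");
    // Different location and dataSchema arguments produce the same description.
    System.out.println(describeRead("anything", configuration, null));
    System.out.println(describeRead(null, configuration, new Object()));
  }
}
```
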
- requiresDataSchema
  
  public boolean requiresDataSchema()
  
  Indicates whether this transform requires a specified data schema.
  
  Specified by:
  
  requiresDataSchema in interface SchemaIOProvider
  
  Returns:
  
  false
- isBounded
  
  public PCollection.IsBounded isBounded()
  
  Indicates whether the PCollections produced by this transform will contain a bounded or unbounded number of elements.
  
  Specified by:
  
  isBounded in interface SchemaIOProvider
  
  Returns:
  
  Bounded
