CREATE EXTERNAL TABLE

Beam SQL’s CREATE EXTERNAL TABLE statement registers a virtual table that maps to an external storage system. For some storage systems, CREATE EXTERNAL TABLE does not create a physical table until a write occurs. After the physical table exists, you can access the table with the SELECT, JOIN, and INSERT INTO statements.

The CREATE EXTERNAL TABLE statement includes a schema and extended clauses.

Syntax

CREATE EXTERNAL TABLE [ IF NOT EXISTS ] tableName (tableElement [, tableElement ]*)
TYPE type
[LOCATION location]
[TBLPROPERTIES tblProperties]

simpleType: TINYINT | SMALLINT | INTEGER | BIGINT | FLOAT | DOUBLE | DECIMAL | BOOLEAN | DATE | TIME | TIMESTAMP | CHAR | VARCHAR

fieldType: simpleType | MAP<simpleType, fieldType> | ARRAY<fieldType> | ROW<tableElement [, tableElement ]*>

tableElement: columnName fieldType [ NOT NULL ]

BigQuery

Syntax

CREATE EXTERNAL TABLE [ IF NOT EXISTS ] tableName (tableElement [, tableElement ]*)
TYPE bigquery
LOCATION '[PROJECT_ID]:[DATASET].[TABLE]'

Read Mode

Beam SQL supports reading columns with simple types (simpleType) and arrays of simple types (ARRAY<simpleType>).

Write Mode

if the table does not exist, Beam creates the table specified in location when the first record is written. If the table does exist, the specified columns must match the existing table.

Schema

Schema-related errors will cause the pipeline to crash. The Map type is not supported. Beam SQL types map to BigQuery Standard SQL types as follows:

Beam SQL Type BigQuery Standard SQL Type
TINYINT, SMALLINT, INTEGER, BIGINT   INT64
FLOAT, DOUBLE, DECIMAL FLOAT64
BOOLEAN BOOL
DATE DATE
TIME TIME
TIMESTAMP TIMESTAMP
CHAR, VARCHAR STRING
MAP (not supported)
ARRAY ARRAY
ROW STRUCT

Example

CREATE EXTERNAL TABLE users (id INTEGER, username VARCHAR)
TYPE bigquery
LOCATION 'testing-integration:apache.users'

Pub/Sub

Syntax

CREATE EXTERNAL TABLE [ IF NOT EXISTS ] tableName
  (
   event_timestamp TIMESTAMP,
   attributes MAP<VARCHAR, VARCHAR>,
   payload ROW<tableElement [, tableElement ]*>
  )
TYPE pubsub
LOCATION 'projects/[PROJECT]/topics/[TOPIC]'
TBLPROPERTIES '{"timestampAttributeKey": "key", "deadLetterQueue": "projects/[PROJECT]/topics/[TOPIC]"}'

Read Mode

PubsubIO is currently limited to read access only.

Write Mode

Not supported. PubSubIO is currently limited to read access only in Beam SQL.

Schema

Pub/Sub messages have metadata associated with them, and you can reference this metadata in your queries. For each message, Pub/Sub exposes its publish time and a map of user-provided attributes in addition to the payload (unstructured in the general case). This information must be preserved and accessible from the SQL statements. Currently, this means that PubsubIO tables require you to declare a special set of columns, as shown below.

Supported Payload

Example

CREATE EXTERNAL TABLE locations (event_timestamp TIMESTAMP, attributes MAP<VARCHAR, VARCHAR>, payload ROW<id INTEGER, location VARCHAR>)
TYPE pubsub
LOCATION 'projects/testing-integration/topics/user-location'

Kafka

KafkaIO is experimental in Beam SQL.

Syntax

CREATE EXTERNAL TABLE [ IF NOT EXISTS ] tableName (tableElement [, tableElement ]*)
TYPE kafka
LOCATION 'kafka://localhost:2181/brokers'
TBLPROPERTIES '{"bootstrap.servers":"localhost:9092", "topics": ["topic1", "topic2"]}'

Read Mode

Read Mode supports reading from a topic.

Write Mode

Write Mode supports writing to a topic.

Supported Payload

Schema

Only simple types are supported.

Text

TextIO is experimental in Beam SQL. Read Mode and Write Mode do not currently access the same underlying data.

Syntax

CREATE EXTERNAL TABLE [ IF NOT EXISTS ] tableName (tableElement [, tableElement ]*)
TYPE text
LOCATION '/home/admin/orders'
TBLPROPERTIES '{"format: "Excel"}'

Read Mode

Read Mode supports reading from a file.

Write Mode

Write Mode supports writing to a set of files. TextIO creates file on writes.

Supported Payload

Schema

Only simple types are supported.

Example

CREATE EXTERNAL TABLE orders (id INTEGER, price INTEGER)
TYPE text
LOCATION '/home/admin/orders'