apache_beam.io.gcp.pubsub module

Google Cloud PubSub sources and sinks.

Cloud Pub/Sub sources and sinks are currently supported only in streaming pipelines, during remote execution.

This API is currently under development and is subject to change.

class apache_beam.io.gcp.pubsub.ReadStringsFromPubSub(topic=None, subscription=None, id_label=None)[source]

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform for reading utf-8 string payloads from Cloud Pub/Sub.

Initializes ReadStringsFromPubSub.

topic

Cloud Pub/Sub topic in the form “projects/<project>/topics/ <topic>”. If provided, subscription must be None.

subscription

Existing Cloud Pub/Sub subscription to use in the form “projects/<project>/subscriptions/<subscription>”. If not specified, a temporary subscription will be created from the specified topic. If provided, topic must be None.

id_label

The attribute on incoming Pub/Sub messages to use as a unique record identifier. When specified, the value of this attribute (which can be any string that uniquely identifies the record) will be used for deduplication of messages. If not provided, we cannot guarantee that no duplicate data will be delivered on the Pub/Sub stream. In this case, deduplication of the stream will be strictly best effort.

get_windowing(unused_inputs)[source]
expand(pvalue)[source]
class apache_beam.io.gcp.pubsub.WriteStringsToPubSub(topic)[source]

Bases: apache_beam.transforms.ptransform.PTransform

A PTransform for writing utf-8 string payloads to Cloud Pub/Sub.

Initializes WriteStringsToPubSub.

topic

Cloud Pub/Sub topic in the form “/topics/<project>/<topic>”.

expand(pcoll)[source]