Class PubsubUnboundedSink

java.lang.Object
org.apache.beam.sdk.transforms.PTransform<PCollection<PubsubMessage>,PDone>
org.apache.beam.sdk.io.gcp.pubsub.PubsubUnboundedSink
All Implemented Interfaces:
Serializable, HasDisplayData

public class PubsubUnboundedSink extends PTransform<PCollection<PubsubMessage>,PDone>
A PTransform which streams messages to Pubsub.
  • The underlying implementation is just a GroupByKey followed by a ParDo which publishes as a side effect. (In the future we want to design and switch to a custom UnboundedSink implementation so as to gain access to system watermark and end-of-pipeline cleanup.)
  • We try to send messages in batches while also limiting send latency.
  • No stats are logged. Rather some counters are used to keep track of elements and batches.
  • Though some background threads are used by the underlying netty system all actual Pubsub calls are blocking. We rely on the underlying runner to allow multiple DoFn instances to execute concurrently and hide latency.
  • A failed bundle will cause messages to be resent. Thus we rely on the Pubsub consumer to dedup messages.
See Also: