Class PubsubIO.Write<T>

java.lang.Object
org.apache.beam.sdk.transforms.PTransform<PCollection<T>,PDone>
org.apache.beam.sdk.io.gcp.pubsub.PubsubIO.Write<T>
All Implemented Interfaces:
Serializable, HasDisplayData
Enclosing class:
PubsubIO

public abstract static class PubsubIO.Write<T> extends PTransform<PCollection<T>,PDone>
Implementation of write methods.
  • Constructor Details

    • Write

      public Write()
  • Method Details

    • to

      public PubsubIO.Write<T> to(String topic)
      Publishes to the specified topic.

      See PubsubIO.PubsubTopic.fromPath(String) for more details on the format of the topic string.
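
      As a rough illustration of the topic string, the sketch below assumes the fully qualified form projects/<project>/topics/<topic>. The class TopicPathSketch and its methods are hypothetical helpers, not part of Beam, and the pattern only checks the overall shape rather than every naming rule that PubsubIO.PubsubTopic.fromPath(String) may enforce.

```java
import java.util.regex.Pattern;

/** Illustrative helper only; not part of Beam. */
final class TopicPathSketch {
  // Assumed shape of a fully qualified topic: projects/<project>/topics/<topic>.
  private static final Pattern TOPIC_PATH =
      Pattern.compile("projects/([^/]+)/topics/([^/]+)");

  /** Builds a topic path from a project id and a topic name. */
  static String topicPath(String project, String topic) {
    return "projects/" + project + "/topics/" + topic;
  }

  /** Returns true if the string has the assumed two-component shape. */
  static boolean looksLikeTopicPath(String path) {
    return TOPIC_PATH.matcher(path).matches();
  }

  public static void main(String[] args) {
    String path = topicPath("my-project", "my-topic");
    System.out.println(path + " valid shape: " + looksLikeTopicPath(path));
  }
}
```

      A string built this way is what would be passed to to(String), for example to(TopicPathSketch.topicPath("my-project", "my-topic")).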

    • to

      public PubsubIO.Write<T> to(ValueProvider<String> topic)
      Like to(String), but with a ValueProvider.
    • to

      Provides a function that dynamically selects the target topic for each message. Not compatible with any of the other to methods: if to(String) is subsequently called with a topic, this topicFunction is ignored.
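
      The idea of choosing a topic per message can be sketched with plain JDK types. The parameter type of the real topicFunction is not shown in this section, so the example uses java.util.function.Function<String, String> as a stand-in. The routing rule (an "ERROR" prefix) and the topic names "errors" and "events" are hypothetical.

```java
import java.util.function.Function;

/** Illustrative stand-in for a per-message topic function; not Beam code. */
final class TopicRoutingSketch {
  /**
   * Returns a function that maps a message payload to a topic path.
   * Hypothetical rule: payloads starting with "ERROR" go to the "errors" topic,
   * everything else goes to "events".
   */
  static Function<String, String> byPrefix(String project) {
    return message -> {
      String topic = message.startsWith("ERROR") ? "errors" : "events";
      return "projects/" + project + "/topics/" + topic;
    };
  }

  public static void main(String[] args) {
    Function<String, String> route = byPrefix("my-project");
    System.out.println(route.apply("ERROR: disk full"));
    System.out.println(route.apply("INFO: started"));
  }
}
```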
    • withClientFactory

      public PubsubIO.Write<T> withClientFactory(PubsubClient.PubsubClientFactory factory)
      The default client for writing to Pub/Sub is the PubsubJsonClient, created by the PubsubJsonClient.PubsubJsonClientFactory. This method allows changing the Pub/Sub client by providing another PubsubClient.PubsubClientFactory, such as the PubsubGrpcClientFactory.
    • withMaxBatchSize

      public PubsubIO.Write<T> withMaxBatchSize(int batchSize)
      Writes to Pub/Sub are batched to send data efficiently. The batchSize value is the number of Pub/Sub messages to queue before sending the bulk request. For example, if given 1000, the write sink waits until 1000 messages have been received or the pipeline has finished, whichever comes first.

      Pub/Sub limits each individual request/batch to 10 MB. This setting was made configurable so that larger Pub/Sub messages can be sent through this sink: it lets you customize batches and control the number of messages per batch so that the 10 MB size limit is not hit.

    • withMaxBatchBytesSize

      public PubsubIO.Write<T> withMaxBatchBytesSize(int maxBatchBytesSize)
      Writes to Pub/Sub are generally limited to 10 MB per request. This setting controls the maximum number of bytes sent to Pub/Sub in a single batched request.
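
      How a message-count limit and a byte limit interact can be sketched as below. This is a minimal JDK-only illustration, not the sink's actual implementation: the class BatchingSketch is hypothetical, message size is taken to be the UTF-8 payload length only, and a single message larger than the byte limit is simply placed in a batch of its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative batching logic only; not the Beam sink implementation. */
final class BatchingSketch {
  /** Splits messages into batches bounded by message count and total payload bytes. */
  static List<List<String>> batch(List<String> messages, int maxBatchSize, int maxBatchBytes) {
    List<List<String>> batches = new ArrayList<>();
    List<String> current = new ArrayList<>();
    int currentBytes = 0;
    for (String message : messages) {
      int size = message.getBytes(StandardCharsets.UTF_8).length;
      // Flush the open batch first if it is full or this message would exceed the byte limit.
      if (!current.isEmpty()
          && (current.size() >= maxBatchSize || currentBytes + size > maxBatchBytes)) {
        batches.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
      current.add(message);
      currentBytes += size;
    }
    // End of input (the "pipeline has finished" case): flush whatever is left.
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }

  public static void main(String[] args) {
    System.out.println(batch(List.of("a", "b", "c", "d", "e"), 2, 100));
  }
}
```

      With a count limit of 2, five one-byte messages are split into batches of 2, 2, and 1. With a byte limit of 8, a third message is pushed into a new batch once the first two already occupy 8 bytes.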
    • withOrderingKey

      public PubsubIO.Write<T> withOrderingKey()
      Writes to Pub/Sub with each record's ordering key. A subscription with message ordering enabled will receive messages published in the same region with the same ordering key in the order in which they were received by the service. Note that the order in which Beam publishes records to the service remains unspecified.
    • withTimestampAttribute

      public PubsubIO.Write<T> withTimestampAttribute(String timestampAttribute)
      Writes to Pub/Sub and adds each record's timestamp to the published messages in an attribute with the specified name. The value of the attribute will be a number representing the number of milliseconds since the Unix epoch. For example, if using the Joda time classes, Instant(long) can be used to parse this value.

      If the output from this sink is being read by another Beam pipeline, then PubsubIO.Read.withTimestampAttribute(String) can be used to ensure the other source reads these timestamps from the appropriate attribute.
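
      The attribute encoding described above can be sketched as follows. The example uses java.time.Instant from the JDK as a stand-in for the Joda time classes mentioned in the text. The class TimestampAttributeSketch and the attribute name "ts" are hypothetical, and the attribute value is assumed to be the decimal string form of the millisecond count.

```java
import java.time.Instant;
import java.util.Map;

/** Illustrative encoding of a timestamp attribute; not Beam code. */
final class TimestampAttributeSketch {
  /** Encodes a timestamp as the decimal string of milliseconds since the Unix epoch. */
  static String encode(Instant timestamp) {
    return Long.toString(timestamp.toEpochMilli());
  }

  /** Reads the named attribute from a message's attributes and parses it back to an Instant. */
  static Instant decode(Map<String, String> attributes, String timestampAttribute) {
    return Instant.ofEpochMilli(Long.parseLong(attributes.get(timestampAttribute)));
  }

  public static void main(String[] args) {
    Instant original = Instant.ofEpochMilli(1_700_000_000_000L);
    Map<String, String> attributes = Map.of("ts", encode(original));
    System.out.println(attributes + " decodes to " + decode(attributes, "ts"));
  }
}
```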

    • withIdAttribute

      public PubsubIO.Write<T> withIdAttribute(String idAttribute)
      Writes to Pub/Sub, adding each record's unique identifier to the published messages in an attribute with the specified name. The value of the attribute is an opaque string.

      If the output from this sink is being read by another Beam pipeline, then PubsubIO.Read.withIdAttribute(String) can be used to ensure that the other source reads these unique identifiers from the appropriate attribute.

    • withPubsubRootUrl

      public PubsubIO.Write<T> withPubsubRootUrl(String pubsubRootUrl)
    • withErrorHandler

      public PubsubIO.Write<T> withErrorHandler(ErrorHandler<BadRecord,?> badRecordErrorHandler)
      Writes any serialization failures out to the Error Handler. See ErrorHandler for details on how to configure an Error Handler. Error Handlers are not well supported when writing to topics with schemas, and it is not recommended to configure an error handler if the target topic has a schema.
    • withValidation

      public PubsubIO.Write<T> withValidation()
      Enables validation of the Pub/Sub write.
    • expand

      public PDone expand(PCollection<T> input)
      Description copied from class: PTransform
      Override this method to specify how this PTransform should be expanded on the given InputT.

      NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.

      Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).

      Specified by:
      expand in class PTransform<PCollection<T>,PDone>
    • validate

      public void validate(PipelineOptions options)
      Description copied from class: PTransform
      Called before running the Pipeline to verify this transform is fully and correctly specified.

      By default, does nothing.

      Overrides:
      validate in class PTransform<PCollection<T>,PDone>
    • populateDisplayData

      public void populateDisplayData(DisplayData.Builder builder)
      Description copied from class: PTransform
      Register display data for the given transform or component.

      populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.

      By default, does not register any display data. Implementors may override this method to provide their own display data.

      Specified by:
      populateDisplayData in interface HasDisplayData
      Overrides:
      populateDisplayData in class PTransform<PCollection<T>,PDone>
      Parameters:
      builder - The builder to populate with display data.