Class PubsubIO
PTransform
s for Cloud Pub/Sub streams. These transforms create and consume
unbounded PCollections
.
Using local emulator
In order to use local emulator for Pubsub you should use
PubsubOptions#setPubsubRootUrl(String)
method to set host and port of your local emulator.
Permissions
Permission requirements depend on the PipelineRunner
that is used to execute the Beam
pipeline. Please refer to the documentation of corresponding PipelineRunners
for more details.
Updates to the I/O connector code
For any significant updates to this I/O connector, please consider involving corresponding code reviewers mentioned here.Example PubsubIO read usage
// Read from a specific topic; a subscription will be created at pipeline start time.
PCollection<PubsubMessage> messages = PubsubIO.readMessages().fromTopic(topic);
// Read from a subscription.
PCollection<PubsubMessage> messages = PubsubIO.readMessages().fromSubscription(subscription);
// Read messages including attributes. All PubSub attributes will be included in the PubsubMessage.
PCollection<PubsubMessage> messages = PubsubIO.readMessagesWithAttributes().fromTopic(topic);
// Examples of reading different types from PubSub.
PCollection<String> strings = PubsubIO.readStrings().fromTopic(topic);
PCollection<MyProto> protos = PubsubIO.readProtos(MyProto.class).fromTopic(topic);
PCollection<MyType> avros = PubsubIO.readAvros(MyType.class).fromTopic(topic);
Example PubsubIO write usage
Data can be written to a single topic or to a dynamic set of topics. In order to write to a single topic, thePubsubIO.Write.to(String)
method can be used. For example:
avros.apply(PubsubIO.writeAvros(MyType.class).to(topic));
protos.apply(PubsubIO.writeProtos(MyProto.class).to(topic));
strings.apply(PubsubIO.writeStrings().to(topic));
Dynamic topic destinations can be accomplished by specifying a function to extract the topic from
the record using the PubsubIO.Write.to(SerializableFunction)
method. For example:
avros.apply(PubsubIO.writeAvros(MyType.class).
to((ValueInSingleWindow<Event> quote) -> {
String country = quote.getCountry();
return "projects/myproject/topics/events_" + country;
});
Dynamic topics can also be specified by writing PubsubMessage
objects containing the
topic and writing using the writeMessagesDynamic()
method. For example:
events.apply(MapElements.into(new TypeDescriptor<PubsubMessage>() {})
.via(e -> new PubsubMessage(
e.toByteString(), Collections.emptyMap()).withTopic(e.getCountry())))
.apply(PubsubIO.writeMessagesDynamic());
Custom timestamps
All messages read from PubSub have a stable publish timestamp that is independent of when the message is read from the PubSub topic. By default, the publish time is used as the timestamp for all messages read and the watermark is based on that. If there is a different logical timestamp to be used, that timestamp must be published in a PubSub attribute and specified usingPubsubIO.Read.withTimestampAttribute(java.lang.String)
. See the Javadoc for that method for the timestamp format.-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic class
Class representing a Cloud Pub/Sub Subscription.static class
Class representing a Cloud Pub/Sub Topic.static class
Implementation of read methods.static class
Implementation of write methods. -
Field Summary
Fields -
Method Summary
Modifier and TypeMethodDescriptionstatic PubsubIO.Read
<GenericRecord> readAvroGenericRecords
(Schema avroSchema) Returns aPTransform
that continuously reads binary encoded Avro messages into the AvroGenericRecord
type.static <T> PubsubIO.Read
<T> Returns APTransform
that continuously reads binary encoded Avro messages of the given type from a Google Cloud Pub/Sub stream.static <T> PubsubIO.Read
<T> readAvrosWithBeamSchema
(Class<T> clazz) Returns aPTransform
that continuously reads binary encoded Avro messages of the specific type.static PubsubIO.Read
<PubsubMessage> Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream.static PubsubIO.Read
<PubsubMessage> Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream.static PubsubIO.Read
<PubsubMessage> Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream.static PubsubIO.Read
<PubsubMessage> Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream.static <T> PubsubIO.Read
<T> readMessagesWithAttributesWithCoderAndParseFn
(Coder<T> coder, SimpleFunction<PubsubMessage, T> parseFn) Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream, mapping eachPubsubMessage
, with attributes, into type T using the supplied parse function and coder.static <T> PubsubIO.Read
<T> readMessagesWithCoderAndParseFn
(Coder<T> coder, SimpleFunction<PubsubMessage, T> parseFn) Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream, mapping eachPubsubMessage
into type T using the supplied parse function and coder.static PubsubIO.Read
<PubsubMessage> Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream.static PubsubIO.Read
<DynamicMessage> readProtoDynamicMessages
(Descriptors.Descriptor descriptor) Similar toreadProtoDynamicMessages(ProtoDomain, String)
but for when theDescriptors.Descriptor
is already known.static PubsubIO.Read
<DynamicMessage> readProtoDynamicMessages
(ProtoDomain domain, String fullMessageName) Returns aPTransform
that continuously reads binary encoded protobuf messages for the type specified byfullMessageName
.static <T extends Message>
PubsubIO.Read<T> readProtos
(Class<T> messageClass) Returns APTransform
that continuously reads binary encoded protobuf messages of the given type from a Google Cloud Pub/Sub stream.static PubsubIO.Read
<String> Returns APTransform
that continuously reads UTF-8 encoded strings from a Google Cloud Pub/Sub stream.static <T> PubsubIO.Write
<T> writeAvros
(Class<T> clazz) Returns APTransform
that writes binary encoded Avro messages of a given type to a Google Cloud Pub/Sub stream.static <T> PubsubIO.Write
<T> writeAvros
(Class<T> clazz, SerializableFunction<ValueInSingleWindow<T>, Map<String, String>> attributeFn) Returns APTransform
that writes binary encoded Avro messages of a given type to a Google Cloud Pub/Sub stream.static PubsubIO.Write
<PubsubMessage> Returns APTransform
that writes to a Google Cloud Pub/Sub stream.static PubsubIO.Write
<PubsubMessage> Enables dynamic destination topics.static <T extends Message>
PubsubIO.Write<T> writeProtos
(Class<T> messageClass) Returns APTransform
that writes binary encoded protobuf messages of a given type to a Google Cloud Pub/Sub stream.static <T extends Message>
PubsubIO.Write<T> writeProtos
(Class<T> messageClass, SerializableFunction<ValueInSingleWindow<T>, Map<String, String>> attributeFn) Returns APTransform
that writes binary encoded protobuf messages of a given type to a Google Cloud Pub/Sub stream.static PubsubIO.Write
<String> Returns APTransform
that writes UTF-8 encoded strings to a Google Cloud Pub/Sub stream.
-
Field Details
-
ENABLE_CUSTOM_PUBSUB_SINK
- See Also:
-
ENABLE_CUSTOM_PUBSUB_SOURCE
- See Also:
-
-
Method Details
-
readMessages
Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream. The messages will only contain apayload
, but noattributes
. -
readMessagesWithMessageId
Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream. The messages will only contain apayload
with themessageId
from PubSub, but noattributes
. -
readMessagesWithAttributes
Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream. The messages will contain both apayload
andattributes
. -
readMessagesWithAttributesAndMessageId
Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream. The messages will contain both apayload
andattributes
, along with themessageId
from PubSub. -
readMessagesWithAttributesAndMessageIdAndOrderingKey
Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream. The messages will contain apayload
,attributes
, along with themessageId
and {PubsubMessage#getOrderingKey() orderingKey} from PubSub. -
readStrings
Returns APTransform
that continuously reads UTF-8 encoded strings from a Google Cloud Pub/Sub stream. -
readProtos
Returns APTransform
that continuously reads binary encoded protobuf messages of the given type from a Google Cloud Pub/Sub stream. -
readProtoDynamicMessages
public static PubsubIO.Read<DynamicMessage> readProtoDynamicMessages(ProtoDomain domain, String fullMessageName) Returns aPTransform
that continuously reads binary encoded protobuf messages for the type specified byfullMessageName
.This is primarily here for cases where the message type cannot be known at compile time. If it can be known, prefer
readProtos(Class)
, asDynamicMessage
tends to perform worse than concrete types.Beam will infer a schema for the
DynamicMessage
schema. Note that some proto schema features are not supported by all sinks.- Parameters:
domain
- TheProtoDomain
that contains the target message and its dependencies.fullMessageName
- The full name of the message for lookup indomain
.
-
readProtoDynamicMessages
public static PubsubIO.Read<DynamicMessage> readProtoDynamicMessages(Descriptors.Descriptor descriptor) Similar toreadProtoDynamicMessages(ProtoDomain, String)
but for when theDescriptors.Descriptor
is already known. -
readAvros
Returns APTransform
that continuously reads binary encoded Avro messages of the given type from a Google Cloud Pub/Sub stream. -
readMessagesWithCoderAndParseFn
public static <T> PubsubIO.Read<T> readMessagesWithCoderAndParseFn(Coder<T> coder, SimpleFunction<PubsubMessage, T> parseFn) Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream, mapping eachPubsubMessage
into type T using the supplied parse function and coder. -
readMessagesWithAttributesWithCoderAndParseFn
public static <T> PubsubIO.Read<T> readMessagesWithAttributesWithCoderAndParseFn(Coder<T> coder, SimpleFunction<PubsubMessage, T> parseFn) Returns APTransform
that continuously reads from a Google Cloud Pub/Sub stream, mapping eachPubsubMessage
, with attributes, into type T using the supplied parse function and coder. Similar toreadMessagesWithCoderAndParseFn(Coder, SimpleFunction)
, but with the with addition of making the message attributes available to the ParseFn. -
readAvroGenericRecords
Returns aPTransform
that continuously reads binary encoded Avro messages into the AvroGenericRecord
type.Beam will infer a schema for the Avro schema. This allows the output to be used by SQL and by the schema-transform library.
-
readAvrosWithBeamSchema
Returns aPTransform
that continuously reads binary encoded Avro messages of the specific type.Beam will infer a schema for the Avro schema. This allows the output to be used by SQL and by the schema-transform library.
-
writeMessages
Returns APTransform
that writes to a Google Cloud Pub/Sub stream. -
writeMessagesDynamic
Enables dynamic destination topics. ThePubsubMessage
elements are each expected to contain a destination topic, which can be set usingPubsubMessage.withTopic(java.lang.String)
. IfPubsubIO.Write.to(java.lang.String)
is called, that will be used instead to generate the topic and the value returned byPubsubMessage.getTopic()
will be ignored. -
writeStrings
Returns APTransform
that writes UTF-8 encoded strings to a Google Cloud Pub/Sub stream. -
writeProtos
Returns APTransform
that writes binary encoded protobuf messages of a given type to a Google Cloud Pub/Sub stream. -
writeProtos
public static <T extends Message> PubsubIO.Write<T> writeProtos(Class<T> messageClass, SerializableFunction<ValueInSingleWindow<T>, Map<String, String>> attributeFn) Returns APTransform
that writes binary encoded protobuf messages of a given type to a Google Cloud Pub/Sub stream. -
writeAvros
Returns APTransform
that writes binary encoded Avro messages of a given type to a Google Cloud Pub/Sub stream. -
writeAvros
public static <T> PubsubIO.Write<T> writeAvros(Class<T> clazz, SerializableFunction<ValueInSingleWindow<T>, Map<String, String>> attributeFn) Returns APTransform
that writes binary encoded Avro messages of a given type to a Google Cloud Pub/Sub stream.
-