Class PubsubIO
java.lang.Object
    org.apache.beam.sdk.io.gcp.pubsub.PubsubIO
public class PubsubIO extends java.lang.Object
Read and Write PTransforms for Cloud Pub/Sub streams. These transforms create and consume unbounded PCollections.
Using local emulator
To use a local emulator for Pub/Sub, call the PubsubOptions#setPubsubRootUrl(String) method to set the host and port of your local emulator.
Permissions
Permission requirements depend on the PipelineRunner that is used to execute the Beam pipeline. Please refer to the documentation of the corresponding PipelineRunners for more details.
Updates to the I/O connector code
For any significant updates to this I/O connector, please consider involving corresponding code reviewers mentioned here.
Example PubsubIO read usage
    // `pipeline` is assumed to be an existing Pipeline; each statement below is an independent alternative.
    // Read from a specific topic; a subscription will be created at pipeline start time.
    PCollection<PubsubMessage> messages = pipeline.apply(PubsubIO.readMessages().fromTopic(topic));
    // Read from a subscription.
    PCollection<PubsubMessage> messages = pipeline.apply(PubsubIO.readMessages().fromSubscription(subscription));
    // Read messages including attributes. All Pub/Sub attributes will be included in the PubsubMessage.
    PCollection<PubsubMessage> messages = pipeline.apply(PubsubIO.readMessagesWithAttributes().fromTopic(topic));
    // Examples of reading different types from Pub/Sub.
    PCollection<String> strings = pipeline.apply(PubsubIO.readStrings().fromTopic(topic));
    PCollection<MyProto> protos = pipeline.apply(PubsubIO.readProtos(MyProto.class).fromTopic(topic));
    PCollection<MyType> avros = pipeline.apply(PubsubIO.readAvros(MyType.class).fromTopic(topic));
Example PubsubIO write usage
Data can be written to a single topic or to a dynamic set of topics. To write to a single topic, use the PubsubIO.Write.to(String) method. For example:
    avros.apply(PubsubIO.writeAvros(MyType.class).to(topic));
    protos.apply(PubsubIO.writeProtos(MyProto.class).to(topic));
    strings.apply(PubsubIO.writeStrings().to(topic));
Dynamic topic destinations can be accomplished by specifying a function that extracts the topic from the record, using the PubsubIO.Write.to(SerializableFunction) method. For example:
    avros.apply(PubsubIO.writeAvros(MyType.class)
        .to((ValueInSingleWindow<MyType> quote) -> {
          // getValue() unwraps the element from its window metadata.
          String country = quote.getValue().getCountry();
          return "projects/myproject/topics/events_" + country;
        }));
Dynamic topics can also be specified by writing PubsubMessage objects that contain the topic, using the writeMessagesDynamic() method. For example:
    events.apply(MapElements.into(new TypeDescriptor<PubsubMessage>() {})
            // The PubsubMessage constructor takes the payload as a byte[].
            .via(e -> new PubsubMessage(e.toByteArray(), Collections.emptyMap())
                .withTopic(e.getCountry())))
        .apply(PubsubIO.writeMessagesDynamic());
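The dynamic-destination example above returns a full topic path of the form projects/<project>/topics/<topic>. As a minimal sketch of that string-building step, the helper below assembles such a path from a project ID and a country code. The class name TopicPaths, the method eventsTopicFor, and the non-empty validation are illustrative assumptions and not part of PubsubIO; logic like this could sit inside the function passed to PubsubIO.Write.to(SerializableFunction).

```java
import java.util.Objects;

/** Hypothetical helper (not part of Beam): builds per-country topic paths. */
final class TopicPaths {
  private TopicPaths() {}

  /** Returns "projects/<project>/topics/events_<country>". */
  static String eventsTopicFor(String project, String country) {
    Objects.requireNonNull(project, "project");
    Objects.requireNonNull(country, "country");
    if (project.isEmpty() || country.isEmpty()) {
      // Assumption of this sketch: reject empty parts rather than emit a malformed path.
      throw new IllegalArgumentException("project and country must be non-empty");
    }
    return "projects/" + project + "/topics/events_" + country;
  }

  public static void main(String[] args) {
    System.out.println(eventsTopicFor("myproject", "us"));
  }
}
```

Keeping the path construction in one small function makes it easy to unit-test separately from the pipeline.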
Custom timestamps
All messages read from Pub/Sub have a stable publish timestamp that is independent of when the message is read from the Pub/Sub topic. By default, the publish time is used as the timestamp for all messages read, and the watermark is based on that. If a different logical timestamp should be used, that timestamp must be published in a Pub/Sub attribute and specified using PubsubIO.Read.withTimestampAttribute(java.lang.String). See the Javadoc for that method for the timestamp format.
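As a rough illustration of publishing a logical timestamp in an attribute, the sketch below renders an event time in two textual forms: milliseconds since the Unix epoch and an RFC 3339 string. It assumes these are the formats accepted by withTimestampAttribute; confirm this against that method's Javadoc. The class name, the attribute key "event_ts", and the helper methods are hypothetical, and the parse method exists only for the round-trip check here; it is not Beam's parser.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Beam): encodes an event time as a message attribute. */
final class TimestampAttributes {
  /** Hypothetical attribute key; the same key would be passed to withTimestampAttribute. */
  static final String ATTRIBUTE = "event_ts";

  private TimestampAttributes() {}

  /** Encodes the event time as milliseconds since the Unix epoch. */
  static Map<String, String> withEpochMillis(Instant eventTime) {
    Map<String, String> attributes = new HashMap<>();
    attributes.put(ATTRIBUTE, Long.toString(eventTime.toEpochMilli()));
    return attributes;
  }

  /** Encodes the event time as an RFC 3339 string in UTC, for example 2015-10-19T18:40:00Z. */
  static Map<String, String> withRfc3339(Instant eventTime) {
    Map<String, String> attributes = new HashMap<>();
    attributes.put(ATTRIBUTE, eventTime.toString());
    return attributes;
  }

  /** Round-trip helper for this sketch only: digits are epoch millis, anything else RFC 3339. */
  static Instant parse(String value) {
    boolean numeric = !value.isEmpty() && value.chars().allMatch(Character::isDigit);
    return numeric ? Instant.ofEpochMilli(Long.parseLong(value)) : Instant.parse(value);
  }

  public static void main(String[] args) {
    Instant eventTime = Instant.ofEpochMilli(1445280000000L);
    System.out.println(withEpochMillis(eventTime));
    System.out.println(withRfc3339(eventTime));
  }
}
```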
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description)
static class PubsubIO.PubsubSubscription
    Class representing a Cloud Pub/Sub Subscription.
static class PubsubIO.PubsubTopic
    Class representing a Cloud Pub/Sub Topic.
static class PubsubIO.Read<T>
    Implementation of read methods.
static class PubsubIO.Write<T>
    Implementation of write methods.
-
Field Summary
Fields (modifier and type, field)
static java.lang.String ENABLE_CUSTOM_PUBSUB_SINK
static java.lang.String ENABLE_CUSTOM_PUBSUB_SOURCE
-
Method Summary
Static Methods (modifier and type, method, description)
static PubsubIO.Read<GenericRecord> readAvroGenericRecords(Schema avroSchema)
    Returns a PTransform that continuously reads binary encoded Avro messages into the Avro GenericRecord type.
static <T> PubsubIO.Read<T> readAvros(java.lang.Class<T> clazz)
    Returns a PTransform that continuously reads binary encoded Avro messages of the given type from a Google Cloud Pub/Sub stream.
static <T> PubsubIO.Read<T> readAvrosWithBeamSchema(java.lang.Class<T> clazz)
    Returns a PTransform that continuously reads binary encoded Avro messages of the specific type.
static PubsubIO.Read<PubsubMessage> readMessages()
    Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream.
static PubsubIO.Read<PubsubMessage> readMessagesWithAttributes()
    Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream.
static PubsubIO.Read<PubsubMessage> readMessagesWithAttributesAndMessageId()
    Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream.
static PubsubIO.Read<PubsubMessage> readMessagesWithAttributesAndMessageIdAndOrderingKey()
    Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream.
static <T> PubsubIO.Read<T> readMessagesWithAttributesWithCoderAndParseFn(Coder<T> coder, SimpleFunction<PubsubMessage,T> parseFn)
    Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream, mapping each PubsubMessage, with attributes, into type T using the supplied parse function and coder.
static <T> PubsubIO.Read<T> readMessagesWithCoderAndParseFn(Coder<T> coder, SimpleFunction<PubsubMessage,T> parseFn)
    Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream, mapping each PubsubMessage into type T using the supplied parse function and coder.
static PubsubIO.Read<PubsubMessage> readMessagesWithMessageId()
    Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream.
static PubsubIO.Read<com.google.protobuf.DynamicMessage> readProtoDynamicMessages(com.google.protobuf.Descriptors.Descriptor descriptor)
    Similar to readProtoDynamicMessages(ProtoDomain, String) but for when the Descriptors.Descriptor is already known.
static PubsubIO.Read<com.google.protobuf.DynamicMessage> readProtoDynamicMessages(ProtoDomain domain, java.lang.String fullMessageName)
    Returns a PTransform that continuously reads binary encoded protobuf messages for the type specified by fullMessageName.
static <T extends com.google.protobuf.Message> PubsubIO.Read<T> readProtos(java.lang.Class<T> messageClass)
    Returns a PTransform that continuously reads binary encoded protobuf messages of the given type from a Google Cloud Pub/Sub stream.
static PubsubIO.Read<java.lang.String> readStrings()
    Returns a PTransform that continuously reads UTF-8 encoded strings from a Google Cloud Pub/Sub stream.
static <T> PubsubIO.Write<T> writeAvros(java.lang.Class<T> clazz)
    Returns a PTransform that writes binary encoded Avro messages of a given type to a Google Cloud Pub/Sub stream.
static <T> PubsubIO.Write<T> writeAvros(java.lang.Class<T> clazz, SerializableFunction<ValueInSingleWindow<T>,java.util.Map<java.lang.String,java.lang.String>> attributeFn)
    Returns a PTransform that writes binary encoded Avro messages of a given type to a Google Cloud Pub/Sub stream.
static PubsubIO.Write<PubsubMessage> writeMessages()
    Returns a PTransform that writes to a Google Cloud Pub/Sub stream.
static PubsubIO.Write<PubsubMessage> writeMessagesDynamic()
    Enables dynamic destination topics.
static <T extends com.google.protobuf.Message> PubsubIO.Write<T> writeProtos(java.lang.Class<T> messageClass)
    Returns a PTransform that writes binary encoded protobuf messages of a given type to a Google Cloud Pub/Sub stream.
static <T extends com.google.protobuf.Message> PubsubIO.Write<T> writeProtos(java.lang.Class<T> messageClass, SerializableFunction<ValueInSingleWindow<T>,java.util.Map<java.lang.String,java.lang.String>> attributeFn)
    Returns a PTransform that writes binary encoded protobuf messages of a given type to a Google Cloud Pub/Sub stream.
static PubsubIO.Write<java.lang.String> writeStrings()
    Returns a PTransform that writes UTF-8 encoded strings to a Google Cloud Pub/Sub stream.
-
-
-
Field Detail
-
ENABLE_CUSTOM_PUBSUB_SINK
public static final java.lang.String ENABLE_CUSTOM_PUBSUB_SINK
- See Also:
- Constant Field Values
-
ENABLE_CUSTOM_PUBSUB_SOURCE
public static final java.lang.String ENABLE_CUSTOM_PUBSUB_SOURCE
- See Also:
- Constant Field Values
-
-
Method Detail
-
readMessages
public static PubsubIO.Read<PubsubMessage> readMessages()
Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream. The messages will only contain a payload, but no attributes.
-
readMessagesWithMessageId
public static PubsubIO.Read<PubsubMessage> readMessagesWithMessageId()
Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream. The messages will only contain a payload with the messageId from Pub/Sub, but no attributes.
-
readMessagesWithAttributes
public static PubsubIO.Read<PubsubMessage> readMessagesWithAttributes()
Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream. The messages will contain both a payload and attributes.
-
readMessagesWithAttributesAndMessageId
public static PubsubIO.Read<PubsubMessage> readMessagesWithAttributesAndMessageId()
Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream. The messages will contain both a payload and attributes, along with the messageId from Pub/Sub.
-
readMessagesWithAttributesAndMessageIdAndOrderingKey
public static PubsubIO.Read<PubsubMessage> readMessagesWithAttributesAndMessageIdAndOrderingKey()
Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream. The messages will contain a payload and attributes, along with the messageId and the orderingKey (see PubsubMessage#getOrderingKey()) from Pub/Sub.
-
readStrings
public static PubsubIO.Read<java.lang.String> readStrings()
Returns a PTransform that continuously reads UTF-8 encoded strings from a Google Cloud Pub/Sub stream.
-
readProtos
public static <T extends com.google.protobuf.Message> PubsubIO.Read<T> readProtos(java.lang.Class<T> messageClass)
Returns a PTransform that continuously reads binary encoded protobuf messages of the given type from a Google Cloud Pub/Sub stream.
-
readProtoDynamicMessages
public static PubsubIO.Read<com.google.protobuf.DynamicMessage> readProtoDynamicMessages(ProtoDomain domain, java.lang.String fullMessageName)
Returns a PTransform that continuously reads binary encoded protobuf messages for the type specified by fullMessageName.
This is primarily here for cases where the message type cannot be known at compile time. If it can be known, prefer readProtos(Class), as DynamicMessage tends to perform worse than concrete types.
Beam will infer a schema for the DynamicMessage schema. Note that some proto schema features are not supported by all sinks.
Parameters:
    domain - The ProtoDomain that contains the target message and its dependencies.
    fullMessageName - The full name of the message for lookup in domain.
-
readProtoDynamicMessages
public static PubsubIO.Read<com.google.protobuf.DynamicMessage> readProtoDynamicMessages(com.google.protobuf.Descriptors.Descriptor descriptor)
Similar to readProtoDynamicMessages(ProtoDomain, String) but for when the Descriptors.Descriptor is already known.
-
readAvros
public static <T> PubsubIO.Read<T> readAvros(java.lang.Class<T> clazz)
Returns a PTransform that continuously reads binary encoded Avro messages of the given type from a Google Cloud Pub/Sub stream.
-
readMessagesWithCoderAndParseFn
public static <T> PubsubIO.Read<T> readMessagesWithCoderAndParseFn(Coder<T> coder, SimpleFunction<PubsubMessage,T> parseFn)
Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream, mapping each PubsubMessage into type T using the supplied parse function and coder.
-
readMessagesWithAttributesWithCoderAndParseFn
public static <T> PubsubIO.Read<T> readMessagesWithAttributesWithCoderAndParseFn(Coder<T> coder, SimpleFunction<PubsubMessage,T> parseFn)
Returns a PTransform that continuously reads from a Google Cloud Pub/Sub stream, mapping each PubsubMessage, with attributes, into type T using the supplied parse function and coder. Similar to readMessagesWithCoderAndParseFn(Coder, SimpleFunction), but with the addition of making the message attributes available to the parse function.
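To make the difference between the two parse-function variants concrete, the sketch below shows parse logic over a payload alone and over a payload plus an attribute map. It uses plain JDK types as stand-ins: in a pipeline, logic like this would sit inside a SimpleFunction<PubsubMessage, T>, taking the bytes from PubsubMessage.getPayload() and the map from PubsubMessage.getAttributeMap(). The Reading type, the "sensor" attribute, and both payload formats are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical parse logic (not part of Beam) for the two parse-function variants. */
final class ReadingParser {
  /** Hypothetical target type T. */
  static final class Reading {
    final String sensor;
    final double value;

    Reading(String sensor, double value) {
      this.sensor = sensor;
      this.value = value;
    }
  }

  private ReadingParser() {}

  /** Payload-only variant: the payload is assumed to be UTF-8 text of the form "sensor,value". */
  static Reading parse(byte[] payload) {
    String[] parts = new String(payload, StandardCharsets.UTF_8).split(",", 2);
    if (parts.length != 2) {
      throw new IllegalArgumentException("expected payload of the form sensor,value");
    }
    return new Reading(parts[0].trim(), Double.parseDouble(parts[1].trim()));
  }

  /** With-attributes variant: the sensor name comes from an attribute, the payload holds the value. */
  static Reading parse(byte[] payload, Map<String, String> attributes) {
    String sensor = attributes.get("sensor");
    if (sensor == null) {
      throw new IllegalArgumentException("missing sensor attribute");
    }
    String text = new String(payload, StandardCharsets.UTF_8).trim();
    return new Reading(sensor, Double.parseDouble(text));
  }
}
```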
-
readAvroGenericRecords
public static PubsubIO.Read<GenericRecord> readAvroGenericRecords(Schema avroSchema)
Returns a PTransform that continuously reads binary encoded Avro messages into the Avro GenericRecord type.
Beam will infer a schema for the Avro schema. This allows the output to be used by SQL and by the schema-transform library.
-
readAvrosWithBeamSchema
public static <T> PubsubIO.Read<T> readAvrosWithBeamSchema(java.lang.Class<T> clazz)
Returns a PTransform that continuously reads binary encoded Avro messages of the specific type.
Beam will infer a schema for the Avro schema. This allows the output to be used by SQL and by the schema-transform library.
-
writeMessages
public static PubsubIO.Write<PubsubMessage> writeMessages()
Returns a PTransform that writes to a Google Cloud Pub/Sub stream.
-
writeMessagesDynamic
public static PubsubIO.Write<PubsubMessage> writeMessagesDynamic()
Enables dynamic destination topics. The PubsubMessage elements are each expected to contain a destination topic, which can be set using PubsubMessage.withTopic(java.lang.String). If PubsubIO.Write.to(java.lang.String) is called, that will be used instead to generate the topic and the value returned by PubsubMessage.getTopic() will be ignored.
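The precedence described above can be modelled in a few lines of plain Java. This is an illustrative sketch of the documented rule, not Beam's implementation: TopicResolution and resolveTopic are hypothetical names, and throwing when neither topic is present is an assumption of this sketch.

```java
/** Hypothetical model (not part of Beam) of the topic precedence rule. */
final class TopicResolution {
  private TopicResolution() {}

  /**
   * @param configuredTopic the value passed to Write.to(String), or null if it was not called
   * @param messageTopic the value of PubsubMessage.getTopic(), or null if unset
   */
  static String resolveTopic(String configuredTopic, String messageTopic) {
    if (configuredTopic != null) {
      // A topic configured on the transform wins; the per-message topic is ignored.
      return configuredTopic;
    }
    if (messageTopic == null) {
      // Assumption of this sketch: a message without any destination is an error.
      throw new IllegalStateException("no destination topic available");
    }
    return messageTopic;
  }
}
```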
-
writeStrings
public static PubsubIO.Write<java.lang.String> writeStrings()
Returns a PTransform that writes UTF-8 encoded strings to a Google Cloud Pub/Sub stream.
-
writeProtos
public static <T extends com.google.protobuf.Message> PubsubIO.Write<T> writeProtos(java.lang.Class<T> messageClass)
Returns a PTransform that writes binary encoded protobuf messages of a given type to a Google Cloud Pub/Sub stream.
-
writeProtos
public static <T extends com.google.protobuf.Message> PubsubIO.Write<T> writeProtos(java.lang.Class<T> messageClass, SerializableFunction<ValueInSingleWindow<T>,java.util.Map<java.lang.String,java.lang.String>> attributeFn)
Returns a PTransform that writes binary encoded protobuf messages of a given type to a Google Cloud Pub/Sub stream.
-
writeAvros
public static <T> PubsubIO.Write<T> writeAvros(java.lang.Class<T> clazz)
Returns a PTransform that writes binary encoded Avro messages of a given type to a Google Cloud Pub/Sub stream.
-
writeAvros
public static <T> PubsubIO.Write<T> writeAvros(java.lang.Class<T> clazz, SerializableFunction<ValueInSingleWindow<T>,java.util.Map<java.lang.String,java.lang.String>> attributeFn)
Returns a PTransform that writes binary encoded Avro messages of a given type to a Google Cloud Pub/Sub stream.
-
-