All Classes and Interfaces
BeamRelNode to replace Project and Filter node.
Abstract base for runners that execute a Combine.PerKey.
A final combiner that takes in AccumT and produces OutputT.
Adapter interface that allows using a CombineFnBase.GlobalCombineFn to either produce the AccumT as output or to combine several accumulators into an OutputT.
A partial combiner that takes in InputT and produces AccumT.
Abstract base class for iterators that process Spark input data and produce corresponding output values.
Factory class for creating instances that will handle different functions of DoFns.
Factory class for creating instances that will handle each type of record within a change stream
query.
A transform to add new nullable fields to a PCollection's schema.
Inner PTransform for AddFields.
A ClientInterceptor that attaches a provided SDK Harness ID to outgoing messages.
This class adds a pseudo-key with a given cardinality.
A transform to add UUIDs to each message to be written to Pub/Sub Lite.
A Phaser which never terminates.
A composite Trigger that fires when all of its sub-triggers are ready.
A composite Trigger that executes its sub-triggers in order.
A composite Trigger that fires once after at least one of its sub-triggers has fired.
A Trigger that fires at some point after a specified number of input elements have arrived.
A Trigger that fires at a specified point in processing time, relative to when input first arrives.
FOR INTERNAL USE ONLY.
AfterWatermark triggers fire based on progress of the system watermark.
A watermark trigger targeted relative to the end of the window.
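As a rough sketch of how these triggers compose in practice (illustrative only; the input PCollection events, the window size, and the lateness settings are assumptions, not part of this index):

    PCollection<String> windowed =
        events.apply(
            Window.<String>into(FixedWindows.of(Duration.standardMinutes(1)))
                // Fire when the watermark passes the end of the window, then again for each late element.
                .triggering(
                    AfterWatermark.pastEndOfWindow()
                        .withLateFirings(AfterPane.elementCountAtLeast(1)))
                .withAllowedLateness(Duration.standardMinutes(5))
                .accumulatingFiredPanes());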
An aggregate function that can be executed as part of a SQL query.
Wrapper
Combine.CombineFns for aggregation function calls.Builds a MongoDB AggregateIterable object.
AmqpIO supports AMQP 1.0 protocol using the Apache QPid Proton-J library.
A
PTransform to read/receive messages using AMQP 1.0 protocol.A
PTransform to send messages using AMQP 1.0 protocol.A coder for AMQP message.
A
CoderProviderRegistrar for standard types used with AmqpIO.A
PTransform using the Cloud AI Natural language processing capability.ApiIOError is a data class for storing details about an error.Options that allow setting the application name.
PTransforms for estimating the number of distinct elements in a PCollection, or the number of distinct values associated with each key in a PCollection of KVs.
PTransform for estimating the number of distinct elements in a PCollection.
PTransforms for computing the approximate number of distinct elements in a stream.
Implements the Combine.CombineFn of ApproximateDistinct transforms.
Implementation of ApproximateDistinct.globally().
Coder for HyperLogLogPlus class.
Implementation of ApproximateDistinct.perKey().
PTransforms for getting an idea of a PCollection's data distribution using approximate N-tiles (e.g.
The ApproximateQuantilesCombineFn combiner gives an idea of the distribution of a collection of values using approximate N-tiles.
Deprecated.
CombineFn that computes an estimate of the number of distinct values that were combined.
A heap utility class to efficiently track the largest added elements.
PTransform for estimating the number of distinct elements in a PCollection.
PTransform for estimating the number of distinct values associated with each key in a PCollection of KVs.
Converts Arrow schema to Beam row schema.
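A minimal sketch of the ApproximateQuantiles transform mentioned above (the input PCollection measurements is an assumption for illustration):

    // Requesting 5 quantiles yields the minimum, the 25th/50th/75th percentiles, and the maximum.
    PCollection<List<Integer>> quartiles =
        measurements.apply(ApproximateQuantiles.<Integer>globally(5));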
An
ArtifactRetrievalService that uses FileSystems as its backing storage.A pairing of a newly created artifact type and an output stream that will be readable at that
type.
Provides a concrete location to which artifacts can be staged on retrieval.
PTransform for serializing objects to JSON Strings.
Jet Processor implementation for Beam's Windowing primitive.
Assign Windows function.
Assign Window translator.
Async handler that automatically retries unprocessed records in case of a partial success.
Statistics on the batch request.
Asynchronously compute the earliest partition watermark and stores it in memory.
A Coder that serializes and deserializes the AttributeValue objects.
Enables users to specify their own `JMS` backlog reporters, enabling JmsIO to report UnboundedSource.UnboundedReader.getTotalBacklogBytes().
A SchemaProvider for AutoValue classes.
FieldValueTypeSupplier that's based on AutoValue getters.
Utilities for managing AutoValue schemas.
A Coder using Avro binary format.
Create DatumReader and DatumWriter for given schemas.
Specialized AvroDatumFactory for GenericRecord.
Specialized AvroDatumFactory for Java classes transforming to Avro through reflection.
Specialized AvroDatumFactory for SpecificRecord.
AvroCoder specialisation for GenericRecord, needed for cross-language transforms.
Coder registrar for AvroGenericCoder.
Coder translator for AvroGenericCoder.
Utility methods for converting Avro GenericRecord objects to dynamic protocol messages, for use with the Storage write API.
PTransforms for reading and writing Avro files.
Deprecated. See AvroIO.parseAllGenericRecords(SerializableFunction) for details.
Implementation of AvroIO.read(java.lang.Class<T>) and AvroIO.readGenericRecords(org.apache.avro.Schema).
Deprecated. See AvroIO.readAll(Class) for details.
Implementation of AvroIO.readFiles(java.lang.Class<T>).
Deprecated. Users can achieve the same by providing this transform in a ParDo before using AvroIO.write(Class).
Implementation of AvroIO.write(java.lang.Class<T>).
This class is used as the default return value of AvroIO.write(java.lang.Class<T>).
Avro 1.8 ships with joda time conversions only.
Avro 1.8 & 1.9 ship joda time conversions.
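A minimal AvroIO sketch (the Avro schema object, the pipeline, and the file paths are assumptions for illustration):

    // Read GenericRecords matching a file pattern, then write them back out as Avro files.
    PCollection<GenericRecord> records =
        pipeline.apply(AvroIO.readGenericRecords(schema).from("gs://my-bucket/input/*.avro"));
    records.apply(
        AvroIO.writeGenericRecords(schema).to("gs://my-bucket/output/part").withSuffix(".avro"));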
A
SchemaProvider for AVRO generated SpecificRecords and POJOs.An implementation of
SchemaIOProvider for reading and writing Avro files with AvroIO.A
FileBasedSink for Avro files.Do not use in pipelines directly: most users should use
AvroIO.Read.A
BlockBasedSource.BlockBasedReader for reading blocks from Avro files.TableProvider for AvroIO for consumption by Beam SQL.Utils to convert AVRO records to Beam rows.
Wrapper for fixed byte fields.
A
FileWriteSchemaTransformFormatProvider for avro format.Builder factory for AWS
SdkPojo to avoid using reflection to instantiate a builder.A Jackson
Module that registers a JsonSerializer and JsonDeserializer for
AwsCredentialsProvider and some subclasses.Options used to configure Amazon Web Services specific options such as credentials and region.
Attempt to load default region.
Return
DefaultCredentialsProvider as default provider.A registrar containing the default AWS options.
Schema provider for AWS
SdkPojo models using the provided field metadata (@see SdkPojo.sdkFields()) rather than reflection.Utilities for working with AWS Serializables.
AutoService registrar for the AzureBlobStoreFileSystem.A Jackson
Module that registers a JsonSerializer and JsonDeserializer for
Azure credential providers.Attempts to load Azure credentials.
A registrar containing the default Azure options.
An adapter for converting between Apache Beam and Google API client representations of backoffs.
A
ReadableState cell containing a bag of values.Basic implementation of
BeamSqlTable.A factory for creating
JcsmpSessionService instances.A class that manages REST calls to the Solace Element Management Protocol (SEMP) using basic
authentication.
A factory for creating
BasicAuthSempClient instances.Class for Batch, Sink and Stream CDAP wrapper classes that use it to provide common details.
StateRequestHandler that uses a BatchSideInputHandlerFactory.SideInputGetter to access side inputs.Returns the value for the side input with the given PCollection id from the runner.
Class for creating context object of different CDAP classes with batch sink type.
Class for creating context object of different CDAP classes with batch source type.
PTransformOverrideFactories that expands to correctly implement
stateful ParDo using window-unaware BatchViewOverrides.GroupByKeyAndSortValuesOnly to linearize
processing per key.A key-preserving
DoFn that explodes an iterable that has been grouped by key and
window.Batch TransformTranslator interface.
This rule is essentially a wrapper around Calcite's AggregateProjectMergeRule.
BeamRelNode to replace an Aggregate node.
Rule to detect the window/trigger settings.
Aggregation rule that doesn't include projection.
This is a shell TSet environment which is used as a central driver model to fit what Beam expects.
The Twister2 worker that will execute the job logic once the job is submitted from the run
method.
Built-in aggregations functions for COUNT/MAX/MIN/SUM/AVG/VAR_POP/VAR_SAMP.
Built-in Analytic Functions for the aggregation analytics functionality.
BeamBuiltinFunctionClass interface.
Adapter from
BeamSqlTable to a calcite Table.Planner rule to merge a
BeamCalcRel with a BeamCalcRel.BeamRelNode to replace
Project and Filter node.WrappedList translates
List on access.WrappedMap translates
Map on access.WrappedRow translates
Row on access.A
RelOptRule that converts a LogicalCalc into a chain of AbstractBeamCalcRel nodes via CalcRelSplitter.A
BeamJoinRel which does CoGBK JoinRule to convert
LogicalJoin node to BeamCoGBKJoinRel node.VolcanoCost represents the cost of a plan node.Implementation of
RelOptCostFactory that creates
BeamCostModels.BeamRelNode to replace a
Enumerable node.An adapter class that allows one to apply Apache Beam PTransforms directly to Flink DataSets.
An adapter class that allows one to apply Apache Beam PTransforms directly to Flink DataStreams.
A gRPC multiplexer for a specific
Endpoints.ApiServiceDescriptor.An outbound data buffering aggregator with size-based buffer and time-based buffer if
corresponding options are set.
A Beam BoundedSource for Impulse Source.
BeamRelNode to replace an Intersect node.
ConverterRule to replace Intersect with BeamIntersectRel.
BeamRelNode to replace a TableModify node.
BeamRelNode to replace a TableScan node.
Customized data type in Beam.
This is very similar to JoinAssociateRule.
This is essentially the same as JoinPushThroughJoinRule.
An abstract
BeamRelNode to implement Join Rels.Collections of
PTransform and DoFn used to perform JOIN operation.Transform to execute Join as Lookup.
A Kafka topic that saves records in CSV format.
BeamKafkaTable represents a Kafka topic, as source or target.
Convention for Beam SQL.
BeamRelNode to replace a Match node.
ConverterRule to replace Match with BeamMatchRel.
BeamRelNode to replace a Minus node.
ConverterRule to replace Minus with BeamMinusRel.
BeamPCollectionTable converts a PCollection<Row> into a virtual table, so that a downstream query can query it directly.
Customized data type in Beam.
A
RelNode that can also give a PTransform that implements the expression.Utility methods for converting Beam
Row objects to dynamic protocol message, for use with
the Storage write API.RuleSet used in BeamQueryPlanner.Delegate for Set operators:
BeamUnionRel, BeamIntersectRel and
BeamMinusRel.Set operator type.
Collections of
PTransform and DoFn used to perform Set operations.Transform a
BeamSqlRow to a KV<BeamSqlRow, BeamSqlRow>.Filter function used for Set operators.
A
BeamJoinRel which does sideinput JoinRule to convert
LogicalJoin node to BeamSideInputJoinRel node.A
BeamJoinRel which does Lookup JoinRule to convert
LogicalJoin node to BeamSideInputLookupJoinRel node.BeamRelNode to replace a Sort node.ConverterRule to replace Sort with BeamSortRel.BeamSqlCli provides methods to execute Beam SQL with an interactive client.Example pipeline that uses Google Cloud Data Catalog to retrieve the table metadata.
Pipeline options to specify the query and the output for the example.
Contains the metadata of tables/UDF functions, and exposes APIs to
query/validate/optimize/translate SQL statements.
BeamSqlEnv's Builder.
A test PTransform to display output in console.
Options used to configure BeamSQL.
AutoService registrar for BeamSqlPipelineOptions.Utilities for
BeamRelNode.A seekable table converts a JOIN operator to an inline lookup.
This interface defines a Beam Sql Table.
This interface defines Beam SQL Table Filter.
Interface to create a UDF in Beam SQL.
Custom StoppableFunction for backward compatibility.
BeamRelNode to replace TableFunctionScan.
This is the converter rule that converts a Calcite TableFunctionScan to Beam TableFunctionScanRel.
This class stores row count statistics.
Utility methods for working with
BeamTable.BeamRelNode to implement an uncorrelated Uncollect, aka UNNEST.BeamRelNode to replace a Union.BeamRelNode to replace a Values node.ConverterRule to replace Values with BeamValuesRel.BeamRelNode to replace a Window node.A Fn Status service which can collect run-time status information from SDK harnesses for
debugging purposes.
A
BigDecimalCoder encodes a BigDecimal as an integer scale encoded with VarIntCoder and a BigInteger encoded using BigIntegerCoder.Provides converters from
BigDecimal to other numeric types based on the input Schema.TypeName.A
BigEndianIntegerCoder encodes Integers in 4 bytes, big-endian.A
BigEndianLongCoder encodes Longs in 8 bytes, big-endian.A
BigEndianShortCoder encodes Shorts in 2 bytes, big-endian.A
BigIntegerCoder encodes a BigInteger as a byte array containing the big endian
two's-complement representation, encoded via ByteArrayCoder.
A wrapper class for making BigQuery API calls.
A
CoderProviderRegistrar for standard types used with BigQueryIO.An implementation of
TypedSchemaTransformProvider for BigQuery Storage Read API jobs
configured via BigQueryDirectReadSchemaTransformProvider.BigQueryDirectReadSchemaTransformConfiguration.A
SchemaTransform for BigQuery Storage Read API, configured with BigQueryDirectReadSchemaTransformProvider.BigQueryDirectReadSchemaTransformConfiguration and instantiated by BigQueryDirectReadSchemaTransformProvider.Configuration for reading from BigQuery with Storage Read API.
Configuration for reading from BigQuery.
An implementation of
TypedSchemaTransformProvider for BigQuery read jobs configured using
BigQueryExportReadSchemaTransformConfiguration.An implementation of
SchemaTransform for BigQuery read jobs configured using BigQueryExportReadSchemaTransformConfiguration.An implementation of
TypedSchemaTransformProvider for BigQuery write jobs configured
using BigQueryWriteConfiguration.A set of helper functions and classes used by
BigQueryIO.Model definition for BigQueryInsertError.
A
Coder that encodes BigQuery BigQueryInsertError objects.PTransforms for reading and writing BigQuery tables.Implementation of
BigQueryIO.read().Implementation of
BigQueryIO.read(SerializableFunction).Determines the method used to read data from BigQuery.
An enumeration type for the priority of a query.
Implementation of
BigQueryIO.write().An enumeration type for the BigQuery create disposition strings.
Determines the method used to insert data in BigQuery.
An enumeration type for the BigQuery schema update options strings.
An enumeration type for the BigQuery write disposition strings.
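A hedged sketch of reading and writing with BigQueryIO using the dispositions listed above (the table names and tableSchema are assumptions for illustration):

    PCollection<TableRow> rows =
        pipeline.apply(BigQueryIO.readTableRows().from("my-project:my_dataset.source_table"));
    rows.apply(
        BigQueryIO.writeTableRows()
            .to("my-project:my_dataset.target_table")
            .withSchema(tableSchema)
            // Create the table if missing and append to it on each run.
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));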
A matcher to verify data in BigQuery by processing given query and comparing with content's
checksum.
Properties needed when using Google BigQuery with the Apache Beam SDK.
An implementation of
SchemaIOProvider for reading and writing to BigQuery with BigQueryIO.Exception to signal that BigQuery schema retrieval failed.
An interface for real, mock, or fake implementations of Cloud BigQuery services.
Container for reading data from streaming endpoints.
An interface to get, create and delete Cloud BigQuery datasets and tables.
An interface for the Cloud BigQuery load service.
An interface representing a client object for making calls to the BigQuery Storage API.
An interface for appending records to a Storage API write stream.
An interface to get, create and flush Cloud BigQuery STORAGE API write streams.
An implementation of
BigQueryServices that actually communicates with the cloud BigQuery
service.
Helper class to create per-worker metrics for BigQuery Sink stages.
A
Source representing reading from a table.An implementation of
TypedSchemaTransformProvider for BigQuery Storage Write API jobs
configured via BigQueryWriteConfiguration.A
SchemaTransform for BigQuery Storage Write API, configured with BigQueryWriteConfiguration and instantiated by BigQueryStorageWriteApiSchemaTransformProvider.BigQuery table provider.
Utility methods for BigQuery related operations.
Options for how to convert BigQuery data to Beam data.
Builder for
BigQueryUtils.ConversionOptions.Controls whether to truncate timestamps to millisecond precision lossily, or to crash when
truncation would result.
Options for how to convert BigQuery schemas to Beam schemas.
Builder for
BigQueryUtils.SchemaConversionOptions.Configuration for writing to BigQuery with SchemaTransforms.
Builder for
BigQueryWriteConfiguration.A BigQuery Write SchemaTransformProvider that routes to either
BigQueryFileLoadsSchemaTransformProvider or BigQueryStorageWriteApiSchemaTransformProvider.This is probably a temporary solution to what is a bigger migration from
cloud-bigtable-client-core to java-bigtable.
Override the configuration of Cloud Bigtable data and admin client.
Configuration for a Cloud Bigtable client.
Transforms for reading from and writing to Google Cloud Bigtable.
Overwrite options to determine what to do if the change stream name is being reused and metadata of the same change stream name already exists.
A
PTransform that reads from Google Cloud Bigtable.A
PTransform that writes to Google Cloud Bigtable.A
PTransform that writes to Google Cloud Bigtable and emits a BigtableWriteResult for each batch written.An implementation of
TypedSchemaTransformProvider for Bigtable Read jobs configured via
BigtableReadSchemaTransformProvider.BigtableReadSchemaTransformConfiguration.Configuration for reading from Bigtable.
The result of writing a batch of rows to Bigtable.
A coder for
BigtableWriteResult.An implementation of
TypedSchemaTransformProvider for Bigtable Write jobs configured via
BigtableWriteSchemaTransformProvider.BigtableWriteSchemaTransformConfiguration.Configuration for writing to Bigtable.
Coder for
BitSet.Construct BlobServiceClientBuilder from Azure pipeline options.
Options used to configure Microsoft Azure Blob Storage.
A
BlockBasedSource is a FileBasedSource where a file consists of blocks of
records.A
Block represents a block of records that can be read.A
Reader that reads records from a BlockBasedSource.Holds an RDD or values for deferred conversion to an RDD if needed.
PTransform that reads a bounded amount of data from an UnboundedSource, specified
as one or both of a maximum number of elements or a maximum period of time to read.A
Source that reads a finite amount of input and, because of that, supports some
additional operations.A
Reader that reads a bounded amount of input and supports some additional operations,
such as progress estimation and dynamic work rebalancing.Jet
Processor implementation for reading from a bounded Beam
source.Internal: For internal use only and not for public consumption.
Implementation of
BoundedTrie.Internal: For internal use only and not for public consumption.
A
BoundedWindow represents window information assigned to data elements.An interface for elements buffered during a checkpoint when using @RequiresStableInput.
Sorter that will use in memory sorting until the values can't fit into memory and will
then fall back to external sorting.Contains configuration for the sorter.
A
DoFnRunner which buffers data for supporting DoFn.RequiresStableInput.A thread safe
StreamObserver which uses a bounded queue to pass elements to a processing
thread responsible for interacting with the underlying CallStreamObserver.Hash Functions.
BuiltinStringFunctions.
TrigonometricFunctions.
An immutable collection of elements which are part of a
PCollection.A handler which is invoked when the SDK returns
BeamFnApi.DelayedBundleApplications as
part of the bundle completion.Utility methods for creating
BundleCheckpointHandlers.A
BundleCheckpointHandler which uses TimerInternals.TimerData and ValueState to reschedule BeamFnApi.DelayedBundleApplication.A handler for the runner when a finalization request has been received.
Utility methods for creating
BundleFinalizationHandlers.A handler for bundle progress messages, both during bundle execution and on its completion.
A handler which is invoked whenever an active bundle is split.
Serializable byte array.
A
Coder for byte[].
Given a Java type, returns the Java type expected for use with Row.
Takes a
StackManipulation that returns a value.
Row is going to call the setter with its internal Java type; however, the user object being set might have a different type internally.
A naming strategy for ByteBuddy classes.
A class representing a key consisting of an array of bytes.
A class representing a range of
ByteKeys.An estimator to provide an estimate on the byte throughput of the outputted elements.
An estimator to provide an estimate on the throughput of the outputted elements.
A duplicate of
ByteStringCoder that uses the Apache Beam vendored protobuf.A
Coder for ByteString objects based on their encoded Protocol Buffer form.Benchmarks for
ByteStringOutputStream.These benchmarks below provide good details as to the cost of creating a new buffer vs copying
a subset of the existing one and re-using the larger one.
Helper functions to evaluate the completeness of collection of ByteStringRanges.
ByteToWindow function.
ByteToWindow function.
ByteToWindow function.
Transforms for reading and writing request/response associations to a cache.
A simple POJO that holds both cache read and write
PTransforms.SideInputReader that caches results for costly
Materializations.SideInputReader that caches materialized views.A wrapper around a
Factory that assumes the schema parameter never changes.Abstract wrapper for
CalciteConnection to simplify extension.Wrapper for
CalciteFactory.The core component to handle through a SQL statement, from explain execution plan, to generate a
Beam pipeline.
Utility methods for Calcite related operations.
A LogicalType corresponding to TIME_WITH_LOCAL_TIME_ZONE.
CalcRelSplitter operates on a
Calc with multiple RexCall sub-expressions that
cannot all be implemented by a single concrete RelNode.Type of relational expression.
A collection of
WindowFns that windows values into calendar-based windows such as spans
of days, months, or years.A
WindowFn that windows elements into periods measured by days.A
WindowFn that windows elements into periods measured by months.A
WindowFn that windows elements into periods measured by years.
Caller interfaces user custom code intended for API calls.
Informs whether a call to an API should back off.
A simplified
ThreadSafe blocking queue that can be cancelled freeing any blocked Threads and preventing future Threads from blocking.The exception thrown when a
CoderRegistry or CoderProvider cannot provide a
Coder that has been requested.Indicates the reason that
Coder inference failed.An IO to read and write from/to Apache Cassandra
Specify the mutation type: either write or delete.
A
PTransform to read data from Apache Cassandra.A
PTransform to read data from Apache Cassandra.A
PTransform to mutate into Apache Cassandra.Set of utilities for casting rows between schemas.
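A sketch of the CassandraIO read path (the Person entity class, which is assumed to be a Serializable POJO mapped to the table, plus the hosts and keyspace, are assumptions for illustration):

    PCollection<Person> persons =
        pipeline.apply(
            CassandraIO.<Person>read()
                .withHosts(Arrays.asList("cassandra-host-1", "cassandra-host-2"))
                .withPort(9042)
                .withKeyspace("beam_ks")
                .withTable("person")
                .withEntity(Person.class)
                .withCoder(SerializableCoder.of(Person.class)));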
Describes compatibility errors during casting.
Narrowing changes type without guarantee to preserve data.
Interface for statically validating casts.
Widening changes to type that can represent any possible value of the original type.
Represents a named and configurable container for managing tables.
Top-level authority that manages
Catalogs.A Calcite
Schema that corresponds to a CatalogManager.Over-arching registrar to capture available
Catalogs.A Calcite
Schema that corresponds to a Catalog.A
CdapIO is a Transform for reading data from source or writing data to sink of a Cdap
Plugin.A
PTransform to read from CDAP source.A
PTransform to write to CDAP sink.A
CEPCall instance represents an operation (node) that contains an operator and a list of
operands.A
CEPFieldRef instance represents a node that points to a specified field in a
Row.CEPKind corresponds to Calcite's SqlKind.CEPLiteral represents a literal node.The
CEPMeasure class represents the Measures clause and contains information about output
columns.CEPOperation is the base class for the evaluation operations defined in the
DEFINE syntax of MATCH_RECOGNIZE.The
CEPOperator records the operators (i.e.Core pattern class that stores the definition of a single pattern.
Some utility methods for transforming Calcite's constructs into our own Beam constructs (for
serialization purpose).
This class is responsible for processing individual ChangeStreamRecord.
Data access object to list and read stream partitions of a table.
Responsible for making change stream queries for a given partition.
Class to aggregate metrics related functionality.
Class to aggregate metrics related functionality.
Represents a Spanner Change Stream Record.
Holds internal execution metrics / metadata for the processed
ChangeStreamRecord.Decorator class over a
ResultSet that provides telemetry for the streamed records.Represents telemetry metadata gathered during the consumption of a change stream query.
Single place for defining the constants used in the
Spanner.readChangeStreams()
connector.Checkpoint data to make it available in future pipeline runs.
Checkpoint dir tree.
Helpers for reporting checkpoint durations.
A child partition represents a new partition that should be queried.
Represents a ChildPartitionsRecord.
This class is part of the process for
ReadChangeStreamPartitionDoFn SDF.Encoder for TIME and DATETIME values, according to civil_time encoding.
A read-only
FileSystem implementation looking up resources using a ClassLoader.AutoService registrar for the ClassLoaderFileSystem.An IO to write to ClickHouse.
A
PTransform to write to ClickHouse.Writes Rows and field values using
ClickHousePipedOutputStream.Factory to build and configure any
AwsClientBuilder using a specific ClientConfiguration or the globally provided settings in AwsOptions as fallback.Default implementation of
ClientBuilderFactory.Trust provider to skip certificate verification.
AWS client configuration.
Access to the current time.
A receiver of streamed data that can be closed.
An
AutoCloseable that wraps a resource that needs to be cleaned up but does not implement
AutoCloseable itself.An exception that wraps errors thrown while a resource is being closed.
A function that knows how to clean up after a resource.
A
ThrowingConsumer that can be closed.A representation of an arbitrary Java object to be instantiated by Dataflow workers.
Utilities for converting an object to a
CloudObject.A translator that takes an object and creates a
CloudObject which can be converted back
to the original object.A class providing transforms between Cloud Pub/Sub and Pub/Sub Lite message types.
Properties needed when using Google CloudResourceManager with the Apache Beam SDK.
Factory class for implementations of
AnnotateImages.Accepts
ByteString (encoded image contents) with optional DoFn.SideInput with a Map of ImageContext to
the image.Accepts
String (image URI on GCS) with optional DoFn.SideInput with a Map of ImageContext to
the image.A
Sink for Spark's
metric system reporting metrics (including Beam step metrics) to a CSV file.A
Sink for Spark's
metric system reporting metrics (including Beam step metrics) to Graphite.A
Coder<T> defines how to encode and decode values of type T into
byte streams.Deprecated.
To implement a coder, do not use any
Coder.Context.Exception thrown by
Coder.verifyDeterministic() if the encoding is not deterministic,
including details of why the encoding is not deterministic.Coder authors have the ability to automatically have their Coder registered with
the Dataflow Runner by creating a ServiceLoader entry and a concrete implementation of
this interface.An
Exception thrown if there is a problem encoding or decoding a value.Serialization utility class.
Serialization utility class.
A function for converting a byte array pair to a key-value pair.
Properties for use in
Coder tests.An
ElementByteSizeObserver that records the observed element sizes for testing
purposes.A
CoderProvider provides Coders.Coder creators have the ability to automatically have their coders
registered with this SDK by creating a ServiceLoader entry and a concrete implementation
of this interface.Static utility methods for creating and working with
CoderProviders.This class is used to estimate the size in bytes of a given element.
Flink
TypeInformation for Beam Coders.Flink
TypeSerializer for Beam Coders.A row result of a
CoGroupByKey.A
Coder for CoGbkResults.A schema for the results of a
CoGroupByKey.A transform that performs equijoins across multiple schema
PCollections.Defines the set of fields to extract for the join key, as well as other per-input join options.
A
PTransform that calculates the cross-product join.The implementing PTransform.
A
PTransform that performs a CoGroupByKey on a tuple of tables.Defines a column type from a Cloud Spanner table with the following information: column name,
column type, flag indicating if column is primary key and column position in the table.
PTransforms for combining PCollection elements globally and per-key.A
CombineFn that uses a subclass of Combine.AccumulatingCombineFn.Accumulator as its
accumulator type.The type of mutable accumulator values used by this
AccumulatingCombineFn.An abstract subclass of
Combine.CombineFn for implementing combiners that are more easily and
efficiently expressed as binary operations on doubles.An abstract subclass of
Combine.CombineFn for implementing combiners that are more easily
expressed as binary operations.An abstract subclass of
Combine.CombineFn for implementing combiners that are more easily and
efficiently expressed as binary operations on ints.
An abstract subclass of
Combine.CombineFn for implementing combiners that are more easily and
efficiently expressed as binary operations on longs.A
CombineFn<InputT, AccumT, OutputT> specifies how to combine a collection of input
values of type InputT into a single output value of type OutputT.Combine.Globally<InputT, OutputT> takes a PCollection<InputT> and returns a
PCollection<OutputT> whose elements are the result of combining all the elements in
each window of the input PCollection, using a specified CombineFn<InputT, AccumT, OutputT>.Combine.GloballyAsSingletonView<InputT, OutputT> takes a PCollection<InputT>
and returns a PCollectionView<OutputT> whose elements are the result of combining all
the elements in each window of the input PCollection, using a specified CombineFn<InputT, AccumT, OutputT>.GroupedValues<K, InputT, OutputT> takes a PCollection<KV<K, Iterable<InputT>>>,
such as the result of GroupByKey, applies a specified CombineFn<InputT, AccumT, OutputT> to each of the input KV<K, Iterable<InputT>>
elements to produce a combined output KV<K, OutputT> element, and returns a
PCollection<KV<K, OutputT>> containing all the combined output elements.
Holds a single value of type
V which may or may not be present.PerKey<K, InputT, OutputT> takes a PCollection<KV<K, InputT>>, groups it by
key, applies a combining function to the InputT values associated with each key to
produce a combined OutputT value, and returns a PCollection<KV<K, OutputT>>
representing a map from each distinct key of the input PCollection to the corresponding
combined value.Like
Combine.PerKey, but sharding the combining of hot keys.Deprecated.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
Static utility methods that create combine function instances.
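A minimal sketch of per-key and global combining with a built-in CombineFn (the input PCollection perUserCounts is an assumption for illustration):

    // Sum the Integer values associated with each key.
    PCollection<KV<String, Integer>> totals =
        perUserCounts.apply(Combine.perKey(Sum.ofIntegers()));
    // Drop the keys and sum every value into a single result per window.
    PCollection<Integer> grandTotal =
        perUserCounts.apply(Values.create()).apply(Combine.globally(Sum.ofIntegers()));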
A tuple of outputs produced by a composed combine functions.
A builder class to construct a composed
CombineFnBase.GlobalCombineFn.A composed
Combine.CombineFn that applies multiple CombineFns.A composed
CombineWithContext.CombineFnWithContext that applies multiple CombineFnWithContexts.Utilities for testing
CombineFns.This class contains combine functions that have access to
PipelineOptions and side inputs
through CombineWithContext.Context.A combine function that has access to
PipelineOptions and side inputs through
CombineWithContext.Context.Information accessible to all methods in
CombineFnWithContext and
KeyedCombineFnWithContext.An internal interface for signaling that a
GloballyCombineFn or a
PerKeyCombineFn needs to access CombineWithContext.Context.A
ReadableState cell defined by a Combine.CombineFn, accepting multiple input values,
combining them as specified into accumulators, and producing a single output value.A Source that reads from compressed files.
Reader for a
CompressedSource.Deprecated.
Use
Compression insteadFactory interface for creating channels that decompress the content of an underlying channel.
Various compression types for reading/writing files.
Class for building
PluginConfig object of the specific class.
DeserializerProvider that uses Confluent Schema Registry to resolve a
Deserializers and Coder given a subject.Enumeration of debezium connectors.
Print to console.
Write to console.
PTransform writing PCollection to the console.Pair of a bit of user code (a "closure") and the
Requirements needed to run it.A function from an input to an output that may additionally access
Contextful.Fn.Context when
computing the result.An accessor for additional capabilities available in
Contextful.Fn.apply(InputT, org.apache.beam.sdk.transforms.Contextful.Fn.Context).PTransforms that read text files and collect contextual information of the elements in
the input.Implementation of
ContextualTextIO.read().Implementation of
ContextualTextIO.readFiles().A range of contiguous event sequences and the latest timestamp of the events in the range.
A pool of control clients that brokers incoming SDK harness connections (in the form of
InstructionRequestHandlers.A sink for
InstructionRequestHandlers keyed by worker id.A source of
InstructionRequestHandlers.A set of utilities for converting between different objects supporting schemas.
Helper functions for converting between equivalent schema types.
Return value after converting a schema.
A
BoundedSource reading from Cosmos.
Create a Cosmos client from the pipeline options.
PTransforms to count the elements in a PCollection.
A metric that reports a single long value and can be incremented or decremented.
Implementation of Counter.
Returns the count of TRUE values for expression.
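For reference, counting with the Count transforms looks roughly like this (the input PCollection words is an assumption for illustration):

    // Per-element counts, e.g. word counts.
    PCollection<KV<String, Long>> wordCounts = words.apply(Count.perElement());
    // A single total count of all elements per window.
    PCollection<Long> totalWords = words.apply(Count.globally());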
Pipeline visitors that fills a lookup table of
PValue to number of consumers.Most users should use
GenerateSequence instead.The checkpoint for an unbounded
CountingSource is simply the last value produced.A custom coder for
CounterMark.Combine.CombineFn for Covariance on Number types.A
PipelineRunner that applies no overrides and throws an exception on calls to Pipeline.run().
Create<T> takes a collection of elements of type T known when the pipeline is constructed and returns a PCollection<T> containing the elements.
A PTransform that creates a PCollection whose elements have associated timestamps.
A PTransform that creates a PCollection from a set of in-memory objects.
A PTransform that creates a PCollection whose elements have associated windowing metadata.
A DataflowRunner marker class for creating a PCollectionView.
Enum containing all supported dispositions for table.
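A minimal sketch of Create, typically used to seed a pipeline with in-memory test data (the pipeline object is an assumption for illustration):

    PCollection<String> colors =
        pipeline.apply(Create.of("red", "green", "blue").withCoder(StringUtf8Coder.of()));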
An abstract class that contains common configuration options for creating resources.
An abstract builder for
CreateOptions.A standard configuration options with builder.
Builder for
CreateOptions.StandardCreateOptions.Create an input stream from Queue.
Spark streaming overrides for various view (side input) transforms.
Creates a primitive
PCollectionView.Creates any tables needed before performing streaming writes to the tables.
Construct an oauth credential to be used by the SDK and the SDK workers.
Parameters abstract class to expose the transforms to an external SDK.
PTransforms for reading and writing CSV files.PTransform for writing CSV files.PTransform for Parsing CSV Record Strings into Schema-mapped target types.CsvIOParseError is a data class to store errors from CSV record processing.A
Sink for Spark's
metric system reporting metrics (including Beam step metrics) to a CSV file.A
FileWriteSchemaTransformFormatProvider for CSV format.An implementation of
TypedSchemaTransformProvider for CsvIO.write(java.lang.String, org.apache.commons.csv.CSVFormat).Configuration for writing to BigQuery with Storage Write API.
Builder for
CsvWriteTransformProvider.CsvWriteConfiguration.An abstract base class that implements all methods of
Coder except Coder.encode(T, java.io.OutputStream)
and Coder.decode(java.io.InputStream).Describes a customer.
An optional component to use with the
RetryHttpRequestInitializer in order to provide
custom errors for failing http calls.A Builder which allows building immutable CustomHttpErrors object.
A simple Tuple class for creating a list of HttpResponseMatcher and HttpResponseCustomError to
print for the responses.
A helper class for supporting sources defined as
Source.Interface that table providers can implement if they require custom table name resolution.
A policy for custom record timestamps where timestamps within a partition are expected to be
roughly monotonically increasing with a cap on out of order event delays (say 1 minute).
A Custom X509TrustManager that trusts a user provided CA and default CA's.
Utility class for wiring up Jet DAGs based on Beam pipelines.
Listener that can be registered with a
DAGBuilder in order to be notified when edges
are being registered.Factory class to create data access objects to perform change stream queries and access the
metadata tables.
Pipeline options for Data Catalog table provider.
Uses DataCatalog to get the source type and schema for a table.
A data change record encodes modifications to Cloud Spanner rows.
This class is part of the process for
ReadChangeStreamPartitionDoFn SDF.Wrapper around the generated
Dataflow client to provide common functionality.Specialized implementation of
GroupByKey for translating Redistribute transform into
Dataflow service protos.Registers
DataflowGroupByKey.DataflowGroupByKeyTranslator.An exception that is thrown if the unique job name constraint of the Dataflow service is broken
because an existing job with the same job name is currently active.
An exception that is thrown if the existing job has already been updated within the Dataflow
service and is no longer able to be updated.
A
RuntimeException that contains information about a DataflowPipelineJob.Internal.
Returns the default Dataflow client built from the passed in PipelineOptions.
Creates a
Stager object using the class specified in DataflowPipelineDebugOptions.getStagerClass().Sets Integer value based on old, deprecated field (
DataflowPipelineDebugOptions.getUnboundedReaderMaxReadTimeSec()).A DataflowPipelineJob represents a job submitted to Dataflow using
DataflowRunner.Options that can be used to configure the
DataflowRunner.Set of available Flexible Resource Scheduling goals.
Returns a default staging location under
GcpOptions.getGcpTempLocation().Register the
DataflowPipelineOptions.Register the
DataflowRunner.DataflowPipelineTranslator knows how to translate Pipeline objects into Cloud
Dataflow Service API Jobs.The result of a job translation.
Options that are used to configure the Dataflow pipeline worker pool.
Type of autoscaling algorithm to use.
Options for controlling profiling of pipeline execution.
Configuration the for profiling agent.
A
PipelineRunner that executes the operations in the pipeline by first translating them
to the Dataflow representation using the DataflowPipelineTranslator and then submitting
them to a Dataflow service for execution.A marker
DoFn for writing the contents of a PCollection to a streaming PCollectionView backend implementation.An instance of this class can be passed to the
DataflowRunner to add user defined hooks
to be invoked at various times during pipeline execution.Populates versioning and other information for
DataflowRunner.Signals there was an error retrieving information about a job from the Cloud Dataflow Service.
[Internal] Options for configuring StreamingDataflowWorker.
EnableStreamingEngine defaults to false unless one of the experiments is set.
Read global get config request period from system property
'windmill.global_config_refresh_period'.
Read counter reporting period from system property 'windmill.harness_update_reporting_period'.
Factory for creating local Windmill address.
Read 'MaxStackTraceToReport' from system property 'windmill.max_stack_trace_to_report' or
Integer.MAX_VALUE if unspecified.
Read 'PeriodicStatusPageOutputDirector' from system property
'windmill.periodic_status_page_directory' or null if unspecified.
Factory for setting value of WindmillServiceStreamingRpcBatchLimit based on environment.
A
DataflowPipelineJob that is returned when --templateRunner is set.Helpers for cloud communication.
Options that are used exclusively within the Dataflow worker harness.
Deprecated.
This interface will no longer be the source of truth for worker logging configuration
once jobs are executed using a dedicated SDK harness instead of user code being co-located
alongside Dataflow worker code.
The set of log levels that can be used on the Dataflow worker.
Defines a log level override for a specific class, package, or name.
Wrapper for invoking external Python
DataframeTransform.The main PTransform that encapsulates the data generation logic.
A stateful DoFn that converts a sequence of Longs into structured Rows.
Represents a 'datagen' table within a Beam SQL pipeline.
The service entry point for the 'datagen' table type.
Wrapper for
DataInputView.Wrapper for
DataOutputView.Holder for Spark RDD/DStream.
DatastoreIO provides an API for reading from and writing to Google Cloud Datastore over different
versions of the Cloud Datastore Client libraries.DatastoreV1 provides an API to Read, Write and Delete PCollections of
Google Cloud Datastore version v1 Entity objects.A
PTransform that deletes Entities from Cloud Datastore.A
PTransform that deletes Entities from Cloud Datastore and returns
DatastoreV1.WriteSuccessSummary for each successful write.A
PTransform that deletes Entities associated with the given Keys from Cloud Datastore and returns DatastoreV1.WriteSuccessSummary for each successful delete.A
PTransform that reads the result rows of a Cloud Datastore query as Entity
objects.A
PTransform that writes Entity objects to Cloud Datastore.Summary object produced when a number of writes are successfully written to Datastore in a
single Mutation.
A
PTransform that writes Entity objects to Cloud Datastore and returns DatastoreV1.WriteSuccessSummary for each successful write.An implementation of
SchemaIOProvider for reading and writing payloads with DatastoreIO.An abstraction to create schema aware IOs.
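A hedged sketch of the DatastoreV1 read and write path (the project id and the query object, a com.google.datastore.v1.Query, are assumptions for illustration):

    // Read the entities matched by the query, then write them back to Cloud Datastore.
    PCollection<Entity> entities =
        pipeline.apply(DatastoreIO.v1().read().withProjectId("my-project").withQuery(query));
    entities.apply(DatastoreIO.v1().write().withProjectId("my-project"));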
TableProvider for DatastoreIO for consumption by Beam SQL.DataStreams.DataStreamDecoder treats multiple ByteStrings as a single input stream decoding
values with the supplied iterator.An adapter which converts an
InputStream to a PrefetchableIterator of T
values using the specified Coder.An adapter which wraps an
DataStreams.OutputChunkConsumer as an OutputStream.A callback which is invoked whenever the
DataStreams.outbound(org.apache.beam.sdk.fn.stream.DataStreams.OutputChunkConsumer<org.apache.beam.vendor.grpc.v1p69p0.com.google.protobuf.ByteString>) OutputStream becomes full.A date without a time-zone.
A datetime without a time-zone.
Utility class which exposes an implementation
DebeziumIO.read() and a Debezium configuration.A POJO describing a Debezium configuration.
Implementation of
DebeziumIO.read().A schema-aware transform provider for
DebeziumIO.Exposes
DebeziumIO.Read as an external transform for cross-language usage.A receiver of encoded data, decoding it and passing it onto a downstream consumer.
Remove values with duplicate ids.
A set of
PTransforms which deduplicate input records over a time domain and threshold.Deduplicates keyed values using the key over a specified time domain and threshold.
Deduplicates values over a specified time domain and threshold.
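A minimal sketch of value deduplication over processing time (the input PCollection events and the 10-minute threshold are assumptions for illustration):

    PCollection<String> deduped =
        events.apply(
            Deduplicate.<String>values()
                .withTimeDomain(TimeDomain.PROCESSING_TIME)
                .withDuration(Duration.standardMinutes(10)));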
A
PTransform that uses a SerializableFunction to obtain a representative value
for each input element used for deduplication.Default represents a set of annotations that can be used to annotate getter properties on
PipelineOptions with information representing the default value to be returned if no
value is specified.This represents that the default of the option is the specified boolean primitive value.
This represents that the default of the option is the specified byte primitive value.
This represents that the default of the option is the specified char primitive value.
This represents that the default of the option is the specified
Class value.This represents that the default of the option is the specified double primitive value.
This represents that the default of the option is the specified enum.
This represents that the default of the option is the specified float primitive value.
Value must be of type
DefaultValueFactory and have a default constructor.This represents that the default of the option is the specified int primitive value.
This represents that the default of the option is the specified long primitive value.
This represents that the default of the option is the specified short primitive value.
This represents that the default of the option is the specified
String value.Default implementation of
AutoScaler.Construct BlobServiceClientBuilder with given values of Azure client properties.
The
DefaultCoder annotation specifies a Coder class to handle encoding and
decoding instances of the annotated class.A
CoderProviderRegistrar that registers a CoderProvider which can use the
@DefaultCoder annotation to provide coder providers that creates
Coders.A
CoderProvider that uses the @DefaultCoder annotation to provide coder providers that create Coders.The
CoderCloudObjectTranslatorRegistrar containing the default collection of Coder Cloud Object Translators.Implementation of a
ExecutableStageContext.A default
FileBasedSink.FilenamePolicy for windowed and unwindowed files.Encapsulates constructor parameters to
DefaultFilenamePolicy.A Coder for
DefaultFilenamePolicy.Params.Factory for a default value for Google Cloud region according to
https://cloud.google.com/compute/docs/gcloud-compute/#default-properties.
The default way to construct a
GoogleAdsClient.A
JobBundleFactory for which the implementation can specify a custom EnvironmentFactory for environment management.A container for EnvironmentFactory and its corresponding Grpc servers.
Holder for an
SdkHarnessClient along with its associated state and data servers.A
PipelineOptionsRegistrar containing the PipelineOptions subclasses available by
default.Construct S3ClientBuilder with default values of S3 client properties like path style access,
accelerated mode, etc.
Registers the "s3" uri schema to be handled by
S3FileSystem.The
DefaultSchema annotation specifies a SchemaProvider class to handle obtaining
a schema and row for the specified class.SchemaProvider for default schemas.Registrar for default schemas.
Default global sequence combiner.
This default implementation of
BeamSqlTableFilter interface.A trigger that is equivalent to
Repeatedly.forever(AfterWatermark.pastEndOfWindow()).An interface used with the
Default.InstanceFactory annotation to specify the class that
will be an instance factory to produce default values for a given getter on PipelineOptions.A
DelegateCoder<T, IntermediateT> wraps a Coder for IntermediateT and
encodes/decodes values of type T by converting to/from IntermediateT and then
encoding/decoding using the underlying Coder<IntermediateT>.A
CodingFunction<InputT, OutputT> is a serializable
function from InputT to OutputT that may throw any Exception.Implementation of
Counter that delegates to the instance for the current context.Implementation of
Distribution that delegates to the instance for the current context.Implementation of
Gauge that delegates to the instance for the current context.Implementation of
Histogram that delegates to the instance for the current context.Descriptions are used to generate human readable output when the
--help command is
specified.Provides a configured
Deserializer instance and its associated Coder.This class processes
DetectNewPartitionsDoFn.This class is responsible for scheduling partitions.
A SplittableDoFn (SDF) that is responsible for scheduling partitions to be queried.
This restriction tracker delegates most of its behavior to an internal
TimestampRangeTracker.Metadata of the progress of
DetectNewPartitionsDoFn from the metadata
table.The DicomIO connectors allows Beam pipelines to make calls to the Dicom API of the Google Cloud
Healthcare API (https://cloud.google.com/healthcare/docs/how-tos#dicom-guide).
This class makes a call to the retrieve metadata endpoint
(https://cloud.google.com/healthcare/docs/how-tos/dicomweb#retrieving_metadata).
Options that can be used to configure the
DirectRunner.A
DefaultValueFactory that returns the result of Runtime.availableProcessors()
from the DirectOptions.AvailableParallelismFactory.create(PipelineOptions) method.Registers the
DirectOptions.Registers the
DirectRunner.The result of running a
Pipeline with the DirectRunner.A
StreamObserver which uses synchronization on the underlying CallStreamObserver
to provide thread safety.Internal-only options for tweaking the behavior of the
PipelineOptions.DirectRunner in ways that users
should never do.Static display data associated with a pipeline component.
Utility to build up display data from a component and its included subcomponents.
Unique identifier for a display data item within a component.
Items are the unit of display data.Specifies an
DisplayData.Item to register as display data.Structured path of registered display data within a component hierarchy.
Display data type.
Distinct<T> takes a PCollection<T> and returns a PCollection<T> that has all distinct elements of the input.
A Distinct PTransform that uses a SerializableFunction to obtain a representative value for each input element.
A metric that reports information about the distribution of reported values.
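A minimal Distinct sketch (the input PCollection words is an assumption for illustration):

    // Keep one occurrence of each distinct element.
    PCollection<String> distinctWords = words.apply(Distinct.create());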
Implementation of
Distribution.The result of a
Distribution metric.A
PTransform connecting to Cloud DLP (https://cloud.google.com/dlp/docs/libraries) and
deidentifying text according to provided settings.A
PTransform connecting to Cloud DLP (https://cloud.google.com/dlp/docs/libraries) and
inspecting text for identifying data according to provided settings.A
PTransform connecting to Cloud DLP (https://cloud.google.com/dlp/docs/libraries) and
inspecting text for identifying data according to provided settings.An
EnvironmentFactory that creates docker containers by shelling out to docker.Provider for DockerEnvironmentFactory.
The argument to
ParDo providing the code to use to process elements of the input PCollection.Annotation for declaring that a state parameter is always fetched.
Annotation on a splittable
DoFn
specifying that the DoFn performs a bounded amount of work per input element, so
applying it to a bounded PCollection will produce also a bounded PCollection.A parameter that is accessible during
@StartBundle, @ProcessElement and @FinishBundle that allows the caller
to register a callback that will be invoked after the bundle has been successfully completed
and the runner has committed the output.
An instance of a function that will be invoked after bundle finalization.
Parameter annotation for the input element for
DoFn.ProcessElement, DoFn.GetInitialRestriction, DoFn.GetSize, DoFn.SplitRestriction, DoFn.GetInitialWatermarkEstimatorState, DoFn.NewWatermarkEstimator, and DoFn.NewTracker
methods.Annotation for specifying specific fields that are accessed in a Schema PCollection.
Annotation for the method to use to finish processing a batch of elements.
Annotation for the method that maps an element to an initial restriction for a splittable
DoFn.Annotation for the method that maps an element and restriction to initial watermark estimator
state for a splittable
DoFn.Annotation for the method that returns the coder to use for the restriction of a splittable
DoFn.Annotation for the method that returns the corresponding size for an element and restriction
pair.
Annotation for the method that returns the coder to use for the watermark estimator state of a
splittable
DoFn.Parameter annotation for dereferencing input element key in
KV pair.Receives tagged output for a multi-output function.
Annotation for the method that creates a new
RestrictionTracker for the restriction of
a splittable DoFn.Annotation for the method that creates a new
WatermarkEstimator for the watermark state
of a splittable DoFn.Annotation for registering a callback for a timer.
Annotation for registering a callback for a timerFamily.
Annotation for the method to use for performing actions on window expiration.
Receives values of the given type.
When used as a return value of
DoFn.ProcessElement, indicates whether there is more work to
be done for the current element.Annotation for the method to use for processing elements.
Annotation that may be added to a
DoFn.ProcessElement, DoFn.OnTimer, or DoFn.OnWindowExpiration method to indicate that the runner must ensure that the observable contents
of the input PCollection or mutable state must be stable upon retries.Annotation that may be added to a
DoFn.ProcessElement method to indicate that the runner
must ensure that the observable contents of the input PCollection is sorted by time, in
ascending order.Parameter annotation for the restriction for
DoFn.GetSize, DoFn.SplitRestriction, DoFn.GetInitialWatermarkEstimatorState, DoFn.NewWatermarkEstimator, and DoFn.NewTracker
methods.Annotation for the method to use to prepare an instance for processing bundles of elements.
Parameter annotation for the SideInput for a
DoFn.ProcessElement method.Annotation for the method that splits restriction of a splittable
DoFn into multiple parts to
be processed in parallel.Annotation for the method to use to prepare an instance for processing a batch of elements.
Annotation for declaring and dereferencing state cells.
Annotation for the method to use to clean up this instance before it is discarded.
Parameter annotation for the TimerMap for a
DoFn.ProcessElement method.Annotation for declaring and dereferencing timers.
Parameter annotation for the input element timestamp for
DoFn.ProcessElement, DoFn.GetInitialRestriction, DoFn.GetSize, DoFn.SplitRestriction, DoFn.GetInitialWatermarkEstimatorState, DoFn.NewWatermarkEstimator, and DoFn.NewTracker
methods.Annotation for the method that truncates the restriction of a splittable
DoFn into a bounded one.Annotation on a splittable
DoFn
specifying that the DoFn performs an unbounded amount of work per input element, so
applying it to a bounded PCollection will produce an unbounded PCollection.Parameter annotation for the watermark estimator state for the
DoFn.NewWatermarkEstimator
method.DoFn function.
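A small sketch of a DoFn using the @ProcessElement and @Element annotations described above (the class name, the input PCollection lines, and the splitting logic are illustrative assumptions):

    static class ExtractWordsFn extends DoFn<String, String> {
      @ProcessElement
      public void processElement(@Element String line, OutputReceiver<String> out) {
        // Emit each whitespace-separated token as a separate output element.
        for (String word : line.split("\\s+")) {
          if (!word.isEmpty()) {
            out.output(word);
          }
        }
      }
    }

    PCollection<String> words = lines.apply(ParDo.of(new ExtractWordsFn()));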
Flink operator for executing
DoFns.A
WindowedValueReceiver that can buffer its outputs.Implementation of
DoFnOperator.OutputManagerFactory that creates an DoFnOperator.BufferedOutputManager
that can write to multiple logical outputs by Flink side output.Common
DoFn.OutputReceiver and DoFn.MultiOutputReceiver classes.DoFnRunner decorator which registers
MetricsContainerImpl.DoFnRunner decorator which registers MetricsContainerImpl.Represents information about how a DoFn extracts schemas.
The builder object.
Deprecated.
Use
TestPipeline with the DirectRunner.Deprecated.
Use
TestPipeline with the DirectRunner.A
DoubleCoder encodes Double values in 8 bytes using Java serialization.A transform to drop fields from a schema.
Implementation class for DropFields.
A specialization of
FileBasedSink.DynamicDestinations for AvroIO.This class provides the most general way of specifying dynamic BigQuery table destinations.
Some helper classes that derive from
FileBasedSink.DynamicDestinations.A
Coder using Google Protocol Buffers binary format.IO to read from and write to DynamoDB tables.
Read data from DynamoDB using
DynamoDBIO.Read.getScanRequestFn() and emit an element of type DynamoDBIO.Read
for each ScanResponse using the mapping function DynamoDBIO.Read.getScanResponseMapperFn().Write a PCollection data into DynamoDB.
Transforms for reading and writing data from/to Elasticsearch.
A
BoundedSource reading from Elasticsearch.A
PTransform writing Bulk API entities created by ElasticsearchIO.DocToBulk to
an Elasticsearch cluster.A POJO describing a connection configuration to Elasticsearch.
A
PTransform converting docs to their Bulk API counterparts.A
PTransform reading data from Elasticsearch.A POJO encapsulating a configuration for retry behavior when issuing requests to ES.
A
PTransform writing data to Elasticsearch.Manipulates test data used by the
ElasticsearchIO integration tests.Pipeline options for elasticsearch tests.
Map to tuple function.
An
EnvironmentFactory that communicates to a FnHarness which is executing in the
same process.Provider of EmbeddedEnvironmentFactory.
Passing null values to Spark's Java API may cause problems because of Guava preconditions.
Options for allowing or disallowing filepatterns that match no resources in
FileSystems.match(java.util.List<java.lang.String>).A wrapper around a
Throwable for use with coders.An encoded
BoundedWindow used within Runners to track window information without needing
to decode the window.Flink
TypeComparator for Beam values that have been
encoded to byte data by a Coder.TypeSerializer for values that were encoded using a Coder.Flink
TypeInformation for Beam values that have been encoded to byte data by a Coder.Encoders utility class.Encoder / expression utils that are called from generated code.
Represents an error during encoding (serializing) a class.
Represents an error during encoding (serializing) a class.
This
Schema.LogicalType represent an enumeration over a fixed set of values.This class represents a single enum value.
Creates
environments which communicate to an SdkHarnessClient.Provider for a
EnvironmentFactory and ServerFactory for the environment.ErrorContainer interface.
An Error Handler is a utility object used for plumbing error PCollections to a configured sink.
Error Handlers must be closed before a pipeline is run to properly pipe error collections to the
sink, and the pipeline will be rejected if any handlers aren't closed.
A default, placeholder error handler that exists to allow usage of .addErrorCollection()
without effects.
The
EvaluationContext is the result of a pipeline translation and can be used to evaluate / run the pipeline.The EvaluationContext allows us to define pipeline instructions and translate between
PObject<T>s or PCollection<T>s and Ts or DStreams/RDDs of Ts.Classes extending this interface will be called by
OrderedEventProcessor to examine every
incoming event.The interface that enables querying of a graph of independently executable stages and the inputs
and outputs of those stages.
The context required in order to execute
stages.Creates
ExecutableStageContext instances.This operator is the streaming equivalent of the
FlinkExecutableStageFunction.Drives the execution of a
Pipeline by scheduling work.The state of the driver.
Options for configuring the
ScheduledExecutorService used throughout the Java runtime.Returns the default
ScheduledExecutorService to use within the Apache Beam SDK.A
gRPC Server for an ExpansionService.A service that allows a pipeline to expand transforms from a remote SDK.
A registrar that creates
TransformProvider instances from RunnerApi.FunctionSpecs.Exposes Java transforms via
ExternalTransformRegistrar.Options used to configure the
ExpansionService.Loads the ExpansionService config.
Loads the allow list from
ExpansionServiceOptions.getJavaClassLookupAllowlistFile(), defaulting to an empty
JavaClassLookupTransformProvider.AllowList.Apache Beam provides a number of experimental features that can be enabled with this flag.
An
EnvironmentFactory which requests workers via the given URL in the Environment.Provider of ExternalEnvironmentFactory.
Exposes
PubsubIO.Read as an external transform for cross-language usage.Parameters class to expose the transform to an external SDK.
Does an external sort of the provided values.
ExternalSorter.Options contains configuration of the sorter.Sorter type.
Provides mechanism for acquiring locks related to the job.
An interface for building a transform from an externally provided configuration.
A registrar which contains a mapping from URNs to available
ExternalTransformBuilders.Exposes
PubsubIO.Write as an external transform for cross-language usage.Parameters class to expose the transform to an external SDK.
A Factory interface for schema-related objects for a specific Java type.
Alternative implementation of
PipelineResult used to avoid throwing Exceptions in certain
situations.An immutable tuple of value, timestamp, window, and pane.
A coder for
FailsafeValueInSingleWindow.A generic failure of an SQL transform.
Class FailureCollectorWrapper is a class for collecting ValidationFailure.
A fake implementation of BigQuery's query service.
An implementation of
BigQueryServices.BigQueryServerStream which takes a List as the Iterable to simulate a server stream.A fake dataset service that can be serialized, for use in testReadFromTable.
A fake implementation of BigQuery's job service.
FhirBundleParameter represents a FHIR bundle in JSON format to be executed on a FHIR store.
FhirIO provides an API for reading and writing resources to Google Cloud Healthcare Fhir API.Deidentify FHIR resources from a FHIR store to a destination FHIR store.
A function that schedules a deidentify operation and monitors the status.
The type Execute bundles.
ExecuteBundlesResult contains both successfully executed bundles and information to help debug
failed executions (e.g. metadata and error messages).
Export FHIR resources from a FHIR store to new line delimited json files on GCS or BigQuery.
A function that schedules an export operation and monitors the status.
Writes each bundle of elements to a new-line delimited JSON file on GCS and issues a
fhirStores.import Request for that file.
The enum Content structure.
The type Read.
The type Result.
The type Search.
The type Write.
The type Result.
The enum Write method.
The type FhirIOPatientEverything for querying a FHIR Patient resource's compartment.
PatientEverythingParameter defines required attributes for a FHIR GetPatientEverything request
in
FhirIOPatientEverything.The Result for a
FhirIOPatientEverything request.FhirSearchParameter represents the query parameters for a FHIR search request, used as a
parameter for
FhirIO.Search.FhirSearchParameterCoder is the coder for
FhirSearchParameter, which takes a coder for
type T.Used inside of a
DoFn to describe which fields in a schema
type need to be accessed for processing.Description of a single field.
Builder class.
Qualifier for a list selector.
Qualifier for a map selector.
OneOf union for a collection selector.
The kind of qualifier.
Parser for textual field-access selector.
This class provides an empty implementation of
FieldSpecifierNotationListener,
which can be extended to create a listener which only needs to handle a subset
of the available methods.This class provides an empty implementation of
FieldSpecifierNotationVisitor,
which can be extended to create a visitor which only needs to handle a subset
of the available methods.This interface defines a complete listener for a parse tree produced by
FieldSpecifierNotationParser.This interface defines a complete generic visitor for a parse tree produced
by
FieldSpecifierNotationParser.Utilities for converting between
Schema field types and TypeDescriptors that
define Java objects which can represent these field types.For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
Represents type information for a Java type that will be used to infer a Schema type.
A naming policy for schema fields.
Abstract class for file-based output.
Deprecated.
use
Compression.A class that allows value-dependent writes in
FileBasedSink.A naming policy for output files.
Result of a single bundle write.
A coder for
FileBasedSink.FileResult objects.Provides hints about how to generate output files, such as a suggested filename suffix (e.g.
Implementations create instances of
WritableByteChannel used by FileBasedSink
and related classes to allow decorating, or otherwise transforming, the raw data that
would normally be written directly to the WritableByteChannel passed into FileBasedSink.WritableByteChannelFactory.create(WritableByteChannel).Abstract operation that manages the process of writing to
FileBasedSink.Abstract writer that writes a bundle to a
FileBasedSink.A common base class for all file-based
Sources.A
reader that implements code common to readers of
FileBasedSources.A given
FileBasedSource represents a file resource of one of these types.Matcher to verify checksum of the contents of an
ShardedFile in E2E test.General-purpose transforms for working with files: listing files (matching), reading and writing.
Implementation of
FileIO.match().Implementation of
FileIO.matchAll().Describes configuration for matching filepatterns, such as
EmptyMatchTreatment and
continuous watching for matching files.A utility class for accessing a potentially compressed file.
Implementation of
FileIO.readMatches().Enum to control how directories are handled.
Specifies how to write elements to individual files in
FileIO.write() and FileIO.writeDynamic().Implementation of
FileIO.write() and FileIO.writeDynamic().A policy for generating names for shard files.
Interface that provides a
PTransform that reads in a PCollection of FileIO.ReadableFiles and outputs the data represented as a PCollection of Rows.Flink
metrics reporter for writing
metrics to a file specified via the "metrics.reporter.file.path" config key (assuming an alias of
"file" for this reporter in the "metrics.reporters" setting).File staging related options.
File system interface in Beam.
A registrar that creates
FileSystem instances from PipelineOptions.Clients facing
FileSystem utility.The configuration for building file writing transforms using
SchemaTransform and SchemaTransformProvider.Configures extra details related to writing CSV formatted files.
Configures extra details related to writing Parquet formatted files.
Configures extra details related to writing XML formatted files.
Provides a
PTransform that writes a PCollection of Rows and outputs a
PCollection of the file names according to a registered AutoService FileWriteSchemaTransformFormatProvider
implementation.FileWriteSchemaTransformFormatProviders contains FileWriteSchemaTransformFormatProvider implementations.A
TypedSchemaTransformProvider implementation for writing a Row PCollection to file systems, driven by a FileWriteSchemaTransformConfiguration.Fill gaps in timeseries.
Argument to withInterpolateFunction function.
A
PTransform for filtering a collection of schema types.PTransforms for filtering from a PCollection the elements satisfying a predicate,
or satisfying an inequality with a given value based on the elements' natural ordering.Implementation of the filter.
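A minimal sketch of the predicate and natural-ordering styles of filtering described above, assuming an input PCollection<Integer> named numbers:
    // Keep only even numbers via a predicate.
    PCollection<Integer> evens = numbers.apply(Filter.by((Integer n) -> n % 2 == 0));
    // Keep only numbers greater than 100 via the natural-ordering helper.
    PCollection<Integer> large = numbers.apply(Filter.greaterThan(100));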
Utilities that convert between a SQL filter expression and an Iceberg
Expression.Builds a MongoDB FindQuery object.
FirestoreIO provides an API for reading from and writing to Google Cloud
Firestore.FirestoreV1 provides an API which provides lifecycle managed PTransforms for Cloud Firestore
v1 API.Concrete class representing a
PTransform<PCollection<BatchGetDocumentsRequest>, PTransform<BatchGetDocumentsResponse>> which will read from Firestore.A type safe builder for
FirestoreV1.BatchGetDocuments allowing configuration and instantiation.Concrete class representing a
PTransform<PCollection<Write>, PCollection<FirestoreV1.WriteFailure
> which will write to Firestore.A type safe builder for
FirestoreV1.BatchWriteWithDeadLetterQueue allowing configuration and
instantiation.A type safe builder for
FirestoreV1.BatchWriteWithSummary allowing configuration and
instantiation.Exception that is thrown if one or more
Writes is unsuccessful
with a non-retryable status code.Concrete class representing a
PTransform<PCollection<ListCollectionIdsRequest>, PTransform<ListCollectionIdsResponse>> which will read from Firestore.A type safe builder for
FirestoreV1.ListCollectionIds allowing configuration and instantiation.Concrete class representing a
PTransform<PCollection<ListDocumentsRequest>, PTransform<ListDocumentsResponse
>> which will read from Firestore.A type safe builder for
FirestoreV1.ListDocuments allowing configuration and instantiation.Concrete class representing a
PTransform<PCollection<PartitionQueryRequest>, PTransform<RunQueryRequest>>
which will read from Firestore.A type safe builder for
FirestoreV1.PartitionQuery allowing configuration and instantiation.Type safe builder factory for read operations.
Concrete class representing a
PTransform<PCollection<RunQueryRequest>, PTransform<RunQueryResponse>> which
will read from Firestore.A type safe builder for
FirestoreV1.RunQuery allowing configuration and instantiation.Type safe builder factory for write operations.
Failure details for an attempted
Write.Summary object produced when a number of writes are successfully written to Firestore in a
single BatchWrite.
A LogicalType representing a fixed-length byte array.
Fixed precision numeric types used to represent jdbc NUMERIC and DECIMAL types.
A LogicalType representing a fixed-length string.
A
WindowFn that windows values into fixed-size timestamp-based windows.PTransforms for mapping a simple function that returns iterables over the elements of a
PCollection and merging the results.A
PTransform that adds exception handling to FlatMapElements.Flatten<T> takes multiple PCollection<T>s bundled into a
PCollectionList<T> and returns a single PCollection<T> containing all the elements in
all the input PCollections.FlattenIterables<T> takes a PCollection<Iterable<T>> and returns a
PCollection<T> that contains all the elements from each iterable.A
PTransform that flattens a PCollectionList into a PCollection
containing all the elements of all the PCollections in its input.Jet
Processor implementation for Beam's Flatten primitive.Jet
Processor supplier that will provide instances of FlattenP.An implementation of
TypedSchemaTransformProvider for Flatten.Flatten translator.
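To make the Flatten entries above concrete, a minimal sketch assuming three existing PCollection<String>s pc1, pc2 and pc3:
    // Bundle several PCollections into a PCollectionList, then flatten them into one PCollection.
    PCollectionList<String> all = PCollectionList.of(pc1).and(pc2).and(pc3);
    PCollection<String> merged = all.apply(Flatten.<String>pCollections());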
Category tag for tests that use a
Flatten where the input PCollectionList
contains PCollections with heterogeneous coders.Flink
FlatMapFunction for implementing Window.Assign.A translator that translates bounded portable pipelines into executable Flink pipelines.
Batch translation context.
Predicate to determine whether a URN is a Flink native transform.
Transform translation interface.
A Flink
Source implementation that wraps a
Beam BoundedSource.A Flink
SourceReader implementation
that reads from the assigned FlinkSourceSplits by using Beam BoundedReaders.StateInternals that uses a Flink OperatorStateBackend to manage the broadcast
state.Result of a detached execution of a
Pipeline with Flink.Encapsulates a
DoFn inside a Flink RichMapPartitionFunction.Singleton class that contains one
ExecutableStageContext.Factory per job.Flink operator that passes its input DataSet through an SDK-executed
ExecutableStage.A Flink function that demultiplexes output from a
FlinkExecutableStageFunction.Utilities for Flink execution environments.
Explode
WindowedValue that belongs to multiple windows into multiple "single window"
values, so we can safely group elements by (K, W) tuples.A map function that outputs the input element without any change.
Job Invoker for the
FlinkRunner.Driver program that starts a job server for the Flink runner.
Flink runner-specific Configuration for the jobServer.
Utility functions for dealing with key encoding.
Special version of
FlinkReduceFunction that supports merging windows.Helper class for holding a
MetricsContainerImpl and forwarding Beam metrics to Flink
accumulators and metrics.The base helper class for holding a
MetricsContainerImpl and forwarding
Beam metrics to Flink accumulators and metrics.Entry point for starting an embedded Flink cluster.
A
FlatMapFunction function that filters out those elements that don't belong in this
output.Reduce function for non-merging GBK implementation.
A
StepContext for Flink Batch Runner execution.This is the first step for executing a
Combine.PerKey on
Flink.Options which can be used to configure the Flink Runner.
Maximum bundle size factory.
Maximum bundle time factory.
Runs a Pipeline on Flink via
FlinkRunner.Flink job entry point to launch a Beam pipeline by executing an external SDK driver program.
Interface for portable Flink translators.
A handle used to execute a translated pipeline.
The context used for pipeline translation.
Result of executing a portable
Pipeline with Flink.Various utilities related to portability.
This is the second part for executing a
Combine.PerKey on
Flink, the second part is FlinkReduceFunction.A
PipelineRunner that executes the operations in the pipeline by first translating them
to a Flink Plan and then executing them either locally or on a Flink cluster, depending on the
configuration.AutoService registrar - will register FlinkRunner and FlinkOptions as possible pipeline runner
services.
Pipeline options registrar.
Pipeline runner registrar.
Result of executing a
Pipeline with Flink.A
SideInputReader for the Flink Batch Runner.The base class for
FlinkBoundedSource and FlinkUnboundedSource.An abstract implementation of
SourceReader which encapsulates Beam Sources
for data reading.A Flink
SourceSplit implementation that encapsulates a Beam Source.Constructs a StateBackend to use from flink pipeline options.
A
RichGroupReduceFunction for stateful ParDo in Flink Batch Runner.StateInternals that uses a Flink KeyedStateBackend to manage state.Eagerly create user state to work around https://jira.apache.org/jira/browse/FLINK-12653.
Serializer configuration snapshot for compatibility and format evolution.
Translate an unbounded portable pipeline representation into a Flink pipeline representation.
Predicate to determine whether a URN is a Flink native transform.
Streaming translation context.
A Flink
Source implementation that wraps a
Beam UnboundedSource.A Flink
SourceReader implementation
that reads from the assigned FlinkSourceSplits by using Beam UnboundedReaders.A
FloatCoder encodes Float values in 4 bytes using Java serialization.A client for the control plane of an SDK harness, which can issue requests to it over the Fn API.
A Fn API control service which adds incoming SDK harness connections to a sink.
A receiver of streamed data.
The
FnDataService is able to forward inbound elements to a consumer and is also a
consumer of outbound elements.An interface sharing common behavior with services used during execution of user Fns.
A
ClientResponseObserver which delegates all StreamObserver calls.Base class for table providers that look up table metadata using full table names, instead of
querying it by parts of the name separately.
A metric that reports the latest value out of reported values.
Implementation of
Gauge.The result of a
Gauge metric.Empty
GaugeResult, representing no values reported.Construct an oauth credential to be used by the SDK and the SDK workers.
A registrar containing the default GCP options.
Options used to configure Google Cloud Platform specific options such as the project and
credentials.
Attempts to infer the default project based upon the environment this application is executing
within.
EnableStreamingEngine defaults to false unless one of the two experiments is set.
Returns the default set of OAuth scopes.
Returns
PipelineOptions.getTempLocation() as the default GCP temp location.Attempts to load the GCP credentials.
A registrar containing the default GCP options.
This class implements a
SessionServiceFactory that retrieves the basic authentication
credentials from a Google Cloud Secret Manager secret.An abstract class that contains common configuration options for creating resources.
A builder for
GcsCreateOptions.AutoService registrar for the GcsFileSystem.Options used to configure Google Cloud Storage.
Returns the default
ExecutorService to use within the Apache Beam SDK.Creates a
GcsOptions.GcsCustomAuditEntries that holds key-value pairs to be stored as custom information
in GCS audit logs.Creates a
PathValidator object using the class specified in GcsOptions.getPathValidatorClass().Implements the Java NIO
Path API for Google Cloud Storage paths.GCP implementation of
PathValidator.ResourceId implementation for Google Cloud Storage.Utility class for staging files to GCS.
Provides operations on GCS.
This is a
DefaultValueFactory able to create a GcsUtil using any transport
flags specified on the PipelineOptions.A class that holds either a
StorageObject or an IOException.Class to generate first set of outputs for
DetectNewPartitionsDoFn.A
PTransform that produces longs starting from the given value, and either up to the
given limit or until Long.MAX_VALUE / until the given time elapses.Exposes GenerateSequence as an external transform for cross-language usage.
Parameters class to expose the transform to an external SDK.
Sequence generator table provider.
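A minimal sketch of the GenerateSequence transform described above, assuming a pipeline p:
    // Produce the bounded sequence 0..99.
    PCollection<Long> bounded = p.apply(GenerateSequence.from(0).to(100));
    // Produce an unbounded sequence, emitting roughly five elements per second.
    PCollection<Long> unbounded =
        p.apply(GenerateSequence.from(0).withRate(5, Duration.standardSeconds(1)));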
Helper to generate a DLQ transform to write PCollection to an external system.
A Provider for generic DLQ transforms that handle deserialization failures.
Deprecated.
new implementations should extend the
GetterBasedSchemaProviderV2 class'
methods which receive TypeDescriptors instead of ordinary Classes as
arguments, which permits supporting generic type signatures during schema inference.Benchmarks for
GetterBasedSchemaProvider on reading / writing fields based on toRowFunction / fromRowFunction.A newer version of
GetterBasedSchemaProvider, which works with TypeDescriptors,
and which by default delegates the old, Class based methods, to the new ones.A store to hold the global watermarks for a micro-batch.
A
GlobalWatermarkHolder.SparkWatermarks holds the watermarks and batch time relevant to a micro-batch input
from a specific source.Advance the WMs onBatchCompleted event.
The default window into which all data is placed (via
GlobalWindows).GlobalWindow.Coder for encoding and decoding GlobalWindows.A
WindowFn that assigns all data to the same window.A OIDC web identity token provider implementation that uses the application default credentials
set by the runtime (container, GCE instance, local environment, etc.).
Defines how to construct a
GoogleAdsClient.GoogleAdsIO provides an API for reading from the Google Ads API over supported
versions of the Google Ads client libraries.This interface can be used to implement custom client-side rate limiting policies.
Implement this interface to create a
GoogleAdsIO.RateLimitPolicy.Options used to configure Google Ads API specific options.
Attempts to load the Google Ads credentials.
Constructs and returns
Credentials to be used by Google Ads API calls.GoogleAdsV19 provides an API to read Google Ads API v19 reports.A
PTransform that reads the results of a Google Ads query as GoogleAdsRow
objects.A
PTransform that reads the results of many SearchGoogleAdsStreamRequest
objects as GoogleAdsRow objects.This rate limit policy wraps a
RateLimiter and can be used in low volume and
development use cases as a client-side rate limiting policy.These options configure debug settings for Google API clients created within the Apache Beam SDK.
A
GoogleClientRequestInitializer that adds the trace destination to Google API calls.A
Sink for Spark's
metric system reporting metrics (including Beam step metrics) to Graphite.A generic grouping transform for schema
PCollections.a
PTransform that does a combine using an aggregation built up by calls to
aggregateField and aggregateFields.a
PTransform that groups schema elements based on the given fields.a
PTransform that does a per-key combine using an aggregation built up by calls to
aggregateField and aggregateFields.a
PTransform that does a global combine using an aggregation built up by calls to
aggregateField and aggregateFields.a
PTransform that does a global combine using a provided Combine.CombineFn.A
PTransform for doing global aggregations on schema PCollections.A FlatMap function that groups by windows in batch mode using
ReduceFnRunner.A
PTransform that provides a secure alternative to GroupByKey.GroupByKey<K, V> takes a PCollection<KV<K, V>>, groups the values by key and
windows, and returns a PCollection<KV<K, Iterable<V>>> representing a map from each
distinct key and window of the input PCollection to an Iterable over all the
values associated with that key in the input per window.GroupByKey translator.
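A minimal sketch of GroupByKey as described above, assuming an input PCollection<KV<String, Integer>> named scores:
    // Group all values sharing a key (per window) into a single Iterable.
    PCollection<KV<String, Iterable<Integer>>> grouped =
        scores.apply(GroupByKey.<String, Integer>create());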
Traverses the pipeline to populate the candidates for group by key.
GroupBy window function.
A set of group/combine functions to apply to Spark
RDDs.A
ReadableState cell that combines multiple input values and outputs a single value of a
different type.A
PTransform that batches inputs to a desired batch size.Wrapper class for batching parameters supplied by users.
Functions for GroupByKey with Non-Merging windows translations to Spark.
An
OffsetRangeTracker for tracking a growable offset range.Provides the estimated end offset of the range.
A HeaderAccessorProvider which intercepts the header in a gRPC request and exposes the relevant
fields.
A
FnDataService implemented via gRPC.A
gRPC Server which manages a single FnService.An implementation of the Beam Fn Logging Service over gRPC.
An implementation of the Beam Fn State service.
A
DefaultValueFactory which locates a Hadoop Configuration.AutoService registrar for HadoopFileSystemOptions.AutoService registrar for the HadoopFileSystem.A
HadoopFormatIO is a Transform for reading data from any source or writing data to any
sink which implements Hadoop InputFormat or OutputFormat.Bounded source implementation for
HadoopFormatIO.A
PTransform that reads from any data source which implements Hadoop InputFormat.A wrapper to allow Hadoop
InputSplit to be serialized using
Java's standard serialization mechanisms.A
PTransform that writes to any data sink which implements Hadoop OutputFormat.Builder for External Synchronization defining.
Builder for partitioning determining.
Main builder of Write transformation.
Interface for restrictions for which a default implementation of
DoFn.NewTracker is available, depending only on the restriction
itself.Interface for watermark estimator state for which a default implementation of
DoFn.NewWatermarkEstimator is available, depending only on the watermark estimator state itself.Marker interface for
PTransforms and components to specify display data used
within UIs and diagnostic tools.A Flink combine runner that builds a map of merged windows and produces output after seeing all
input.
Interface for any Spark
Receiver that supports
reading from and to some offset.A
CoderProviderRegistrar for standard types used with HBaseIO.A bounded source and sink for HBase.
A
PTransform that reads from HBase.Implementation of
HBaseIO.readAll().A
PTransform that writes to HBase.Transformation that writes RowMutation objects to an HBase table.
Adapter from HCatalog table schema to Beam
Schema.IO to read and write data using HCatalog.
A
PTransform to read data using HCatalog.A
PTransform to write to a HCatalog managed source.Beam SQL table that wraps
HCatalogIO.Utility classes to enable meta store conf/client creation.
Utilities to convert
HCatRecords to Rows.Implementation of
ExternalSynchronization which registers locks in the HDFS.Interface to access headers in the client request.
Defines a client to communicate with the GCP HCLS API (version v1).
Class for capturing errors on IO operations on Google Cloud Healthcare APIs resources.
Convenience transform to write dead-letter
HealthcareIOErrors to BigQuery TableRows.A heartbeat record serves as a notification that the change stream query has returned all changes
for the partition less than or equal to the record timestamp.
This class is part of the process for
ReadChangeStreamPartitionDoFn SDF.Methods and/or interfaces annotated with
@Hidden will be suppressed from being output
when --help is specified on the command-line.A metric that reports information about the histogram of reported values.
HL7v2IO provides an API for reading from and writing to Google Cloud Healthcare HL7v2 API.The type Read that reads HL7v2 message contents given a PCollection of
HL7v2ReadParameter.PTransform to fetch a message from a Google Cloud Healthcare HL7v2 store based on
msgID.DoFn for fetching messages from the HL7v2 store with error handling.
The type Result includes
PCollection of HL7v2ReadResponse objects for
successfully read results and PCollection of HealthcareIOError objects for
failed reads.List HL7v2 messages in HL7v2 Stores with optional filter.
The type Read that reads HL7v2 message contents given a PCollection of message ID strings.
PTransform to fetch a message from a Google Cloud Healthcare HL7v2 store based on
msgID.DoFn for fetching messages from the HL7v2 store with error handling.
The type Result includes
PCollection of HL7v2Message objects for successfully
read results and PCollection of HealthcareIOError objects for failed reads.The type Write that writes the given PCollection of HL7v2 messages.
The enum Write method.
The type HL7v2 message to wrap the
Message model.HL7v2ReadParameter represents the read parameters for a HL7v2 read request, used as the input
type for
HL7v2IO.HL7v2Read.HL7v2ReadResponse represents the response format for a HL7v2 read request, used as the output
type of
HL7v2IO.HL7v2Read.Coder for
HL7v2ReadResponse.PTransforms to compute HyperLogLogPlusPlus (HLL++) sketches on data streams based on the
ZetaSketch implementation.Provides
PTransforms to extract the estimated count of distinct elements (as
Longs) from each HLL++ sketch.Provides
PTransforms to aggregate inputs into HLL++ sketches.Builder for the
HllCount.Init combining PTransform.Provides
PTransforms to merge HLL++ sketches into a new sketch.HTTP client configuration for both sync and async AWS clients.
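A minimal sketch of the HllCount Init/Extract pattern described in the entries above, assuming an input PCollection<String> named ids:
    // Aggregate the inputs into a single HLL++ sketch, then extract the distinct-count estimate.
    PCollection<byte[]> sketch = ids.apply(HllCount.Init.forStrings().globally());
    PCollection<Long> approxDistinct = sketch.apply(HllCount.Extract.globally());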
A client that talks to the Cloud Healthcare API through HTTP requests.
The type FhirResourcePagesIterator for methods which return paged output.
Wraps
HttpResponse in an exception with a statusCode field for use with HealthcareIOError.The type Hl7v2 message id pages iterator.
SchemaTransform implementation for
IcebergIO.readRows(org.apache.beam.sdk.io.iceberg.IcebergCatalogConfig).A connector that reads and writes to Apache Iceberg
tables.
SchemaTransform implementation for
IcebergIO.readRows(org.apache.beam.sdk.io.iceberg.IcebergCatalogConfig).Utilities for converting between Beam and Iceberg types, made public for user's convenience.
SchemaTransform implementation for
IcebergIO.writeRows(org.apache.beam.sdk.io.iceberg.IcebergCatalogConfig).A generator of unique IDs.
Common
IdGenerator implementations.For internal use only; no backwards-compatibility guarantees.
Flink input format that implements impulses.
Jet
Processor implementation for Beam's Impulse primitive.A
SourceFunc which executes the impulse transform contract.Source function which sends a single global impulse to a downstream operator.
Impulse translator.
Exception thrown by
WindowFn.verifyCompatibility(WindowFn) if two compared WindowFns are
not compatible, including the explanation of incompatibility.A
ProcessFunction which is not a functional interface.IO to read and write from InfluxDB.
A POJO describing a DataSourceConfiguration such as URL, userName and password.
A
PTransform to read from InfluxDB metric or data related to query.A
PTransform to write to a InfluxDB datasource.A DoFn responsible to initialize the metadata table and prepare it for managing the state of the
pipeline.
A DoFn responsible for initializing the change stream Connector.
Utility class to determine initial partition constants and methods.
States to initialize a pipeline outputted by
InitializeDoFn.Holds user state in memory.
A InMemoryJobService that prepares and runs jobs on behalf of a client using a
JobInvoker.A
MetaStore which stores the meta info in memory.A
InMemoryMetaTableProvider is an abstract TableProvider for in-memory types.A retry policy for streaming BigQuery inserts.
Contains information about a failed insert.
Kafka
Deserializer for Instant.Kafka
Serializer for Instant.Interface for any function that can handle a Fn API
BeamFnApi.InstructionRequest.Signifies that a publicly accessible API (public class, method or field) is intended for internal
use only and not for public consumption.
An implementation of
BoundedWindow that represents an interval from IntervalWindow.start
(inclusive) to IntervalWindow.end (exclusive).Encodes an
IntervalWindow as a pair of its upper bound and duration.Exception thrown when the configuration for a
SchemaIO is invalid.Exception thrown when the configuration for a
SchemaIO is invalid.Exception thrown when the schema for a
SchemaIO is invalid.Exception thrown when the request for a table is invalid, such as invalid metadata.
IS_INF(X)
An Ism file is a prefix encoded composite key value file broken into shards.
The footer stores the relevant information required to locate the index and bloom filter.
A
Coder for IsmFormat.Footer.A record containing a composite key and either a value or metadata.
A
Coder for IsmFormat.IsmRecords.A shard descriptor containing shard id, the data block offset, and the index offset for the
given shard.
A coder for
IsmFormat.IsmShards.The prefix used before each key which contains the number of shared and unshared bytes from the
previous key that was read.
A
Coder for IsmFormat.KeyPrefix.A coder for metadata key component.
IS_NAN(X)
An abstract base class with functionality for assembling a
Coder for a class that
implements Iterable.A
SchemaProvider for Java Bean objects.FieldValueTypeSupplier that's based on getter methods.FieldValueTypeSupplier that's based on setter methods.A set of utilities to generate getter and setter classes for JavaBean objects.
An implementation of
TypedSchemaTransformProvider for Explode.A
SchemaTransform for Explode.A
SchemaProvider for Java POJO objects.FieldValueTypeSupplier that's based on public fields.An implementation of
TypedSchemaTransformProvider for Filter for the java language.A
SchemaTransform for Filter-java.An implementation of
TypedSchemaTransformProvider for MapToFields for the java language.A
SchemaTransform for MapToFields-java.Loads
UdfProvider implementations from user-provided jars.A coder for JAXB annotated objects.
A class that manages a connection to a Solace broker using basic authentication.
Beam JDBC Connection.
Calcite JDBC driver with Beam defaults.
IO to read and write data on JDBC.
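A minimal sketch of a JdbcIO read, assuming a pipeline p; the driver class, URL and query are placeholders:
    // Read rows from a JDBC source and map each ResultSet row to a KV.
    PCollection<KV<Integer, String>> rows = p.apply(
        JdbcIO.<KV<Integer, String>>read()
            .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
                "org.postgresql.Driver", "jdbc:postgresql://host:5432/mydb"))
            .withQuery("SELECT id, name FROM users")
            .withCoder(KvCoder.of(VarIntCoder.of(), StringUtf8Coder.of()))
            .withRowMapper(rs -> KV.of(rs.getInt("id"), rs.getString("name"))));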
A POJO describing a
DataSource, either providing directly a DataSource or all
properties allowing to create a DataSource.Wraps a
JdbcIO.DataSourceConfiguration to provide a DataSource.This is the default
Predicate we use to detect DeadLock.Wraps a
JdbcIO.DataSourceConfiguration to provide a PoolingDataSource.An interface used by the JdbcIO
JdbcIO.ReadAll and JdbcIO.Write to set the parameters of the
PreparedStatement used to setParameters into the database.Implementation of
JdbcIO.read().Implementation of
JdbcIO.readAll().Implementation of
JdbcIO.readRows().Builder used to help with retry configuration for
JdbcIO.An interface used to control if we retry the statements when a
SQLException occurs.An interface used by
JdbcIO.Read for converting each row of the ResultSet into
an element of the resulting PCollection.An interface used by the JdbcIO Write to set the parameters of the
PreparedStatement
used to setParameters into the database.This class is used as the default return value of
JdbcIO.write().A
PTransform to write to a JDBC datasource.A
PTransform to write to a JDBC datasource.An implementation of
SchemaTransformProvider for
reading from JDBC connections using JdbcIO.A helper for
JdbcIO.ReadWithPartitions that handles range calculations.An implementation of
SchemaIOProvider for reading and writing JSON payloads with JdbcIO.Provides utility functions for working with
JdbcIO.The result of writing a row to JDBC datasource.
An implementation of
SchemaTransformProvider for
writing to a JDBC connections using JdbcIO.Jet specific
MetricResults.Jet specific implementation of
MetricsContainer.Pipeline options specific to the Jet runner.
Jet specific implementation of
PipelineResult.Jet specific implementation of Beam's
PipelineRunner.Registers the
JetPipelineOptions.Registers the
JetRunner.An unbounded source for JMS destinations (queues or topics).
An interface used by
JmsIO.Read for converting each jms Message into an element
of the resulting PCollection.A
PTransform to read from a JMS destination.A
PTransform to write to a JMS queue.JmsRecord contains message payload of the record as well as metadata (JMS headers and
properties).
A factory that has all job-scoped information, and can be combined with stage-scoped information
to create a
StageBundleFactory.A subset of
ProvisionApi.ProvisionInfo that
specifies a unique job, while omitting fields that are not known to the runner operator.Internal representation of a Job which has been invoked (prepared and run) by a client.
Factory to create
JobInvocation instances.A job that has been prepared, but not invoked.
Shared code for starting and serving an
InMemoryJobService.Configuration for the jobServer.
Utility class with different versions of joins.
A transform that performs equijoins across two schema
PCollections.Predicate object to specify fields to compare when doing an equi-join.
Implementation class for FieldsEqual.
PTransform representing a full outer join of two collections of KV elements.
Implementation class .
PTransform representing an inner join of two collections of KV elements.
PTransform representing a left outer join of two collections of KV elements.
PTransform representing a right outer join of two collections of KV elements.
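As an illustration of the KV join helpers listed above, a minimal sketch using the join-library style API, assuming two keyed PCollections users (KV<String, String>) and orders (KV<String, Integer>):
    // Inner-join two keyed collections; each output value pairs the matching left and right values.
    PCollection<KV<String, KV<String, Integer>>> joined = Join.innerJoin(users, orders);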
This is a class to catch the built join and check if it is a legal join before passing it to the
actual RelOptRuleCall.
This is a function that gets the output relation and checks if it is a legal relational node.
PTransforms for reading and writing JSON files.PTransform for writing JSON files.Matcher to compare a string or byte[] representing a JSON Object, independent of field order.
A
FileReadSchemaTransformFormatProvider that reads newline-delimited JSONs.The result of a
JsonToRow.withExceptionReporting(Schema) transform.Utils to convert JSON records to Beam
Row.A
FileWriteSchemaTransformFormatProvider for JSON format.An implementation of
TypedSchemaTransformProvider for JsonIO.write(java.lang.String).Configuration for writing to BigQuery with Storage Write API.
Builder for
JsonWriteTransformProvider.JsonWriteConfiguration.A service interface for defining one-time initialization of the JVM during pipeline execution.
Helpers for executing
JvmInitializer implementations.Checkpoint for a
KafkaUnboundedReader.A tuple to hold topic, partition, and offset that comprise the checkpoint for a single
partition.
A
PTransform that commits offsets of KafkaRecord.An unbounded source and a sink for Kafka topics.
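A minimal sketch of a KafkaIO read, assuming a pipeline p; the bootstrap servers and topic are placeholders:
    // Read String key/value records from Kafka, dropping the Kafka metadata.
    PCollection<KV<String, String>> records = p.apply(
        KafkaIO.<String, String>read()
            .withBootstrapServers("broker-1:9092")
            .withTopic("my-topic")
            .withKeyDeserializer(StringDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            .withoutMetadata());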
A
PTransform to read from Kafka topics.Exposes
KafkaIO.TypedWithoutMetadata as an external transform for cross-language
usage.Parameters class to expose the Read transform to an external SDK.
A
PTransform to read from KafkaSourceDescriptor.A
PTransform to read from Kafka topics.A
PTransform to write to a Kafka topic with KVs .Exposes
KafkaIO.Write as an external transform for cross-language usage.Parameters class to expose the Write transform to an external SDK.
A
PTransform to write to a Kafka topic with ProducerRecord's.Initialize KafkaIO feature flags on worker.
Utility methods for translating
KafkaIO transforms to and from RunnerApi
representations.Common utility functions and default configurations for
KafkaIO.Read and KafkaIO.ReadSourceDescriptors.Stores and exports metrics for a batch of Kafka Client RPCs.
Metrics of a batch of RPCs.
No-op implementation of
KafkaResults.An interface for providing custom timestamp for elements written to Kafka.
Configuration for reading from a Kafka topic.
Builder for the
KafkaReadSchemaTransformConfiguration.KafkaRecord contains key and value of the record as well as metadata for the record (topic name,
partition id, and offset).
Coder for KafkaRecord.Helper class to create per worker metrics for Kafka Sink stages.
Quick Overview
Represents a Kafka source description.
Kafka table provider.
This is a copy of Kafka's
TimestampType.A keyed implementation of a
BufferingElementsHandler.An immutable tuple of keyed
PCollections with key type K.A utility class to help ensure coherence of tag and input PCollection types.
Keys<K> takes a PCollection of KV<K, V>s and returns a
PCollection<K> of the keys.Thrown when the Kinesis client was throttled due to rate limits.
IO to read from Kinesis streams.
Implementation of
KinesisIO.read().Configuration of Kinesis record aggregation.
Implementation of
KinesisIO.write().Result of
KinesisIO.write().PipelineOptions for
KinesisIO.A registrar containing the default
KinesisIOOptions.Kinesis interface for custom partitioner.
An explicit partitioner that always returns a
Nonnull explicit hash key.KinesisClientRecord enhanced with utility methods.Exposes
KinesisIO.Write and KinesisIO.Read as an external transform for cross-language
usage.A bounded source and sink for Kudu.
An interface used by the KuduIO Write to convert an input record into an Operation to apply as
a mutation in Kudu.
Implementation of
KuduIO.read().A
PTransform that writes to Kudu.An immutable key/value pair.
A
Comparator that orders KVs by the natural ordering of their keys.A
Comparator that orders KVs by the natural ordering of their values.A
KvCoder encodes KVs.KvSwap<K, V> takes a PCollection<KV<K, V>> and returns a PCollection<KV<V,
K>>, where all the keys and values have been swapped.KeySelector that retrieves a key from a KV.Util class for building/parsing labeled
MetricName.Builder class for a labeled
MetricName.Category tags for tests which validate that a Beam runner can handle keys up to a given size.
Tests if a runner supports 100KB keys.
Tests if a runner supports 100MB keys.
Tests if a runner supports 10KB keys.
Tests if a runner supports 10MB keys.
Tests if a runner supports 1MB keys.
HttpRequestInitializer for recording request to response latency of Http-based API calls.
Combine.CombineFn that wraps an AggregateFn.A
Coder which is able to take any existing coder and wrap it such that it is only invoked
in the outer context.Utilities for replacing or wrapping unknown coders with
LengthPrefixCoder.Standard collection of metrics used to record source and sinks information for lineage tracking.
Lineage metrics resource types.
A
FileReadSchemaTransformFormatProvider that reads lines as Strings.AutoService registrar for the LocalFileSystem.Helper functions for producing a
ResourceId that references a local file or directory.An implementation of
TypedSchemaTransformProvider for Logging.A
SchemaTransform for logging.A logical endpoint is a pair of an instruction ID corresponding to the
BeamFnApi.ProcessBundleRequest and the transform within the processing graph.A consumer of
Beam Log Entries.Pipeline visitor that fills lookup table of
PTransform to AppliedPTransform for
usage in FlinkBatchPortablePipelineTranslator.BatchTranslationContext.Top-level
PTransforms that build and instantiate turnkey
transforms.A Factory which creates
ManagedChannel instances.A ManagedFactory produces instances and tears down any produced instances when it is itself
closed.
Pipeline options to tune DockerEnvironment.
Register the
ManualDockerEnvironmentOptions.A
WatermarkEstimator which is controlled manually from within a DoFn.A
ControlClientPool backed by a client map.PTransforms for mapping a simple function over the elements of a PCollection.A
PTransform that adds exception handling to MapElements.MapKeys maps a SerializableFunction<K1,K2> over keys of a
PCollection<KV<K1,V>> and returns a PCollection<KV<K2, V>>.This interface allows you to implement a custom mapper to read and persist elements from/to
Cassandra.
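Picking up the MapElements and MapKeys entries a few lines above, a minimal sketch assuming a PCollection<String> words and a PCollection<KV<String, Long>> counts:
    // Map each word to its length.
    PCollection<Integer> lengths = words.apply(
        MapElements.into(TypeDescriptors.integers()).via((String w) -> w.length()));
    // Upper-case the keys of a keyed collection, leaving the values untouched.
    PCollection<KV<String, Long>> upper = counts.apply(
        MapKeys.into(TypeDescriptors.strings()).via((String k) -> k.toUpperCase()));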
Factory class for creating instances that will map a struct to a connector model.
Util class for mapping plugins.
A
ReadableState cell mapping keys to values.Map to tuple function.
MapValues maps a SerializableFunction<V1,V2> over values of a
PCollection<KV<K,V1>> and returns a PCollection<KV<K, V2>>.The result of
FileSystem.match(java.util.List<java.lang.String>).MatchResult.Metadata of a matched file.Builder class for
MatchResult.Metadata.Status of a
MatchResult.For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
Represents the
PrimitiveViewT supplied to the ViewFn when it declares to use
the iterable materialization.Represents the
PrimitiveViewT supplied to the ViewFn when it declares to use
the multimap materialization.PTransforms for computing the maximum of the elements in a PCollection, or the
maximum of the values associated with each key in a PCollection of KVs.PTransforms for computing the arithmetic mean (a.k.a.Options that are used to control the Memory Monitor.
For internal use only; no backwards compatibility guarantees.
Base class for publishing messages to a Solace broker.
Interface for receiving messages from a Solace broker.
A
Coder for MatchResult.Metadata.This class generates a SpannerConfig for the change stream metadata database by copying only the
necessary fields from the SpannerConfig of the primary database.
Data access object for creating and dropping the metadata table.
Data access object for managing the state of the metadata Bigtable table.
Helper methods that simplifies some conversion and extraction of metadata table content.
The interface to handle CRUD of
BeamSql table metadata.Marker interface for all user-facing metrics.
Implements matching for metrics filters.
Metrics are keyed by the step name they are associated with and the name of the metric.
The name of a metric consists of a
MetricName.getNamespace() and a MetricName.getName().The name of a metric.
The results of a query for metrics.
The results of a single current metric.
Methods for interacting with the metrics of a pipeline that has been executed.
Helper for pretty-printing
Flink metrics.The
Metrics is a utility class for producing various kinds of metrics for reporting
properties of an executing pipeline.Accumulator of
MetricsContainerStepMap.For resilience,
Accumulators are required to be wrapped in a Singleton.AccumulatorV2 for Beam metrics captured in MetricsContainerStepMap.Spark Listener which checkpoints
MetricsContainerStepMap values for fault-tolerance.Holds the metrics for a single step.
AccumulatorV2 implementation for MetricsContainerStepMap.Manages and provides the metrics container associated with each thread.
Set the
MetricsContainer for the associated MetricsEnvironment.Simple POJO representing a filter for querying metrics.
Builder for creating a
MetricsFilter.Extension of
PipelineOptions that defines MetricsSink specific options.A
DefaultValueFactory that obtains the class of the NoOpMetricsSink if it
exists on the classpath, and throws an exception otherwise.Interface for all metric sinks.
A
Source that accommodates Spark's micro-batch oriented nature and wraps an UnboundedSource.A timestamp represented as microseconds since the epoch.
PTransforms for computing the minimum of the elements in a PCollection, or the
minimum of the values associated with each key in a PCollection of KVs.Represents a modification in a table emitted within a
DataChangeRecord.Represents the type of modification applied in the
DataChangeRecord.IO to read and write data on MongoDB GridFS.
Encapsulate the MongoDB GridFS connection logic.
Interface for the parser that is used to parse the GridFSFile into the appropriate types.
Callback for the parser to use to submit data.
A
PTransform to read data from MongoDB GridFS.A
BoundedSource for MongoDB GridFS.A
PTransform to write data to MongoDB GridFS.Function that is called to write the data to the give GridFS OutputStream.
IO to read and write data on MongoDB.
A
PTransform to read data from MongoDB.A
PTransform to write to a MongoDB database.Configures
Metrics throughout various features of RequestResponseIO.A helper class for monitoring jobs submitted to the service.
An interface that can be used for defining callbacks to receive a list of JobMessages
containing monitoring information.
A handler that logs monitoring messages.
Comparator for sorting rows in increasing order based on timestamp.
An object that configures
FileSystems.copy(java.util.List<org.apache.beam.sdk.io.fs.ResourceId>, java.util.List<org.apache.beam.sdk.io.fs.ResourceId>, org.apache.beam.sdk.io.fs.MoveOptions...), FileSystems.rename(java.util.List<org.apache.beam.sdk.io.fs.ResourceId>, java.util.List<org.apache.beam.sdk.io.fs.ResourceId>, org.apache.beam.sdk.io.fs.MoveOptions...), and FileSystems.delete(java.util.Collection<org.apache.beam.sdk.io.fs.ResourceId>, org.apache.beam.sdk.io.fs.MoveOptions...).Defines the standard
MoveOptions.An unbounded source for MQTT broker.
A POJO describing a MQTT connection.
A
PTransform to read from a MQTT broker.A
PTransform to write and send a message to a MQTT server.A container class for MQTT message metadata, including the topic name and payload.
DoFunctions ignore outputs that are not the main output.
A
ReadableState cell mapping keys to bags of values.Mutable state mutates when events apply to it.
A bundle of mutations that must be submitted atomically.
DoFn for reading from Apache Pulsar based on Pulsar
Reader from the start message id.A duration represented in nanoseconds.
A timestamp represented as nanoseconds since the epoch.
Category for integration tests that require Docker.
Category tag for validation tests which utilize
TestPipeline for execution and expect to
be executed by a PipelineRunner.This is a Beam IO to read from, and write data to, Neo4j.
This describes all the information needed to create a Neo4j
Session.Wraps a
Neo4jIO.DriverConfiguration to provide a Driver.This is the class which handles the work behind the
Neo4jIO.readAll() method.An interface used by
Neo4jIO.ReadAll for converting each row of a Neo4j Result
Record into an element of the resulting PCollection.This is the class which handles the work behind the
Neo4jIO.writeUnwind() method.A
Trigger which never fires.The actual trigger class for
Never triggers.Represent new partition as a result of splits and merges.
NFA is an implementation of non-deterministic finite automata.This is a utility class to represent rowCount, rate and window.
This is metadata used for row count and rate estimation.
Handler API.
A non-keyed implementation of a
BufferingElementsHandler.Abstract base class for
WindowFns that do not merge windows.A no-op implementation of Counter.
Construct an oauth credential to be used by the SDK and the SDK workers.
A no-op implementation of Histogram.
For internal use only; no backwards compatibility guarantees.
A
StepContext for Spark Batch Runner execution.doc.
Synchronously compute the earliest partition watermark, by delegating the call to
PartitionMetadataDao#getUnfinishedMinWatermark().
Indicates that we are missing a schema for a type.
A
NullableCoder encodes nullable values of type T using a nested Coder<T>
that does not tolerate null values.A
HttpRequestInitializer for requests that don't have credentials.NoOp implementation of a size estimator.
NoOp implementation of a throughput estimator.
Reference counting object pool to easily share and destroy objects.
Client pool to easily share AWS clients per configuration.
A
BoundedSource that uses offsets to define starting and ending positions.A
Source.Reader that implements code common to readers of all OffsetBasedSources.A restriction represented by a range of integers [from, to).
A coder for
OffsetRanges.A
RangeTracker for non-negative positions of type long.A
RestrictionTracker for claiming offsets in an OffsetRange in a monotonically
increasing fashion.A logical type representing a union of fields.
Represents a single OneOf value.
Describes an order.
Transform for processing ordered events.
The result of the ordered processing.
A
ReadableState cell containing a list of values sorted by timestamp.Parent class for Ordered Processing configuration handlers.
Parent class for Ordered Processing configuration handlers to handle processing of the events
where global sequence is used.
Indicates the status of ordered processing for a particular key.
The
OrderKey class stores the information to sort a column.A
Trigger that executes according to its main trigger until its "finally" trigger fires.Creates factories which determine an underlying
StreamObserver implementation to use in
to interact with fn execution APIs.Creates an outbound observer for the given inbound observer.
A builder for an output, to set all the fields and extended metadata of a Beam value.
A factory that can create output receivers during an executable stage.
A representation used by
Steps to reference the
output of other Steps.Output tag filter.
Helper routines for packages.
Provides information about the pane an element belongs to.
A Coder for encoding PaneInfo instances.
Enumerates the possibilities for the timing of this pane firing related to the input and output
watermarks for its computation.
ParDo is the core element-wise transform in Apache Beam, invoking a user-specified
function on each of the elements of the input PCollection to produce zero or more output
elements, all of which are collected into the output PCollection.A
PTransform that, when applied to a PCollection<InputT>, invokes a
user-specified DoFn<InputT, OutputT> on all its elements, which can emit elements to
any of the PTransform's output PCollections, which are bundled into a result
PCollectionTuple.A
PTransform that, when applied to a PCollection<InputT>, invokes a
user-specified DoFn<InputT, OutputT> on all its elements, with all its outputs
collected into an output PCollection<OutputT>.ParDo translator.
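A minimal sketch of the single-output ParDo form described above, assuming an input PCollection<String> named words:
    // Invoke a DoFn on each element, emitting its length.
    PCollection<Integer> lengths = words.apply(
        ParDo.of(new DoFn<String, Integer>() {
          @ProcessElement
          public void processElement(@Element String word, OutputReceiver<Integer> out) {
            out.output(word.length());
          }
        }));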
A
PTransformOverrideFactory that provides overrides for applications of a ParDo
in the direct runner.Jet
Processor implementation for Beam's ParDo primitive (when no
user-state is being used).Jet
Processor supplier that will provide instances of ParDoP.A function to handle stateful processing in Apache Beam's SparkRunner.
An iterator implementation that processes timers from
SparkTimerInternals.IO to read and write Parquet files.
Implementation of
ParquetIO.parseGenericRecords(SerializableFunction).Implementation of
ParquetIO.parseFilesGenericRecords(SerializableFunction).Implementation of
ParquetIO.read(Schema).Implementation of
ParquetIO.readFiles(Schema).Implementation of
ParquetIO.sink(org.apache.avro.Schema).TableProvider for ParquetIO for consumption by Beam SQL.A
FileWriteSchemaTransformFormatProvider for Parquet format.Exception thrown when Beam SQL is unable to parse the statement.
PTransform for parsing JSON Strings.The result of parsing a single file with Tika: contains the file's location, metadata, extracted
text, and optionally an error.
Partition takes a PCollection<T> and a PartitionFn, uses the
PartitionFn to split the elements of the input PCollection into N partitions,
and returns a PCollectionList<T> that bundles N PCollection<T>s
containing the split elements.A function object that chooses an output partition for an element.
A function object that chooses an output partition for an element.
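A minimal sketch of Partition and its PartitionFn as described above, assuming a PCollection<Integer> named numbers split into 10 buckets by last digit:
    // Split one PCollection into 10 partitions using a PartitionFn.
    PCollectionList<Integer> byLastDigit = numbers.apply(
        Partition.of(10, new Partition.PartitionFn<Integer>() {
          @Override
          public int partitionFor(Integer elem, int numPartitions) {
            return Math.abs(elem) % numPartitions;
          }
        }));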
A partition end record serves as a notification that the client should stop reading the
partition.
This class is part of the process for
ReadChangeStreamPartitionDoFn SDF.A partition event record describes key range changes for a change stream partition.
This class is part of the process for
ReadChangeStreamPartitionDoFn SDF.A
WindowFn that places each value into exactly one window based on its timestamp and
never merges windows.Model for the partition metadata database table used in the Connector.
Partition metadata builder for better user experience.
The state at which a partition can be in the system:
CREATED: the partition has been created, but no query has been done against it yet.
Data access object for creating and dropping the partition metadata table.
Data access object for the Connector metadata tables.
Represents the execution of a read / write transaction in Cloud Spanner.
Represents a result from executing a Cloud Spanner read / write transaction.
This class is responsible for transforming a
Struct to a PartitionMetadata.Configuration for a partition metadata table.
There can be a race when many splits and merges happen to a single partition in quick succession.
Output result of
DetectNewPartitionsDoFn containing
information required to stream a partition.A partition start record serves as a notification that the client should schedule the partitions
to be queried.
This class is part of the process for
ReadChangeStreamPartitionDoFn SDF.
An assertion on the contents of a
PCollection incorporated into the pipeline.Default transform to check that a PAssert was successful.
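A minimal sketch of PAssert in a unit test, assuming a TestPipeline named p:
    // Assert the final contents of a PCollection, then run the pipeline so the assertion executes.
    PCollection<String> output = p.apply(Create.of("a", "b", "c"));
    PAssert.that(output).containsInAnyOrder("c", "b", "a");
    p.run().waitUntilFinish();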
A transform that applies an assertion-checking function over iterables of
ActualT to
the entirety of the contents of its input.A transform that applies an assertion-checking function to the sole element of a
PCollection.Builder interface for assertions applicable to iterables and PCollection contents.
Check that the passed-in matchers match the existing data.
An assertion checker that takes a single
PCollectionView<ActualT>
and an assertion over ActualT, and checks it within a Beam pipeline.Track the place where an assertion is defined.
A
PAssert.IterableAssert about the contents of a PCollection.An assert about the contents of each
PCollection in the given PCollectionList.Builder interface for assertions applicable to a single value.
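A hedged sketch of how the PAssert entries above are typically used in a test, assuming the Beam testing utilities; values are illustrative.

    // Assert on the full contents of a PCollection inside a test pipeline
    // (in JUnit, TestPipeline is normally declared as a @Rule).
    TestPipeline p = TestPipeline.create();
    PCollection<String> output = p.apply(Create.of("a", "b", "c"));
    PAssert.that(output).containsInAnyOrder("c", "b", "a");
    p.run().waitUntilFinish();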
A base class for LogicalTypes that use the same Java type as the underlying base type.
For internal use only; no backwards compatibility guarantees.
PatternCondition stores the function to decide whether a row is a match of a single
pattern.A
PCollection<T> is an immutable collection of values of type
T.The enumeration of cases for whether a
PCollection is bounded.A
PCollectionList<T> is an immutable list of homogeneously typed
PCollection<T>s.A
PCollectionRowTuple is an immutable tuple of PCollections,
"keyed" by a string tag.A
PCollectionTuple is an immutable tuple of heterogeneously-typed PCollections, "keyed" by TupleTags.A
PCollectionView<T> is an immutable view of a PCollection
as a value of type T that can be accessed as a side input to a ParDo transform.For internal use only; no backwards compatibility guarantees.
Implementation which is able to adapt a multimap materialization to an in-memory
List<T>.Implementation which is able to adapt an iterable materialization to an in-memory
List<T>.Implementation which is able to adapt a multimap materialization to an in-memory
Map<K,
V>.Implementation which is able to adapt an iterable materialization to an in-memory
Map<K,
V>.Implementation which is able to adapt a multimap materialization to an in-memory
Map<K,
Iterable<V>>.Implementation which is able to adapt an iterable materialization to an in-memory
Map<K,
Iterable<V>>.Implementation which is able to adapt an iterable materialization to a
List<T>.Deprecated.
Implementation which is able to adapt an iterable materialization to a
Iterable<T>.Deprecated.
Implementation which is able to adapt a multimap materialization to a
List<T>.Deprecated.
Implementation which is able to adapt a multimap materialization to a
Map<K, V>.Deprecated.
Implementation which is able to adapt a multimap materialization to a
Map<K,
Iterable<V>>.A class for
PCollectionView implementations, with additional type parameters that are
not visible at pipeline assembly time when the view is used as a side input.Deprecated.
Implementation which is able to adapt an iterable materialization to a
T.Stores values or metadata about values.
A coder for
PCollectionViews.ValueOrMetadata.PCollectionView translator.
A
PTransform which produces a sequence of elements at fixed runtime intervals.A
PTransform which generates a sequence of timestamped elements at given runtime
intervals.The interface for things that might be input to a
PTransform.A
Pipeline manages a directed acyclic graph of PTransforms, and the
PCollections that the PTransforms consume and produce.For internal use only; no backwards-compatibility guarantees.
Control enum for indicating whether a traversal should process the contents of a
composite transform or not.
Default no-op
Pipeline.PipelineVisitor that enters all composite transforms.Handles failures in the form of exceptions.
PipelineOptions are used to configure Pipelines.
DefaultValueFactory which supplies an ID that is guaranteed to be unique within the
given process.Enumeration of the possible states for a given check.
A
DefaultValueFactory that obtains the class of the DirectRunner if it exists
on the classpath, and throws an exception otherwise.Returns a normalized job name constructed from
ApplicationNameOptions.getAppName(), the
local system user name (if available), the current time, and a random integer.Returns a user agent string constructed from
ReleaseInfo.getName() and ReleaseInfo.getVersion(), in the format [name]/[version].Constructs a
PipelineOptions or any derived interface that is composable to any other
derived interface of PipelineOptions via the PipelineOptions.as(java.lang.Class<T>) method.A fluent
PipelineOptions builder.PipelineOptions creators have the ability to automatically have their PipelineOptions registered with this SDK by creating a ServiceLoader entry and a
concrete implementation of this interface.Validates that the
PipelineOptions conforms to all the Validation criteria.Result of
Pipeline.run().Possible job states, for both completed and ongoing jobs.
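A minimal sketch tying together the PipelineOptions, PipelineOptionsFactory, and Pipeline entries above; the MyOptions interface and its field are illustrative, not part of this index.

    // A custom options interface; its getters/setters become command-line flags.
    public interface MyOptions extends PipelineOptions {
      @Description("Path of the file to read from")
      @Default.String("/tmp/input.txt")
      String getInputFile();
      void setInputFile(String value);
    }

    // Parse and validate the args, then create the pipeline.
    MyOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(MyOptions.class);
    Pipeline p = Pipeline.create(options);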
A
PipelineRunner runs a Pipeline.The pipeline translator translates a Beam
Pipeline into a Spark correspondence, that can
then be evaluated.Shared, mutable state during the translation of a pipeline and omitted afterwards.
Unresolved translation, allowing the generated Spark DAG to be optimized.
PipelineTranslator for executing a Pipeline in Spark in batch mode.Utilities for pipeline translation.
Class wrapper for a CDAP plugin.
Builder class for a
Plugin.Class for getting any filled
PluginConfig configuration object.Class for CDAP plugin constants.
Format types.
Format provider types.
Hadoop types.
Plugin types.
A set of utilities to generate getter and setter classes for POJOs.
PortablePipelineRunner that bundles the input pipeline along with all dependencies,
artifacts, etc.Contains common code for writing and reading portable pipeline jars.
Pipeline options common to all portable runners.
Result of a portable
PortablePipelineRunner.run(RunnerApi.Pipeline, JobInfo).Runs a portable Beam pipeline on some execution engine.
Registrar for the portable runner.
A DoFn class to gather metrics about the emitted
DataChangeRecords.The interface for things that might be output from a
PTransform.An
Iterable that returns PrefetchableIterators.This class contains static utility functions that operate on or return objects of type
PrefetchableIterable.A default implementation that caches an iterator to be returned when
PrefetchableIterables.Default.prefetch() is
invoked.Iterator that supports prefetching the next set of records.Prepare an input
PCollection for writing to BigQuery.A
PTransformOverrideFactory that produces PrimitiveParDoSingleFactory.ParDoSingle instances from ParDo.SingleOutput instances.A single-output primitive
ParDo.A translator for
PrimitiveParDoSingleFactory.ParDoSingle.Registers the
PrismPipelineOptions and TestPrismPipelineOptions.A
PipelineRunner executed on Prism.Utility methods for creating
BeamFnApi.ProcessBundleDescriptor instances.A container type storing references to the key, value, and window
Coder used when
handling bag user state requests.A container type storing references to the value, and window
Coder used when handling
side input state requests.A container type storing references to the key, timer and payload coders and the remote input
destination used when handling timer requests.
Environment for process-based execution.
An
EnvironmentFactory which forks processes based on the parameters in the Environment.Provider of ProcessEnvironmentFactory.
A function that computes an output value of type
OutputT from an input value of type
InputT and is Serializable.A simple process manager which forks processes and kills them if necessary.
Coder for ProducerRecord.A
ProjectionConsumer is a Schema-aware operation (such as a DoFn or PTransform) that
has a FieldAccessDescriptor describing which fields the
operation accesses.A factory for operations that execute a projection on a
Schema-aware PCollection.Constant property names used by the SDK in CloudWorkflow specifications.
Provides conversions between Protobuf Message and Beam Row.
A
CoderProviderRegistrar for standard types used with Google Protobuf.Utility class for working with Protocol Buffer (Proto) data.
A
Coder using Google Protocol Buffers binary format.ProtoDomain is a container class for Protobuf descriptors.
Deprecated.
A set of
Schema.LogicalType classes to represent protocol buffer types.A Fixed32 type.
A Fixed64 type.
A SFixed32 type.
An SFixed64 type.
A SInt32 type.
A SInt64 type.
A UInt32 type.
A UInt64 type.
Helpers for implementing the "Provider" pattern.
Options needed for a Pub/Sub Lite Publisher.
This class is required to handle callbacks from Solace, to find out if messages were actually
published or whether there was any kind of error.
An (abstract) helper class for talking to Pubsub via an underlying transport.
A message received from Pubsub.
A message to be sent to Pubsub.
Path representing a cloud project id.
Factory for creating clients.
Path representing a Pubsub schema.
Path representing a Pubsub subscription.
Path representing a Pubsub topic.
A
CoderProviderRegistrar for standard types used with PubsubIO.A helper class for talking to Pubsub via grpc.
Read and Write
PTransforms for Cloud Pub/Sub streams.Class representing a Cloud Pub/Sub Subscription.
Class representing a Cloud Pub/Sub Topic.
Implementation of read methods.
Implementation of write methods.
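A hedged sketch of the PubsubIO read and write entries above, assuming the Beam GCP I/O module and an existing Pipeline p; the subscription and topic paths are placeholders.

    // Read UTF-8 strings from a subscription and republish them to a topic.
    PCollection<String> messages =
        p.apply(PubsubIO.readStrings()
            .fromSubscription("projects/my-project/subscriptions/my-subscription"));
    messages.apply(PubsubIO.writeStrings()
        .to("projects/my-project/topics/my-topic"));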
A Pubsub client using JSON transport.
I/O transforms for reading from Google Pub/Sub Lite.
A sink which publishes messages to Pub/Sub Lite.
Pub/Sub Lite table provider.
Class representing a Pub/Sub message.
A coder for PubsubMessage treating the raw bytes being decoded as the message's payload.
Common util functions for converting between PubsubMessage proto and
PubsubMessage.Provides a
SchemaCoder for PubsubMessage, including the topic and all fields of a
PubSub message from server.A coder for PubsubMessage including all fields of a PubSub message from server.
A coder for PubsubMessage including attributes and the message id from the PubSub server.
A coder for PubsubMessage including attributes.
A coder for PubsubMessage treating the raw bytes being decoded as the message's payload, with the
message id from the PubSub server.
A coder for PubsubMessage including the topic from the PubSub server.
Properties that can be set when using Google Cloud Pub/Sub with the Apache Beam SDK.
Configuration for reading from Pub/Sub.
An implementation of
TypedSchemaTransformProvider for Pub/Sub reads configured using
PubsubReadSchemaTransformConfiguration.An implementation of
SchemaIOProvider for reading and writing JSON/AVRO payloads with
PubsubIO.TableProvider for PubsubIO for consumption by Beam SQL.A (partial) implementation of
PubsubClient for use by unit tests.Closing the factory will validate all expected messages were processed.
A PTransform which streams messages to Pubsub.
Users should use PubsubIO#read instead.
Configuration for writing to Pub/Sub.
An implementation of
TypedSchemaTransformProvider for Pub/Sub reads configured using
PubsubWriteSchemaTransformConfiguration.IO connector for reading and writing from Apache Pulsar.
Class representing a Pulsar Message record.
For internal use.
For internal use.
For internal use.
A logical type for PythonCallableSource objects.
Wrapper for invoking external Python transforms.
Pipeline options for
PythonExternalTransform.A registrar for
PythonExternalTransformOptions.Wrapper for invoking external Python
Map transforms.Utility to bootstrap and start a Beam Python service.
The
Quantifier class is intended for storing the information of the quantifier for a
pattern variable.Main action class for querying a partition change stream.
An interface that planners should implement to convert sql statement to
BeamRelNode or
SqlNode.A IO to publish or consume messages with a RabbitMQ broker.
A
PTransform to consume messages from RabbitMQ server.A
PTransform to publish messages to a RabbitMQ server.It contains the message payload, and additional metadata like routing key or attributes.
An implementation of a client-side throttler that enforces a gradual ramp-up, broadly in line
with Datastore best practices.
An elastic-sized byte array which allows you to manipulate it as a stream, or access it directly.
A
Coder which encodes the valid parts of this stream.A
Comparator that compares two byte arrays lexicographically.A
RangeTracker is a thread-safe helper object for implementing dynamic work rebalancing
in position-based BoundedSource.BoundedReader subclasses.Implement this interface to create a
RateLimitPolicy.Default rate limiter that throttles reading from a shard using an exponential backoff if the
response is empty or if the consumer is throttled by AWS.
This corresponds to an integer union tag and value.
A
PTransform for reading from a Source.PTransform that reads from a BoundedSource.Helper class for building
Read transforms.PTransform that reads from a UnboundedSource.A
Coder for FileIO.ReadableFile.A
State that can be read via ReadableState.read().For internal use only; no backwards-compatibility guarantees.
Reads each file in the input
PCollection of FileIO.ReadableFile using given parameters
for splitting files into offset ranges and for creating a FileBasedSource for a file.A class to handle errors which occur during file reads.
Reads each file of the input
PCollection and outputs each element as the value of a
KV, where the key is the filename from which that value came.Parameters class to expose the transform to an external SDK.
This class is part of
ReadChangeStreamPartitionDoFn SDF.A SDF (Splittable DoFn) class which is responsible for performing a change stream query for a
given partition.
RestrictionTracker used by
ReadChangeStreamPartitionDoFn to keep
track of the progress of the stream and to split the restriction for runner initiated
checkpoints.This restriction tracker delegates most of its behavior to an internal
TimestampRangeTracker.Util for invoking
Source.Reader methods that might require a MetricsContainerImpl
to be active.A
ReadOnlyTableProvider provides an in-memory, read-only set of
BeamSqlTables.Encapsulates a spanner read operation.
Source translator.
doc.
This
DoFn reads Cloud Spanner 'information_schema.*' tables to build the SpannerSchema.Helper class for source operations.
Class for building an instance of
Receiver that uses Apache Beam mechanisms instead of
Spark environment.A
PTransform using the Recommendations AI API (https://cloud.google.com/recommendations).A
PTransform connecting to the Recommendations AI API
(https://cloud.google.com/recommendations) and creating CatalogItems.A
PTransform connecting to the Recommendations AI API
(https://cloud.google.com/recommendations) and creating UserEvents.The RecommendationAIIO class acts as a wrapper around the
s that interact with
the Recommendation AI API (https://cloud.google.com/recommendations).
invalid reference
PTransform
A
PTransform using the Recommendations AI API (https://cloud.google.com/recommendations).A
PTransform using the Recommendations AI API (https://cloud.google.com/recommendations).This class just transforms to PublishResult to be able to capture the windowing with the right
strategy.
A helper class based on
Row; it provides Metadata associated with each Record when reading
from file(s) using ContextualTextIO.RedisConnectionConfiguration describes and wraps a connectionConfiguration to Redis
server or cluster.An IO to manipulate Redis key/value database.
Implementation of
RedisIO.read().Implementation of
RedisIO.readKeyPatterns().A
PTransform to write to a Redis server.Determines the method used to insert data in Redis.
A
PTransform to write stream key pairs (https://redis.io/topics/streams-intro) to a
Redis server.A family of
PTransforms that returns a PCollection equivalent to its
input but functions as an operational hint to a runner that redistributing the data in some way
is likely useful.Noop transform that hints to the runner to try to redistribute the work evenly, or via whatever
clever strategy the runner comes up with.
Registers translators for the Redistribute family of transforms.
ExecutableStageContext.Factory which counts ExecutableStageContext reference for book
keeping.Interface for creator which extends Serializable.
A set of reflection helper methods.
Represents a class and a schema.
Represents a type descriptor and a schema.
PTransforms to use Regular Expressions to process elements in a PCollection.Regex.MatchesName<String> takes a PCollection<String> and returns a
PCollection<List<String>> representing the value extracted from all the Regex groups of the
input PCollection to the number of times that element occurs in the input.Regex.Find<String> takes a PCollection<String> and returns a
PCollection<String> representing the value extracted from the Regex groups of the input
PCollection to the number of times that element occurs in the input.Regex.Find<String> takes a PCollection<String> and returns a
PCollection<List<String>> representing the value extracted from the Regex groups of the input
PCollection to the number of times that element occurs in the input.Regex.MatchesKV<KV<String, String>> takes a PCollection<String> and returns a
PCollection<KV<String, String>> representing the key and value extracted from the Regex
groups of the input PCollection to the number of times that element occurs in the
input.Regex.Find<String> takes a PCollection<String> and returns a
PCollection<String> representing the value extracted from the Regex groups of the input
PCollection to the number of times that element occurs in the input.Regex.MatchesKV<KV<String, String>> takes a PCollection<String> and returns a
PCollection<KV<String, String>> representing the key and value extracted from the Regex
groups of the input PCollection to the number of times that element occurs in the
input.Regex.Matches<String> takes a PCollection<String> and returns a
PCollection<String> representing the value extracted from the Regex groups of the input
PCollection to the number of times that element occurs in the input.Regex.MatchesKV<KV<String, String>> takes a PCollection<String> and returns a
PCollection<KV<String, String>> representing the key and value extracted from the Regex
groups of the input PCollection to the number of times that element occurs in the
input.Regex.MatchesName<String> takes a PCollection<String> and returns a
PCollection<String> representing the value extracted from the Regex groups of the input
PCollection to the number of times that element occurs in the input.Regex.MatchesNameKV<KV<String, String>> takes a PCollection<String> and returns
a PCollection<KV<String, String>> representing the key and value extracted from the
Regex groups of the input PCollection to the number of times that element occurs in the
input.Regex.ReplaceAll<String> takes a PCollection<String> and returns a
PCollection<String> with all Strings that matched the Regex being replaced with the
replacement string.Regex.ReplaceFirst<String> takes a PCollection<String> and returns a
PCollection<String> with the first Strings that matched the Regex being replaced with the
replacement string.Regex.Split<String> takes a PCollection<String> and returns a
PCollection<String> with the input string split into individual items in a list.Hamcrest matcher to assert a string matches a pattern.
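A minimal sketch of two of the Regex transforms listed above, assuming standard Beam imports and an existing PCollection<String> named lines (illustrative).

    // Emit the first match of the pattern in each line that matches.
    PCollection<String> words = lines.apply(Regex.find("[A-Za-z']+"));
    // Replace every run of digits in each line with a placeholder.
    PCollection<String> masked = lines.apply(Regex.replaceAll("[0-9]+", "#"));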
PTransforms for converting between explicit and implicit form of various Beam
values.This transform turns a side input into a singleton PCollection that can be used as the main
input for another transform.
Simple
Function to bring the windowing information into the value from the implicit
background representation of the PCollection.This is the implementation of NodeStatsMetadata.
A bundle capable of handling input data elements for a
bundle descriptor by
forwarding them to a remote environment for processing.A handle to an available remote
RunnerApi.Environment.A
RemoteEnvironment which uses the default RemoteEnvironment.close() behavior.Options that are used to control configuration of the remote environment.
Register the
RemoteEnvironmentOptions.An execution-time only
RunnerApi.PTransform which represents an SDK harness reading from a BeamFnApi.RemoteGrpcPort.An execution-time only
RunnerApi.PTransform which represents a write from within an SDK harness to
a BeamFnApi.RemoteGrpcPort.A pair of
which specifies the arguments to a
Coder and
invalid reference
BeamFnApi.Target
FnDataService to send data to a remote harness.A pair of
Coder and FnDataReceiver which can be registered to receive elements
for a LogicalEndpoint.A transform for renaming fields inside an existing schema.
The class implementing the actual PTransform.
A
Trigger that fires according to its subtrigger forever.PTransform for reading from and writing to Web APIs.Describes the run-time requirements of a
Contextful, such as access to side inputs.For internal use only; no backwards compatibility guarantees.
Implementation of
Reshuffle.viaRandomKey().For internal use only; no backwards compatibility guarantees.
An object that configures
ResourceId.resolve(java.lang.String, org.apache.beam.sdk.io.fs.ResolveOptions).Defines the standard resolve options.
Provides a definition of a resource hint known to the SDK.
Pipeline authors can use resource hints to provide additional information to runners about the
desired aspects of the execution environment.
Options that are used to control configuration of the remote environment.
Register the
ResourceHintsOptions.An identifier which represents a file-like resource.
A
Coder for ResourceId.A utility to test
ResourceId implementations.An interrupter for restriction tracker of type T.
Manages access to the restriction and keeps track of its claimed part for a splittable
DoFn.All
RestrictionTrackers SHOULD implement this interface to improve auto-scaling and
splitting performance.A representation for the amount of known completed and remaining work.
A representation of the truncate result.
Support utilities for interacting with
RestrictionTrackers.Interface allowing a runner to observe the calls to
RestrictionTracker.tryClaim(PositionT).A class that manages retrying of callables based on the exceptions they throw.
Configuration of the retry behavior for AWS SDK clients.
Implements a request initializer that adds retry handlers to all HttpRequests.
Models a Cassandra token range.
Row is an immutable tuple-like schema to represent one element in a PCollection.Builder for
Row.Builder for
Row that bases a row on another row.Bundle of rows according to the configured
Factory as input for benchmarks.A sub-class of SchemaCoder that can only encode
Row instances.Translator for row coders.
A convenience class for applying row updates to BigQuery using
BigQueryIO.applyRowMutations().This class indicates how to apply a row update to BigQuery.
A selector interface for extracting fields from a row.
A Concrete subclass of
Row that delegates to a set of provided FieldValueGetters.Concrete subclass of
Row that explicitly stores all fields of the row.Quality of Service manager options for Firestore RPCs.
Mutable Builder class for creating instances of
RpcQosOptions.Wrapper for invoking external Python
RunInference.Construct S3ClientBuilder from S3 pipeline options.
Object used to configure
S3FileSystem.AutoService registrar for the S3FileSystem.A registrar that creates
S3FileSystemConfiguration instances from PipelineOptions.Options used to configure Amazon Web Services S3.
Provide the default s3 upload buffer size in bytes: 64MB if more than 512MB in RAM are
available and 5MB otherwise.
PTransforms for taking samples of the elements in a PCollection, or samples of
the values associated with each key in a PCollection of KVs.CombineFn that computes a fixed-size sample of a collection of values.Classes that represent various SBE semantic types.
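A hedged sketch of the Sample entries above, assuming standard Beam imports and an existing PCollection<String> named events (illustrative).

    // Uniformly sample at most 10 elements from the whole PCollection.
    PCollection<Iterable<String>> uniform = events.apply(Sample.fixedSizeGlobally(10));
    // Or simply take any 10 elements, with no uniformity guarantee.
    PCollection<String> anyTen = events.apply(Sample.any(10));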
Representation of SBE's LocalMktDate.
Represents SBE's TimeOnly composite type.
Represents SBE's TZTimestamp composite type.
Represents SBE's uint16 type.
Represents SBE's uint32 type.
Represents SBE's uint64 type.
Represents SBE's uint8 type.
Representation of SBE's UTCDateOnly.
Represents SBE's UTCTimeOnly composite type.
Represents SBE's UTCTimestamp composite type.
Represents an SBE schema.
Options for configuring schema generation from an
Ir.Builder for
SbeSchema.IrOptions.Utilities for easier interoperability with the Spark Scala API.
A scalar function that can be executed as part of a SQL query.
Annotates the single method in a
ScalarFn implementation that is to be applied to SQL
function arguments.Reflection-based implementation logic for
ScalarFn.Beam-customized version from
ScalarFunctionImpl, to
address BEAM-5921.Builder class for building
Schema objects.Control whether nullable is included in equivalence check.
Field of a row.
Builder for
Schema.Field.A descriptor of a single field type.
A LogicalType allows users to define a custom schema type.
An enumerated list of type constructors.
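To make the Schema.Builder and Schema.Field entries above (together with the Row entry earlier in this index) concrete, a minimal sketch; field names and values are illustrative.

    // Build a schema with two fields, then a Row conforming to it.
    Schema schema = Schema.builder()
        .addStringField("name")
        .addInt32Field("age")
        .build();
    Row row = Row.withSchema(schema)
        .addValues("Alice", 42)
        .build();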
A wrapper for a
GenericRecord and the TableSchema representing the schema of the
table (or query) it was generated from.Each IO in Beam has one table schema, by extending
SchemaBaseBeamTable.SchemaCoder is used as the coder for types that have schemas registered.Translator for Schema coders.
Can be put on a constructor or a static method, in which case that constructor or method will be
used to create instances of the class by Beam's schema code.
When used on a POJO field or a JavaBean getter, that field or getter is ignored from the inferred
schema.
Provides an instance of
ConvertHelpers.ConvertedSchemaInformation.An abstraction to create schema capable and aware IOs.
Provider to create
SchemaIO instances for use in Beam SQL and other SDKs.A general
TableProvider for IOs for consumption by Beam SQL.A schema represented as a serialized proto bytes.
Concrete implementations of this class allow creation of schema service objects that vend a
Schema for a specific type.SchemaProvider creators have the ability to automatically have their schemaProvider registered with this SDK by creating a ServiceLoader entry
and a concrete implementation of this interface.An abstraction representing schema capable and aware transforms.
Provider to create
SchemaTransform instances for use in Beam SQL and other SDKs.A
PTransformTranslation.TransformPayloadTranslator implementation that translates between a Java SchemaTransform and a protobuf payload for that transform.Utility methods for translating schemas.
A creator interface for user types that have schemas.
Provides utility functions for working with Beam
Schema types.A
JdbcIO.RowMapper implementation that converts JDBC
results into Beam Row objects.A set of utility functions for schemas.
Visitor that zips schemas, and accepts pairs of fields and their types.
Context referring to a current position in a schema.
KeySelector that retrieves a key from a KV<KV<element, KV<restriction,
watermarkState>>, size>.A high-level client for an SDK harness.
Options that are used to control configuration of the SDK harness.
The default implementation which detects how much memory to use for a process wide cache.
A
DefaultValueFactory which constructs an instance of the class specified by maxCacheMemoryUsageMbClass to compute the maximum amount of
memory to allocate to the process wide cache within an SDK harness instance.The set of log levels that can be used in the SDK harness.
Specifies the maximum amount of memory to use within the current SDK harness instance.
Defines a log level override for a specific class, package, or name.
A
PTransform for selecting a subset of fields from a schema type.A
PTransform representing a flattened schema.Helper methods to select subrows out of rows.
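A hedged sketch of the Select entry above, assuming an existing schema-aware PCollection<Row> named users that actually contains the referenced fields (names are illustrative).

    // Project a subset of (possibly nested) fields out of a schema-aware PCollection.
    PCollection<Row> trimmed = users.apply(Select.fieldNames("name", "address.city"));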
A class to execute requests to SEMP v2 with Basic Auth authentication.
This interface defines methods for interacting with a Solace message broker using the Solace
Element Management Protocol (SEMP).
This interface serves as a blueprint for creating SempClient objects, which are used to interact
with a Solace message broker using the Solace Element Management Protocol (SEMP).
Default accumulator used to combine sequence ranges.
Util methods to help with serialization / deserialization.
A union of the
BiConsumer and Serializable interfaces.A union of the
BiFunction and Serializable interfaces.A
Coder for Java classes that implement Serializable.A
CoderProviderRegistrar which registers a CoderProvider which can handle
serializable types.A
Comparator that is also Serializable.A wrapper to allow Hadoop
Configurations to be serialized using Java's standard
serialization mechanisms.A function that computes an output value of type
OutputT from an input value of type
InputT, is Serializable, and does not allow checked exceptions to be declared.Useful
SerializableFunction overrides.A wrapper around
Ir that fulfils Java's Serializable contract.A
Matcher that is also Serializable.Static class for building and using
SerializableMatcher instances.SerializableRexFieldAccess.
SerializableRexInputRef.
SerializableRexNode.
SerializableRexNode.Builder.
A
gRPC server factory.Creates a
gRPC Server using the default server factory.Factory that constructs client-accessible URLs from a local server address and port.
A
WindowFn that windows values into sessions separated by periods with no input for at
least the duration specified by Sessions.getGapDuration().The SessionService interface provides a set of methods for managing a session with the Solace
messaging system.
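A minimal sketch of the Sessions WindowFn entry above, assuming standard Beam windowing imports and an existing timestamped PCollection<KV<String, Long>> named clicks (illustrative).

    // Group elements into per-key session windows separated by 10-minute gaps.
    PCollection<KV<String, Long>> sessioned = clicks.apply(
        Window.<KV<String, Long>>into(Sessions.withGapDuration(Duration.standardMinutes(10))));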
This abstract class serves as a blueprint for creating `SessionServiceFactory` objects.
The
PTransforms that allow computing different set functions across PCollections.A
ReadableState cell containing a set of elements.Provided by the user and called within
DoFn.Setup and DoFn.Teardown lifecycle methods of Call's DoFn.Deprecated.
Use
ShardedKey instead.Function for assigning
ShardedKeys to input elements for sharded WriteFiles.Standard shard naming templates.
Broadcast helper for side inputs.
BroadcastVariableInitializer that initializes the broadcast input as a Map from
window to side input.Metadata class for side inputs in Spark runner.
Utility class for creating and managing side input readers in the Spark runner.
SideInputValues serves as a Kryo serializable container that contains a materialized view
of side inputs.General
SideInputValues for BoundedWindows in two possible
states.Specialized
SideInputValues for use with the GlobalWindow in two possible
states.Factory function for loading
SideInputValues from a Dataset.A
SerializableFunction which is not a functional interface.A specialized
ConstantInputDStream that emits its RDD exactly once.Deprecated.
replace with a
DefaultJobBundleFactory when appropriate if the EnvironmentFactory is a DockerEnvironmentFactory, or create an
InProcessJobBundleFactory and inline the creation of the environment if appropriate.IO to read and write data on SingleStoreDB.
A POJO describing a SingleStoreDB
DataSource by providing all properties needed to
create it.A
PTransform for reading data from SingleStoreDB.A
PTransform for reading data from SingleStoreDB.An interface used by
SingleStoreIO.Read and SingleStoreIO.ReadWithPartitions for converting each row of the
ResultSet into an element of the resulting PCollection.A RowMapper that provides a Coder for resulting PCollection.
A RowMapper that requires initialization.
An interface used by the SingleStoreIO
SingleStoreIO.Read to set the parameters of the PreparedStatement.An interface used by the SingleStoreIO
SingleStoreIO.Write to map a data from each element of PCollection to a List of Strings.A
PTransform for writing data to SingleStoreDB.Configuration for reading from SingleStoreDB.
An implementation of
TypedSchemaTransformProvider for SingleStoreDB read jobs configured
using SingleStoreSchemaTransformReadConfiguration.Configuration for writing to SingleStoreDB.
An implementation of
TypedSchemaTransformProvider for SingleStoreDB write jobs configured
using SingleStoreSchemaTransformWriteConfiguration.Singleton keyed word item.
Singleton keyed work item coder.
A Flink combine runner takes elements pre-grouped by window and produces output after seeing all
input.
Standard Sink Metrics.
This class is used to estimate the size in bytes of a given element.
PTransforms to compute the estimate frequency of each element in a stream.Implements the
Combine.CombineFn of SketchFrequencies transforms.Implementation of
SketchFrequencies.globally().Implementation of
SketchFrequencies.perKey().Wrap StreamLib's Count-Min Sketch to support counting all user types by hashing the encoded
user type using the supplied deterministic coder.
A
LogWriter which uses an SLF4J Logger as the underlying log backend.A
WindowFn that windows values into possibly overlapping fixed-size timestamp-based
windows.Wraps an existing coder with Snappy compression.
This is an AutoValue representation of an Iceberg
Snapshot.Class for preparing configuration for batch write and read.
Implementation of
SnowflakeServices.BatchService used in production.POJO describing single Column within Snowflake Table.
Interface for data types to provide SQLs for themselves.
IO to read and write data on Snowflake.
Combines a list of
String to provide one String with paths where files were
staged for write.Interface for user-defined function mapping parts of CSV line into T.
A POJO describing a
DataSource, providing all properties needed to create a DataSource.Wraps
SnowflakeIO.DataSourceConfiguration to provide DataSource.Implementation of
SnowflakeIO.read().Removes temporary staged files after reading.
Parses
String from incoming data in PCollection to have proper format for CSV
files.Interface for user-defined function mapping T into array of Objects.
Implementation of
SnowflakeIO.write().Interface which defines common methods for interacting with Snowflake.
Class for preparing configuration for streaming write.
Implementation of
SnowflakeServices.StreamingService used in production.POJO representing schema of Table in Snowflake.
Exposes
SnowflakeIO.Read and SnowflakeIO.Write as an external transform for
cross-language usage.IO to send notifications via SNS.
Implementation of
SnsIO.write().Creates a
SocketAddress based upon a supplied string.Provides core data models and utilities for working with Solace messages in the context of Apache
Beam pipelines.
The correlation key is an object that is passed back to the client during the event broker ack
or nack.
Represents a Solace message destination (either a Topic or a Queue).
Represents a Solace destination type.
The result of writing a message to Solace.
Represents a Solace queue.
Represents a Solace message record with its associated metadata.
A utility class for mapping
BytesXMLMessage instances to Solace.Record objects.Represents a Solace topic.
Checkpoint for an unbounded Solace source.
A
PTransform to read and write from/to Solace event
broker.The
SolaceIO.Write transform's output return this type, containing the successful
publishes (SolaceOutput.getSuccessfulPublish()).Transforms for reading and writing data from/to Solr.
A POJO describing a connection configuration to Solr.
A
PTransform reading data from Solr.A POJO describing a replica of Solr.
A POJO encapsulating a configuration for retry behavior when issuing requests to Solr.
A
PTransform writing data to Solr.A Flink combine runner that first sorts the elements by window and then does one pass that merges
windows and outputs results.
SortValues<PrimaryKeyT, SecondaryKeyT, ValueT> takes a PCollection<KV<PrimaryKeyT,
Iterable<KV<SecondaryKeyT, ValueT>>>> with elements consisting of a primary key and iterables
over <secondary key, value> pairs, and returns a PCollection<KV<PrimaryKeyT,
Iterable<KV<SecondaryKeyT, ValueT>>>> of the same elements but with values sorted by a secondary
key.Base class for defining input formats and creating a
Source for reading the input.The interface that readers of custom input sources must implement.
Wrapper for executing a
Source as a Flink InputFormat.InputSplit for SourceInputFormat.Standard
Source Metrics.Classes implementing Beam
Source RDDs.A
SourceRDD.Unbounded is the implementation of a micro-batch in a SourceDStream.This class can be used as a mapper for each
SourceRecord retrieved.SourceRecordJson implementation.Interface used to map a Kafka source record.
Helper functions and test harnesses for checking correctness of
Source implementations.Expected outcome of
BoundedSource.BoundedReader.splitAtFraction(double).Manages lifecycle of
DatabaseClient and Spanner instances.Configuration for a Cloud Spanner client.
Builder for
SpannerConfig.Reading from Cloud Spanner
A
PTransform that create a transaction.A builder for
SpannerIO.CreateTransaction.A failure handling strategy.
Implementation of
SpannerIO.read().Implementation of
SpannerIO.readAll().Interface to display the name of the metadata table on Dataflow UI.
A
PTransform that writes Mutation objects to Google Cloud Spanner.Same as
SpannerIO.Write but supports grouped mutations.A provider for reading from Cloud Spanner using a Schema Transform Provider.
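A hedged sketch of the SpannerIO read and write entries above, assuming the Beam GCP I/O module, an existing Pipeline p, and an existing PCollection<Mutation> named mutations; instance, database, and query are placeholders.

    // Read rows from Cloud Spanner as Structs.
    PCollection<Struct> rows = p.apply(
        SpannerIO.read()
            .withInstanceId("my-instance")
            .withDatabaseId("my-database")
            .withQuery("SELECT id, name FROM users"));
    // Write mutations back to the same database.
    mutations.apply(
        SpannerIO.write()
            .withInstanceId("my-instance")
            .withDatabaseId("my-database"));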
Encapsulates Cloud Spanner Schema.
Exception to signal that Spanner schema retrieval failed.
Exposes
SpannerIO.WriteRows, SpannerIO.ReadRows and SpannerIO.ChangeStreamRead as an external transform for cross-language usage.The results of a
SpannerIO.write() transform.An implementation of
Window.Assign for the Spark runner.Translates a bounded portable pipeline into a Spark job.
Predicate to determine whether a URN is a Spark native transform.
A Spark
Source that is tailored to expose a SparkBeamMetric, wrapping an
underlying MetricResults instance.A Spark
Source that is tailored to expose a SparkBeamMetric, wrapping an
underlying MetricResults instance.A
CombineFnBase.GlobalCombineFn with a CombineWithContext.Context for the SparkRunner.Accumulator of WindowedValues holding values for different windows.
Type of the accumulator.
Spark runner
PipelineOptions handles Spark execution-related configurations, such as the
master address, and other user-related knobs.Returns Spark's default storage level for the Dataset or RDD API based on the respective
runner.
Returns the default checkpoint directory of /tmp/${job.name}.
A custom
PipelineOptions to work with properties related to JavaSparkContext.Returns an empty list, to avoid handling null.
Singleton class that contains one
ExecutableStageContext.Factory per job.An implementation of
GroupByKeyViaGroupByKeyOnly.GroupAlsoByWindow logic for grouping by windows and controlling
trigger firings and pane accumulation.Processes Spark's input data iterators using Beam's
DoFnRunner.Creates a job invocation to manage the Spark runner's execution of a portable pipeline.
Driver program that starts a job server for the Spark runner.
Spark runner-specific Configuration for the jobServer.
Pipeline visitor for translating a Beam pipeline into equivalent Spark operations.
SparkPCollectionView is used to pass serialized views to lambdas.
Type of side input.
Spark runner
PipelineOptions handles Spark execution-related configurations, such as the
master address, batch-interval, and other user-related knobs.Represents a Spark pipeline execution result.
Runs a portable pipeline on Apache Spark.
Translator to support translation between Beam transformations and Spark transformations.
Interface for portable Spark translators.
Pipeline options specific to the Spark portable runner running a streaming job.
Holds current processing context for
SparkInputDataProcessor.Streaming sources for Spark
Receiver.A
PTransform to read from Spark Receiver.The SparkRunner translate operations defined on a pipeline to a representation executable by
Spark, and then submitting the job to Spark to be executed.
Evaluator on the pipeline.
Pipeline runner which translates a Beam pipeline into equivalent Spark operations, without
running them.
PipelineResult of running a
Pipeline using SparkRunnerDebugger. Use SparkRunnerDebugger.DebugSparkPipelineResult.getDebugString() to get a String representation of the Pipeline translated into
Spark native operations.Custom
KryoRegistrators for Beam's Spark runner needs, registering classes used in Spark
translation for better serialization performance.Registers the
SparkPipelineOptions.Registers the
SparkRunner.A
JavaStreamingContext factory for resilience.KryoRegistrator for Spark to serialize broadcast variables used for side-inputs.SideInputReader using broadcasted
SideInputValues.A
SideInputReader for the SparkRunner.An implementation of
StateInternals for the SparkRunner.Translates an unbounded portable pipeline into a Spark job.
Translation context used to lazily store Spark datasets during streaming portable pipeline
translation and compute them after translation.
Spark runner
PipelineOptions handles Spark execution-related configurations, such as the
master address, and other user-related knobs.A Spark runner build on top of Spark's SQL Engine (Structured
Streaming framework).
Contains the
PipelineRunnerRegistrar and PipelineOptionsRegistrar for the SparkStructuredStreamingRunner.Registers the
SparkStructuredStreamingPipelineOptions.Registers the
SparkStructuredStreamingRunner.An implementation of
TimerInternals for the SparkRunner.PTransform overrides for Spark runner.Translation context used to lazily store Spark data sets during portable pipeline translation and
compute them after translation.
A "composite" InputDStream implementation for
UnboundedSources.A metadata holder for an input stream partition.
A representation of a split result.
Flink operator for executing splittable
DoFns.A
SplunkEvent describes a single payload sent to Splunk's Http Event Collector (HEC)
endpoint.A builder class for creating a
SplunkEvent.A
Coder for SplunkEvent objects.An unbounded sink for Splunk's Http Event Collector (HEC).
Class
SplunkIO.Write provides a PTransform that allows writing SplunkEvent
records into a Splunk HTTP Event Collector end-point using HTTP POST requests.A class for capturing errors that occur while writing
SplunkEvent to Splunk's Http Event
Collector (HEC) end point.A builder class for creating a
SplunkWriteError.Parse tree for
UNIQUE, PRIMARY KEY constraints.Parse tree for column.
Exception thrown when BeamSQL cannot convert sql to BeamRelNode.
Parse tree for
CREATE EXTERNAL TABLE statement.Parse tree for
CREATE FUNCTION statement.Utilities concerning
SqlNode for DDL.Parse tree for
DROP TABLE statement.SQL parse tree node to represent
SET and RESET statements.SqlTransform is the DSL interface of Beam SQL.Beam
Schema.LogicalTypes corresponding to SQL data types.IO to read (unbounded) from and write to SQS queues.
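A minimal sketch of the SqlTransform entry above, assuming the Beam SQL extension and an existing schema-aware PCollection<Row> named orders; the query and field names are illustrative.

    // Run Beam SQL over a single input, which is addressable as PCOLLECTION.
    PCollection<Row> totals = orders.apply(
        SqlTransform.query(
            "SELECT customerId, SUM(amount) AS total FROM PCOLLECTION GROUP BY customerId"));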
A
PTransform to read/receive messages from SQS.Deprecated.
superseded by
SqsIO.WriteBatchesA
PTransform to send messages to SQS.Mapper to create a
SendMessageBatchRequestEntry from a unique batch entry id and the
input T.A more convenient
SqsIO.WriteBatches.EntryMapperFn variant that already sets the entry id.Result of
SqsIO.writeBatches().Configuration class for reading data from an AWS SQS queue.
An implementation of
TypedSchemaTransformProvider for jobs reading data from AWS SQS
queues and configured via SqsReadConfiguration.Customer provided key for use with Amazon S3 server-side encryption.
A bundle factory scoped to a particular
ExecutableStage, which has all of the resources it
needs to provide new RemoteBundles.Interface for staging files needed for running a Dataflow pipeline.
A state cell, supporting a
State.clear() operation.State and Timers wrapper.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
The
StateDelegator is able to delegate BeamFnApi.StateRequests to a set of registered
handlers.Allows callers to deregister from receiving further state requests.
Jet
Processor implementation for Beam's stateful ParDo primitive.Jet
Processor supplier that will provide instances of StatefulParDoP.A specialized evaluator for ParDo operations in Spark Streaming context that is invoked when
stateful streaming is detected in the DoFn.
Handler for
StateRequests.A set of utility methods which construct
StateRequestHandlers.A handler for bag user state.
A factory which constructs
StateRequestHandlers.BagUserStateHandlers.A handler for iterable side inputs.
A handler for multimap side inputs.
Marker interface that denotes some type of side input handler.
A factory which constructs
StateRequestHandlers.MultimapSideInputHandlers.A specification of a persistent state cell.
Cases for doing a "switch" on the type of
StateSpec.A base class for a visitor with a default method for cases it is not interested in.
A class containing
StateSpec mappingFunctions.Static methods for working with
StateSpecs.A
provision service that returns a static response to all calls.A
RemoteEnvironment that connects to Dataflow runner harness.An
EnvironmentFactory that creates StaticRemoteEnvironment used by a runner harness that
would like to use an existing InstructionRequestHandler.Provider for StaticRemoteEnvironmentFactory.
A set of utilities for inferring a Beam
Schema from static Java types.Constants and variables for CDC support.
A transform that converts messages to protocol buffers in preparation for writing to BigQuery.
This DoFn flushes and optionally (if requested) finalizes Storage API streams.
This
PTransform manages loads into BigQuery using the Storage API.Class used to wrap elements being sent to the Storage API sinks.
A transform to write sharded records to BigQuery using the Storage API.
A transform to write sharded records to BigQuery using the Storage API (Streaming).
Write records to the Storage API using a standard batch approach.
Deprecated.
Legacy non-portable source which can be replaced by a DoFn with timers.
PTransform that performs streaming BigQuery write.
Stores and exports metrics for a batch of Streaming Inserts RPCs.
No-op implementation of
StreamingInsertsResults.Metrics of a batch of InsertAll RPCs.
Deprecated.
tests which use unbounded PCollections should be in the category
UsesUnboundedPCollections.Options used to configure streaming.
StateRequestHandler that uses SideInputHandler to
access the broadcast state that represents side inputs.Class for creating context object of different CDAP classes with stream source type.
Supports translation between a Beam transform, and Spark's operations on DStreams.
Registers classes specialized by the Spark runner.
Translator matches Beam transformation with the appropriate evaluator.
This transform takes in key-value pairs of
TableRow entries and the TableDestination it should be written to.Position for
ReadChangeStreamPartitionProgressTracker.Stream TransformTranslator interface.
Combine.CombineFns for aggregating strings or bytes with an optional delimiter (default comma).A
Combine.CombineFn that aggregates bytes with a byte array as delimiter.A
Combine.CombineFn that aggregates strings with a string as delimiter.A
Coder that wraps a Coder<String> and encodes/decodes values via string
representations.A metric that reports set of unique string values.
Implementation of
StringSet.The result of a
StringSet metric.Empty
StringSetResult, representing no values reported and is immutable.A collection of static methods for manipulating datastructure representations transferred via the
Dataflow API.
A wrapper around a byte[] that uses structural, value-based equality rather than byte[]'s normal
object identity.
A (Key, Coder) pair that uses the structural value of the key (as provided by
Coder.structuralValue(Object)) to perform equality and hashing.An abstract base class to implement a
Coder that defines equality, hashing, and printing
via the class name and recursively using StructuredCoder.getComponents().An implementation of AwsCredentialsProvider that periodically sends an
AssumeRoleWithWebIdentityRequest to the AWS Security Token Service to maintain short-lived
sessions to use for authentication.Builder class for
StsAssumeRoleForFederatedCredentialsProvider.Output of
PAssert.PTransforms for computing the sum of the elements in a PCollection, or the sum of
the values associated with each key in a PCollection of KVs.A
StreamObserver which provides synchronous access to an underlying StreamObserver.Represents the metadata of a
BeamSqlTable.Builder class for
Table.A wrapper for a
KuduTable and the TableAndRecord representing a typed record.Encapsulates a BigQuery table destination.
A coder for
TableDestination objects.A
Coder for TableDestination that includes time partitioning information.A
Coder for TableDestination that includes time partitioning and clustering
information.Represents a parsed table name that is specified in a FROM clause (and other places).
Helper class to extract table identifiers from the query.
A
TableProvider handles the metadata CRUD of a specified kind of tables.Utility methods for converting JSON
TableRow objects to dynamic protocol message, for use
with the Storage write API.A descriptor for ClickHouse table schema.
A column in ClickHouse table.
A descriptor for a column type.
An enumeration of possible kinds of default values in ClickHouse.
An enumeration of possible types in ClickHouse.
An updatable cache for table schemas.
Helper utilities for handling schema-update responses.
For internal use only; no backwards-compatibility guarantees.
PTransforms for getting information about quantiles in a stream.Implementation of
TDigestQuantiles.globally().Implementation of
TDigestQuantiles.perKey().Implements the
Combine.CombineFn of TDigestQuantiles transforms.A PTransform that returns its input, but also applies its input to an auxiliary PTransform, akin
to the shell
tee command, which is named after the T-splitter used in plumbing.Test rule which creates a new table with a specified schema and randomized name, and exposes a few
APIs to work with it.
Interface to implement a polling assertion.
Mocked table for bounded data sources.
A set of options used to configure the
TestPipeline.TestDataflowRunner is a pipeline runner that wraps a DataflowRunner when running
tests against the TestPipeline.A
TestRule that validates that all submitted tasks finished and were completed.A union of the
ExecutorService and TestRule interfaces.Test Flink runner.
A JobService for tests.
An implementation of
DoFn.OutputReceiver that naively collects all output values.A creator of test pipelines that can be used inside of tests that can be configured to run
locally or against a remote pipeline runner.
An exception thrown in case an abandoned
PTransform is
detected, that is, a PTransform that has not been run.An exception thrown in case a test finishes without invoking
Pipeline.run().Implementation detail of
TestPipeline.newProvider(T), do not use.JUnit 5 extension for
TestPipeline that provides the same functionality as the JUnit 4
TestRule implementation.TestPipelineOptions is a set of options for test pipelines.Matcher which will always pass.
Factory for
PipelineResult matchers which always pass.Options for
TestPortableRunner.Factory for default config.
Register
TestPortablePipelineOptions.TestPortableRunner is a pipeline runner that wraps a PortableRunner when running
tests against the TestPipeline.PipelineOptions for use with the TestPrismRunner.Test rule which creates a new topic and subscription with randomized names and exposes the APIs
to work with them.
Test rule which observes elements of the
PCollection and checks whether they match the
success criteria.A
SparkPipelineOptions for tests.A factory to provide the default watermark to stop a pipeline that reads from an unbounded
source.
The SparkRunner translates operations defined on a pipeline to a representation executable by
Spark, and then submits the job to Spark to be executed.
A testing input that generates an unbounded
PCollection of elements, advancing the
watermark and processing time as elements are emitted.An incomplete
TestStream.A
TestStream.Event that produces elements.An event in a
TestStream.The types of
TestStream.Event that are supported by TestStream.A
TestStream.Event that advances the processing time clock.Coder for
TestStream.A
TestStream.Event that advances the watermark.Utility methods which enable testing of
StreamObservers.A builder for a test
CallStreamObserver that performs various callbacks.Flink source for executing
TestStream.Base class for mocked table.
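A hedged sketch of the TestStream entries above, assuming the Beam testing utilities and a TestPipeline p; element values and timestamps are illustrative.

    // Build a deterministic unbounded input with explicit watermark advancement.
    TestStream<String> events = TestStream.create(StringUtf8Coder.of())
        .addElements(TimestampedValue.of("open", new Instant(0L)))
        .advanceWatermarkTo(new Instant(1000L))
        .addElements(TimestampedValue.of("click", new Instant(1500L)))
        .advanceWatermarkToInfinity();
    PCollection<String> input = p.apply(events);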
Test in-memory table provider for use in tests.
TableWithRows.
Utility functions for mock classes.
A mocked unbounded table.
Register
TestUniversalRunner.Options.Registrar for the portable runner.
PTransforms for reading and writing text files.Deprecated.
Use
Compression.Implementation of
TextIO.read().Deprecated.
See
TextIO.readAll() for details.Implementation of
TextIO.readFiles().Implementation of
TextIO.sink().Implementation of
TextIO.write().This class is used as the default return value of
TextIO.write().TextJsonTable is a BeamSqlTable that reads text files and converts them according
to the JSON format.This returns a row count estimation for files associated with a file pattern.
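A minimal sketch of the TextIO read and write entries above, assuming standard Beam imports and an existing Pipeline p; the file patterns are placeholders.

    // Read lines of text, then write them out sharded with a suffix.
    PCollection<String> lines = p.apply(TextIO.read().from("/tmp/input/*.txt"));
    lines.apply(TextIO.write().to("/tmp/output/result").withSuffix(".txt"));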
Builder for
TextRowCountEstimator.This strategy stops sampling if we sample enough number of bytes.
This strategy stops sampling when total number of sampled bytes are more than some threshold.
An exception that will be thrown if the estimator cannot get an estimation of the number of
lines.
This strategy samples all the files.
Sampling Strategy determines when we should stop reading further files.
Implementation detail of
TextIO.Read.TextTable is a BeamSqlTable that reads text files and converts them according to
the specified format.Text table provider.
Read-side converter for
TextTable with format 'csv'.Read-side converter for
TextTable with format 'lines'.Write-side converter for for
TextTable with format 'lines'.A
Coder that encodes Integers as the ASCII bytes of their textual,
decimal, representation.PTransforms for reading and writing TensorFlow TFRecord files.Deprecated.
Use
Compression.Implementation of
TFRecordIO.read().Implementation of
TFRecordIO.readFiles().Implementation of
TFRecordIO.write().Configuration for reading from TFRecord.
Builder for
TFRecordReadSchemaTransformConfiguration.Configuration for reading from TFRecord.
A
Coder using a Thrift TProtocol to
serialize/deserialize elements.PTransforms for reading and writing files containing Thrift encoded data.Implementation of
ThriftIO.readFiles(java.lang.Class<T>).Implementation of
ThriftIO.sink(org.apache.thrift.protocol.TProtocolFactory).Writer to write Thrift object to
OutputStream.Schema provider for generated thrift types.
An estimator to calculate the throughput of the outputted elements from a DoFn.
A
BiConsumer which can throw Exceptions.A
BiFunction which can throw Exceptions.Transforms for parsing arbitrary files using Apache Tika.
Implementation of
TikaIO.parse().Implementation of
TikaIO.parseFiles().A time without a time-zone.
TimeDomain specifies whether an operation is based on timestamps of elements or current
"real-world" time as reported while processing.A timer for a specified time domain that can be set to register the desire for further processing
at particular time in its specified time domain.
A factory that passes timers to
TimerReceiverFactory.timerDataConsumer.Interface for interacting with time.
A specification for a
Timer.Static methods for working with
TimerSpecs.Utility class for handling timers in the Spark runner.
A marker class used to identify timer keys and values in Spark transformations.
Policies for combining timestamps that occur within a window.
Convert between different Timestamp and Instant classes.
An immutable pair of a value and a timestamp.
A
Coder for TimestampedValue.This encoder/decoder writes a com.google.cloud.Timestamp object as a pair of long and int to avro
and reads a Timestamp object from the same pair.
A
WatermarkEstimator that observes the timestamps of all records output from a DoFn.A timestamp policy to assign event time for messages in a Kafka partition and watermark for it.
The context contains state maintained in the reader for the partition.
An extendable factory to create a
TimestampPolicy for each partition at runtime by
KafkaIO reader.Assigns Kafka's log append time (server side ingestion time) to each record.
A simple policy that uses current time for event time and watermark.
Internal policy to support deprecated withTimestampFn API.
A
TimestampPrefixingWindowCoder wraps arbitrary user custom window coder.A restriction represented by a range of timestamps [from, to).
A
RestrictionTracker for claiming positions in a TimestampRange in a
monotonically increasing fashion.For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
Provides methods in order to convert timestamp to nanoseconds representation and back.
A helper class for converting between Dataflow API and SDK time representations.
Time conversion utilities.
Creates a PTransform that serializes UTF-8 JSON objects from a Schema-aware PCollection (i.e.
PTransforms for finding the largest (or smallest) set of elements in a PCollection, or the largest (or smallest) set of values associated with each key in a PCollection of KVs.
Deprecated. Use Top.Natural instead.
A Serializable Comparator that uses the compared elements' natural ordering.
A Serializable Comparator that uses the reverse of the compared elements' natural ordering.
Deprecated. Use Top.Reversed instead.
CombineFn for Top transforms that combines a bunch of Ts into a single count-long List<T>, using compareFn to choose the largest Ts.
The Coder for encoding and decoding TopicPartition in Beam.
PTransforms for converting a PCollection<?>, PCollection<KV<?,?>>, or PCollection<Iterable<?>> to a PCollection<String>.
A transaction object.
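A minimal sketch of the Top transform described above; the helper name and the count of 3 are illustrative assumptions:

    import java.util.List;
    import org.apache.beam.sdk.transforms.Top;
    import org.apache.beam.sdk.values.PCollection;

    class TopExample {
      // Emits a single element: the list of the 3 largest scores by natural ordering.
      static PCollection<List<Long>> top3(PCollection<Long> scores) {
        return scores.apply(Top.largest(3));
      }
    }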
Describes a PTransform evaluator.
A Runnable that will execute a PTransform on some bundle of input.
Provides a mapping of RunnerApi.FunctionSpec to a PTransform, together with mappings of its inputs and outputs to maps of PCollections.
A utility that can be used to manage a Beam Transform Service.
A TransformTranslator knows how to translate a particular subclass of PTransform for the Cloud Dataflow service.
A TransformTranslator provides the capability to translate a specific primitive or composite PTransform into its Spark counterpart.
Supports translation between a Beam transform and Spark's operations on RDDs.
The interface for a TransformTranslator to build a Dataflow step.
The interface provided to registered callbacks for interacting with the DataflowRunner, including reading and writing the values of PCollections and side inputs.
Translator matches a Beam transformation with the appropriate evaluator.
A set of utilities to help translate Beam transformations into Spark transformations.
doc.
A SparkCombineFn function applied to grouped KVs.
A utility class to filter TupleTags.
Helpers for cloud communication.
Triggers control when the elements for a specific key and window are output.
For internal use only; no backwards-compatibility guarantees.
A TupleTag is a typed tag to use as the key of a heterogeneously typed tuple, like PCollectionTuple.
A TupleTagList is an immutable list of heterogeneously typed TupleTags.
TVFSlidingWindowFn assigns windows based on an input row's "window_start" and "window_end" timestamps.
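A minimal sketch of how the TupleTag and TupleTagList entries above are commonly used to route a multi-output ParDo into a PCollectionTuple; the tag names and the length-based split are illustrative assumptions:

    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.ParDo;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.PCollectionTuple;
    import org.apache.beam.sdk.values.TupleTag;
    import org.apache.beam.sdk.values.TupleTagList;

    class SplitExample {
      // Anonymous subclasses capture the type parameter for coder inference.
      static final TupleTag<String> SHORT = new TupleTag<String>() {};
      static final TupleTag<String> LONG = new TupleTag<String>() {};

      static PCollectionTuple split(PCollection<String> words) {
        return words.apply(
            ParDo.of(
                    new DoFn<String, String>() {
                      @ProcessElement
                      public void process(@Element String word, MultiOutputReceiver out) {
                        if (word.length() < 5) {
                          out.get(SHORT).output(word);
                        } else {
                          out.get(LONG).output(word);
                        }
                      }
                    })
                .withOutputTags(SHORT, TupleTagList.of(LONG)));
      }
    }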
Provides static constants or utils for TVF streaming.
doc.
Twister2 pipeline translator for batch pipelines.
Twister2BatchTranslationContext.
Twister2 wrapper for Bounded Source.
Empty Source wrapper.
Twister2PipelineExecutionEnvironment.
Twister2PipelineOptions.
Represents a Twister2 pipeline execution result.
Twister2PipelineTranslator; both batch and streaming translators need to extend from this.
Deprecated.
The support for twister2 is scheduled for removal in Beam 3.0.
AutoService registrar - will register Twister2Runner and Twister2Options as possible pipeline
runner services.
Pipeline options registrar.
Pipeline runner registrar.
Sink Function that collects results.
Twister2 pipeline translator for stream pipelines.
Twister2StreamingTranslationContext.
A PipelineRunner that executes the operations in the pipeline by first translating them to a Twister2 Plan and then executing them either locally or on a Twister2 cluster, depending on the configuration.
Twister2TranslationContext.
Represents a type of a column within Cloud Spanner.
A Combine.CombineFn delegating all relevant calls to a given delegate.
A description of a Java type, including actual generic parameters where possible.
A utility class for creating TypeDescriptor objects for different types, such as Java primitive types, containers and KVs of other TypeDescriptor objects, and extracting type variables of parameterized types (e.g.
A helper interface for use with TypeDescriptors.extractFromTypeParameters(Object, Class, TypeVariableExtractor).
Like SchemaTransformProvider except it uses a configuration object instead of Schema and Row.
Captures a free type variable that can be used in TypeDescriptor.where(org.apache.beam.sdk.values.TypeParameter<X>, org.apache.beam.sdk.values.TypeDescriptor<X>).
Implements AggregateFunction to take a Combine.CombineFn as a UDAF.
Beam-customized version of ReflectiveFunctionBase, to address BEAM-5921.
Helps build lists of FunctionParameter.
Provider for user-defined functions written in Java.
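A minimal sketch of the TypeDescriptor and TypeDescriptors utilities described above, showing the factory-method form next to the equivalent anonymous-subclass form; the field names are illustrative:

    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.TypeDescriptor;
    import org.apache.beam.sdk.values.TypeDescriptors;

    class TypeDescriptorSketch {
      // Builds a TypeDescriptor for KV<String, Integer> from the factory helpers.
      static final TypeDescriptor<KV<String, Integer>> KV_TYPE =
          TypeDescriptors.kvs(TypeDescriptors.strings(), TypeDescriptors.integers());

      // Equivalent handwritten form, capturing generics via an anonymous subclass.
      static final TypeDescriptor<KV<String, Integer>> KV_TYPE_EXPLICIT =
          new TypeDescriptor<KV<String, Integer>>() {};
    }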
Defines Java UDFs for use in tests.
Provider for UDF and UDAF.
This DoFn is responsible for writing to Solace in batch mode (holding up any messages) and emitting the corresponding output (success or fail; only for persistent messages), so the SolaceIO.Write connector can be composed with other subsequent transforms in the pipeline.
DStream holder. Can also create a DStream from a supplied queue of values, but mainly for testing.
This DoFn encapsulates common code used both for the UnboundedBatchedSolaceWriter and the UnboundedStreamingSolaceWriter.
A Source that reads an unbounded amount of input and, because of that, supports some additional operations such as checkpointing, watermarks, and record ids.
A marker representing the progress and state of an UnboundedSource.UnboundedReader.
A checkpoint mark that does nothing when finalized.
A Reader that reads an unbounded amount of input.
Jet Processor implementation for reading from an unbounded Beam source.
Wrapper for executing UnboundedSources as a Flink Source.
This DoFn is responsible for writing to Solace in streaming mode (one message at a time, not holding up any message) and emitting the corresponding output (success or fail; only for persistent messages), so the SolaceIO.Write connector can be composed with other subsequent transforms in the pipeline.
A UnionCoder encodes RawUnionValues.
Generate unique IDs that can be used to differentiate different jobs and partitions.
A base class for logical types that are not understood by the Java SDK.
Combines the source event which failed to process with the failure reason.
Options for controlling what to do with unsigned types, specifically whether to use a higher bit
count or, in the case of uint64, a string.
Defines the exact behavior for unsigned types.
Builder for UnsignedOptions.
A legacy snapshot which does not care about schema compatibility.
Builds a MongoDB UpdateConfiguration object.
Update destination schema based on data that is about to be copied into it.
Implements a response interceptor that logs the upload id if the upload id header exists and it is the first request (does not have an upload_id parameter in the request).
Base Exception for signaling errors in user custom code.
Extends UserCodeQuotaException to allow the user custom code to specifically signal a quota or API overuse related error.
A UserCodeExecutionException that signals an error with a remote system.
An extension of UserCodeQuotaException to specifically signal a user code timeout.
Category tag for validation tests which utilize Metrics.
Category tag for validation tests which utilize splittable ParDo with a DoFn.BoundedPerElement DoFn.
Category tag for validation tests which utilize BoundedTrie.
Category tag for validation tests which use DoFn.BundleFinalizer.
Category tag for validation tests which utilize Metrics.
Category tag for validation tests which utilize Counter.
Category tag for validation tests which utilize custom window merging.
Category tag for validation tests which utilize Distribution.
Category tag for tests which rely on a pre-defined port, such as an expansion service or transform service.
Category tag for tests which validate that the correct failure message is provided by a failed pipeline.
Category tag for validation tests which utilize Gauge.
Category for tests that use Impulse transformations.
Category tag for tests which use the expansion service in Java.
Category tag for validation tests which use key.
Category tag for validation tests which utilize --tempRoot from TestPipelineOptions and expect a default KMS key enabled for the bucket specified.
Category tag for validation tests which utilize looping timers in ParDo.
Category tag for validation tests which utilize MapState.
Category tag for validation tests which utilize the metrics pusher feature.
Category tag for validation tests which utilize MultimapState.
Category tag for validation tests which utilize DoFn.OnWindowExpiration.
Category tag for validation tests which utilize OrderedListState.
Category tag for the ParDoLifecycleTest for exclusion (BEAM-3241).
Category tag for validation tests which rely on a runner providing per-key ordering.
Category tag for validation tests which rely on a runner providing per-key ordering in between transforms in the same ProcessBundleRequest.
Category tag for validation tests which utilize timers in ParDo.
Category tag for tests which use the expansion service in Python.
Category tag for validation tests which utilize DoFn.RequiresTimeSortedInput in stateful ParDo.
Category tag for validation tests which utilize schemas.
Category tag for tests which validate that the SDK harness executes in a well-formed environment.
Category tag for validation tests which utilize SetState.
Category tag for validation tests which use side inputs.
Category tag for validation tests which use multiple side inputs with different coders.
Category tag for validation tests which utilize stateful ParDo.
Category for tests that enforce strict event-time ordering of fired timers, even in situations where multiple timers mutually set one another and the watermark hops arbitrarily far into the future.
Category tag for validation tests which utilize StringSet.
Category tag for tests that use System metrics.
Category tag for tests that use TestStream, which is not a part of the Beam model but a special feature currently only implemented by the direct runner and the Flink Runner (streaming).
Subcategory for UsesTestStream tests which use TestStream across multiple stages.
Category tag for validation tests which use outputTimestamp.
Subcategory for UsesTestStream tests which use the processing time feature of TestStream.
Category tag for validation tests which use timerMap.
Category tag for validation tests which utilize timers in ParDo.
Category tag for validation tests which use triggered side inputs.
Category tag for validation tests which utilize at least one unbounded PCollection.
Category tag for validation tests which utilize splittable ParDo with a DoFn.UnboundedPerElement DoFn.
Various common methods used by the Jet based runner.
A wrapper of byte[] that can be used as a hash-map key.
A Uuid storable in a Pub/Sub Lite attribute.
A coder for a Uuid.
Options for deduplicating Pub/Sub Lite messages based on the UUID they were published with.
A transform for deduplicating Pub/Sub Lite messages based on the UUID they were published with.
Base class for types representing UUID as two long values.
Category tag for tests which validate that a Beam runner is correctly implemented.
Validation represents a set of annotations that can be used to annotate getter properties on PipelineOptions with information representing the validation criteria to be used when validating with the PipelineOptionsValidator.
This criterion specifies that the value must not be null.
Kryo serializer for ValueAndCoderLazySerializable.
A holder object that lets you serialize an element with a Coder with minimal wasted space.
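A minimal sketch of the Validation annotations described above; the MyOptions interface and its --inputFile option are hypothetical, not part of the Beam API:

    import org.apache.beam.sdk.options.Description;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.Validation;

    // Hypothetical options interface; the annotated getter is checked when options are validated.
    public interface MyOptions extends PipelineOptions {
      @Description("Path of the file to read from")
      @Validation.Required
      String getInputFile();

      void setInputFile(String value);
    }

Constructing the options with PipelineOptionsFactory.fromArgs(args).withValidation().as(MyOptions.class) should then fail fast if --inputFile is missing.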
Represents the capture type of a change stream.
An immutable tuple of value, timestamp, window, and pane.
A coder for ValueInSingleWindow.
A ValueProvider abstracts the notion of fetching a value that may or may not be currently available.
For internal use only; no backwards compatibility guarantees.
ValueProvider.NestedValueProvider is an implementation of ValueProvider that allows for wrapping another ValueProvider object.
ValueProvider.RuntimeValueProvider is an implementation of ValueProvider that allows for a value to be provided at execution time rather than at graph construction time.
For internal use only; no backwards compatibility guarantees.
ValueProvider.StaticValueProvider is an implementation of ValueProvider that allows for a static value to be provided.
Utilities for working with the ValueProvider interface.
Values<V> takes a PCollection of KV<K, V>s and returns a PCollection<V> of the values.
A ReadableState cell containing a single value.
For internal use only; no backwards compatibility guarantees.
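A minimal sketch of the ValueProvider variants described above; the literal path and the derived provider are illustrative assumptions:

    import org.apache.beam.sdk.options.ValueProvider;
    import org.apache.beam.sdk.options.ValueProvider.NestedValueProvider;
    import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider;

    class ValueProviderSketch {
      static void demo() {
        // A value that is already known at graph construction time.
        ValueProvider<String> path = StaticValueProvider.of("gs://my-bucket/input.txt");
        // A provider derived lazily from another provider.
        ValueProvider<Integer> pathLength = NestedValueProvider.of(path, String::length);
        if (path.isAccessible()) {
          System.out.println(path.get() + " (" + pathLength.get() + " chars)");
        }
      }
    }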
A LogicalType representing a variable-length byte array with specified maximum length.
A LogicalType representing a variable-length string with specified maximum length.
Combine.CombineFn for Variance on Number types.
Benchmarks for VarInt and variants.
Output to Blackhole.
Input from randomly generated bytes.
Output to ByteStringOutputStream.
Input from randomly generated longs.
Factory class for PTransforms integrating with Google Cloud AI - VideoIntelligence service.
A PTransform taking a PCollection of ByteString and an optional side input with a context map and emitting lists of VideoAnnotationResults for each element.
A PTransform taking a PCollection of KV of ByteString and VideoContext and emitting lists of VideoAnnotationResults for each element.
A PTransform taking a PCollection of String and an optional side input with a context map and emitting lists of VideoAnnotationResults for each element.
Transforms for creating PCollectionViews from PCollections (to read them as side inputs).
For internal use only; no backwards-compatibility guarantees.
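A minimal sketch of the View transforms described above, materializing a lookup PCollection as a map-shaped side input; the method and variable names are illustrative assumptions:

    import java.util.Map;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.ParDo;
    import org.apache.beam.sdk.transforms.View;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.PCollectionView;

    class SideInputExample {
      static PCollection<String> enrich(
          PCollection<String> ids, PCollection<KV<String, String>> lookup) {
        // Materialize the lookup collection as a map-shaped side input.
        PCollectionView<Map<String, String>> lookupView = lookup.apply(View.asMap());
        return ids.apply(
            ParDo.of(
                    new DoFn<String, String>() {
                      @ProcessElement
                      public void process(ProcessContext c) {
                        String value = c.sideInput(lookupView).get(c.element());
                        c.output(c.element() + "=" + value);
                      }
                    })
                .withSideInputs(lookupView));
      }
    }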
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
Provides an index to value mapping using a random starting index and also provides an offset
range for each window seen.
For internal use only; no backwards-compatibility guarantees.
Jet Processor implementation for Beam's side-input-producing primitives.
Delays processing of each window in a PCollection until signaled.
Implementation of Wait.on(org.apache.beam.sdk.values.PCollection<?>...).
Given a "poll function" that produces a potentially growing set of outputs for an input, this transform simultaneously and continuously watches the growth of the output sets of all inputs, until a per-input termination condition is reached.
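A minimal sketch of the Wait transform described above, gating one collection on the completion of another; the method and parameter names are illustrative assumptions:

    import org.apache.beam.sdk.transforms.Wait;
    import org.apache.beam.sdk.values.PCollection;

    class WaitExample {
      // Holds back processing of each window of `main` until the corresponding
      // window of `signal` is complete (e.g. a prior write has finished).
      static PCollection<String> gate(PCollection<String> main, PCollection<?> signal) {
        return main.apply(Wait.on(signal));
      }
    }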
A function that computes the current set of outputs for the given input, in the form of a Watch.Growth.PollResult.
The result of a single invocation of a Watch.Growth.PollFn.
A strategy for determining whether it is time to stop polling the current input regardless of whether its output is complete or not.
A WatermarkEstimator which is used for estimating output watermarks of a splittable DoFn.
Support utilities for interacting with WatermarkEstimators.
A set of WatermarkEstimators that users can use to advance the output watermark for their associated splittable DoFns.
Concrete implementation of a ManualWatermarkEstimator.
A watermark estimator that observes timestamps of records output from a DoFn, reporting the timestamp of the last element seen as the current watermark.
A watermark estimator that tracks wall time.
Interface which allows for accessing the current watermark and watermark estimator state.
For internal use only; no backwards-compatibility guarantees.
WatermarkParameters contains the parameters used for watermark computation.
Implement this interface to define a custom watermark calculation heuristic.
Implement this interface to create a WatermarkPolicy.
ArrivalTimeWatermarkPolicy uses WatermarkPolicyFactory.CustomWatermarkPolicy for watermark computation.
CustomWatermarkPolicy uses parameters defined in WatermarkParameters to compute watermarks.
Watermark policy where the processing time is used as the event time.
Defines the behavior for an OIDC web identity token provider.
Window logically divides up or groups the elements of a PCollection into finite windows according to a WindowFn.
A primitive PTransform that assigns windows to elements based on a WindowFn.
Specifies the conditions under which a final pane will be created when a window is permanently closed.
Specifies the conditions under which an on-time pane will be created when a window is closed.
Flink operator for executing window DoFns.
A value along with Beam's windowing information and all other metadata.
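A minimal sketch of the Window transform described above, assigning elements to fixed event-time windows; the five-minute window size is an illustrative assumption:

    import org.apache.beam.sdk.transforms.windowing.FixedWindows;
    import org.apache.beam.sdk.transforms.windowing.Window;
    import org.apache.beam.sdk.values.PCollection;
    import org.joda.time.Duration;

    class WindowSketch {
      // Groups elements into consecutive 5-minute event-time windows.
      static PCollection<String> intoFixedWindows(PCollection<String> input) {
        return input.apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(5))));
      }
    }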
Implementations of WindowedValue and static utility methods.
Coder for WindowedValue.
A parameterized coder for WindowedValue.
A WindowedValues which holds exactly one window per value.
Deprecated. Use ParamWindowedValueCoder instead; it is a general-purpose implementation of the same concept but makes timestamp, windows and pane info configurable.
Abstract class for WindowedValue coder.
The argument to the Window transform used to assign elements into windows and to determine how windows are merged.
A utility class for testing WindowFns.
Jet Processor implementation for Beam's GroupByKeyOnly + GroupAlsoByWindow primitives.
A WindowingStrategy describes the windowing behavior for a specific collection of values.
The accumulation modes that can be used with windowing.
An implementation of TypedSchemaTransformProvider for WindowInto.
A function that takes the windows of elements in a main input and maps them to the appropriate window in a PCollectionView consumed as a side input.
Helpers to construct coders for gRPC port reads and writes.
A collection of utilities for writing transforms that can handle exceptions raised during
processing of elements.
A simple handler that extracts information from an exception to a Map<String, String> and returns a KV where the key is the input element that failed processing, and the value is the map of exception attributes.
The value type passed as input to exception handlers.
An intermediate output type for PTransforms that allows an output collection to live alongside
a collection of elements that failed the transform.
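A minimal sketch of the exception-handling utilities described above, using the MapElements exceptionsInto/exceptionsVia pattern; the parsing use case and the failure-string format are illustrative assumptions:

    import org.apache.beam.sdk.transforms.MapElements;
    import org.apache.beam.sdk.transforms.WithFailures;
    import org.apache.beam.sdk.transforms.WithFailures.ExceptionElement;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.TypeDescriptors;

    class WithFailuresExample {
      // Parses strings to integers; elements that throw are routed to a failure
      // collection instead of failing the bundle.
      static WithFailures.Result<PCollection<Integer>, String> parse(PCollection<String> input) {
        return input.apply(
            MapElements.into(TypeDescriptors.integers())
                .via((String s) -> Integer.parseInt(s))
                .exceptionsInto(TypeDescriptors.strings())
                .exceptionsVia(
                    (ExceptionElement<String> ee) ->
                        ee.element() + ": " + ee.exception().getMessage()));
      }
    }

The returned result exposes the successful outputs via output() and the failed elements via failures().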
WithKeys<K, V> takes a PCollection<V>, and either a constant key of type K or a function from V to K, and returns a PCollection<KV<K, V>>, where each of the values in the input PCollection has been paired with either the constant key or a key computed from the value.
A MetricRegistry decorator-like that supports AggregatorMetric and SparkBeamMetric as Gauges.
A MetricRegistry decorator-like that supports BeamMetricSets as Gauges.
A PTransform for assigning timestamps to all the elements of a PCollection.
Duplicated from beam-examples-java to avoid dependency.
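A minimal sketch of the WithKeys transform described above, together with Values for stripping keys back off; the length-based key is an illustrative assumption:

    import org.apache.beam.sdk.transforms.Values;
    import org.apache.beam.sdk.transforms.WithKeys;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.TypeDescriptors;

    class KeysExample {
      // Pairs each word with its length as the key; withKeyType is needed because
      // the lambda erases the key type.
      static PCollection<KV<Integer, String>> keyByLength(PCollection<String> words) {
        return words.apply(
            WithKeys.of((String word) -> word.length())
                .withKeyType(TypeDescriptors.integers()));
      }

      // Drops the keys again, yielding just the values.
      static PCollection<String> values(PCollection<KV<Integer, String>> keyed) {
        return keyed.apply(Values.create());
      }
    }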
A PTransform that converts a PCollection containing lines of text into a PCollection of
formatted word counts.
A SimpleFunction that converts a Word and Count into a printable string.
Options supported by WordCount.
Workarounds for dealing with limitations of Flink or its libraries.
KeySelector that retrieves a key from a KeyedWorkItem.
Wrapper class for ReceiverSupervisor that doesn't use Spark Environment.
Parameters class to expose the transform to an external SDK.
Enum containing all supported dispositions during writing to table phase.
A PTransform that writes to a FileBasedSink.
The result of a WriteFiles transform.
Return type of JmsIO.Write transform.
The result of a BigQueryIO.Write transform.
DoFn for writing to Apache Pulsar.
Transforms for reading and writing XML files using JAXB mappers.
Implementation of XmlIO.read().
Deprecated. Use Compression instead.
Implementation of XmlIO.readFiles().
Implementation of XmlIO.sink(java.lang.Class<T>).
Implementation of XmlIO.write().
Implementation of XmlIO.read().
A FileWriteSchemaTransformFormatProvider for XML format.
Allows one to invoke Beam YAML transforms from Java.
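A minimal sketch of the XmlIO.read() entry above; the Order record type, element names and path are hypothetical stand-ins for a real JAXB-mapped schema:

    import javax.xml.bind.annotation.XmlRootElement;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.xml.XmlIO;
    import org.apache.beam.sdk.values.PCollection;

    class XmlReadExample {
      // Hypothetical JAXB-mapped record type standing in for a real schema.
      @XmlRootElement(name = "order")
      static class Order {
        public String id;
      }

      static PCollection<Order> readOrders(Pipeline p) {
        return p.apply(
            XmlIO.<Order>read()
                .from("/tmp/orders.xml")
                .withRootElement("orders")
                .withRecordElement("order")
                .withRecordClass(Order.class));
      }
    }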
This is a class to indicate that a TVF is a ZetaSQL SQL native UDTVF.
Wraps an existing coder with Zstandard compression.
ApproximateCountDistinct in the zetasketch extension module, which makes use of the HllCount implementation.