All Classes and Interfaces
Description
BeamRelNode to replace Project and Filter node.
Abstract base for runners that execute a Combine.PerKey.
A final combiner that takes in AccumT and produces OutputT.
Adapter interface that allows using a CombineFnBase.GlobalCombineFn to either produce the AccumT as output or to combine several accumulators into an OutputT.
A partial combiner that takes in InputT and produces AccumT.
Abstract base class for iterators that process Spark input data and produce corresponding output values.
Factory class for creating instances that will handle different functions of DoFns.
Factory class for creating instances that will handle each type of record within a change stream query.
A transform to add new nullable fields to a PCollection's schema.
Inner PTransform for AddFields.
A ClientInterceptor that attaches a provided SDK Harness ID to outgoing messages.
This class adds a pseudo-key with a given cardinality.
A transform to add UUIDs to each message to be written to Pub/Sub Lite.
A Phaser which never terminates.
A composite Trigger that fires when all of its sub-triggers are ready.
A composite Trigger that executes its sub-triggers in order.
A composite Trigger that fires once after at least one of its sub-triggers have fired.
A Trigger that fires at some point after a specified number of input elements have arrived.
A Trigger that fires at a specified point in processing time, relative to when input first arrives.
FOR INTERNAL USE ONLY.
AfterWatermark triggers fire based on progress of the system watermark.
A watermark trigger targeted relative to the end of the window.
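The trigger entries above compose with one another: an AfterWatermark trigger can be decorated with early and late firings built from the processing-time and element-count triggers listed here. A minimal sketch follows, assuming a hypothetical input PCollection of strings and a hypothetical WindowingSketch wrapper class; only the Beam API calls themselves are real.

```java
import org.apache.beam.sdk.transforms.windowing.AfterPane;
import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

class WindowingSketch {
  // Fixed 5-minute windows that fire when the watermark passes the end of the
  // window, with speculative early firings every 30 seconds of processing time
  // and one late firing per late element.
  static PCollection<String> applyWindowing(PCollection<String> events) {
    return events.apply(
        Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
            .triggering(
                AfterWatermark.pastEndOfWindow()
                    .withEarlyFirings(
                        AfterProcessingTime.pastFirstElementInPane()
                            .plusDelayOf(Duration.standardSeconds(30)))
                    .withLateFirings(AfterPane.elementCountAtLeast(1)))
            .withAllowedLateness(Duration.standardMinutes(10))
            .accumulatingFiredPanes());
  }
}
```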
An aggregate function that can be executed as part of a SQL query.
Wrapper Combine.CombineFns for aggregation function calls.
Builds a MongoDB AggregateIterable object.
AmqpIO supports AMQP 1.0 protocol using the Apache QPid Proton-J library.
A PTransform to read/receive messages using AMQP 1.0 protocol.
A PTransform to send messages using AMQP 1.0 protocol.
A coder for AMQP message.
A CoderProviderRegistrar for standard types used with AmqpIO.
A PTransform using the Cloud AI Natural language processing capability.
ApiIOError is a data class for storing details about an error.
Options that allow setting the application name.
PTransforms for estimating the number of distinct elements in a PCollection, or the number of distinct values associated with each key in a PCollection of KVs.
PTransform for estimating the number of distinct elements in a PCollection.
PTransforms for computing the approximate number of distinct elements in a stream.
Implements the Combine.CombineFn of ApproximateDistinct transforms.
Implementation of ApproximateDistinct.globally().
Coder for HyperLogLogPlus class.
Implementation of ApproximateDistinct.perKey().
PTransforms for getting an idea of a PCollection's data distribution using approximate N-tiles (e.g.
The ApproximateQuantilesCombineFn combiner gives an idea of the distribution of a collection of values using approximate N-tiles.
Deprecated.
CombineFn that computes an estimate of the number of distinct values that were combined.
A heap utility class to efficiently track the largest added elements.
PTransform for estimating the number of distinct elements in a PCollection.
PTransform for estimating the number of distinct values associated with each key in a PCollection of KVs.
Converts Arrow schema to Beam row schema.
An ArtifactRetrievalService that uses FileSystems as its backing storage.
A pairing of a newly created artifact type and an output stream that will be readable at that type.
Provides a concrete location to which artifacts can be staged on retrieval.
PTransform for serializing objects to JSON Strings.
Jet Processor implementation for Beam's Windowing primitive.
Assign Windows function.
Assign Window translator.
Async handler that automatically retries unprocessed records in case of a partial success.
Statistics on the batch request.
Asynchronously computes the earliest partition watermark and stores it in memory.
A Coder that serializes and deserializes the AttributeValue objects.
Enables users to specify their own `JMS` backlog reporters enabling JmsIO to report UnboundedSource.UnboundedReader.getTotalBacklogBytes().
A SchemaProvider for AutoValue classes.
FieldValueTypeSupplier that's based on AutoValue getters.
Utilities for managing AutoValue schemas.
A Coder using Avro binary format.
Create DatumReader and DatumWriter for given schemas.
Specialized AvroDatumFactory for GenericRecord.
Specialized AvroDatumFactory for java classes transforming to avro through reflection.
Specialized AvroDatumFactory for SpecificRecord.
AvroCoder specialisation for GenericRecord, needed for cross-language transforms.
Coder registrar for AvroGenericCoder.
Coder translator for AvroGenericCoder.
Utility methods for converting Avro GenericRecord objects to dynamic protocol message, for use with the Storage write API.
PTransforms for reading and writing Avro files.
Deprecated. See AvroIO.parseAllGenericRecords(SerializableFunction) for details.
Implementation of AvroIO.read(java.lang.Class<T>) and AvroIO.readGenericRecords(org.apache.avro.Schema).
Deprecated. See AvroIO.readAll(Class) for details.
Implementation of AvroIO.readFiles(java.lang.Class<T>).
Deprecated. Users can achieve the same by providing this transform in a ParDo before using write in AvroIO AvroIO.write(Class).
Implementation of AvroIO.write(java.lang.Class<T>).
This class is used as the default return value of AvroIO.write(java.lang.Class<T>).
Avro 1.8 ships with joda time conversions only.
Avro 1.8 & 1.9 ship joda time conversions.
A SchemaProvider for AVRO generated SpecificRecords and POJOs.
An implementation of SchemaIOProvider for reading and writing Avro files with AvroIO.
A FileBasedSink for Avro files.
Do not use in pipelines directly: most users should use AvroIO.Read.
A BlockBasedSource.BlockBasedReader for reading blocks from Avro files.
TableProvider for AvroIO for consumption by Beam SQL.
Utils to convert AVRO records to Beam rows.
Wrapper for fixed byte fields.
A FileWriteSchemaTransformFormatProvider for avro format.
Builder factory for AWS SdkPojo to avoid using reflection to instantiate a builder.
A Jackson Module that registers a JsonSerializer and JsonDeserializer for AwsCredentialsProvider and some subclasses.
Options used to configure Amazon Web Services specific options such as credentials and region.
Attempt to load default region.
Return DefaultCredentialsProvider as default provider.
A registrar containing the default AWS options.
Schema provider for AWS SdkPojo models using the provided field metadata (@see SdkPojo.sdkFields()) rather than reflection.
Utilities for working with AWS Serializables.
AutoService registrar for the AzureBlobStoreFileSystem.
A Jackson Module that registers a JsonSerializer and JsonDeserializer for Azure credential providers.
Attempts to load Azure credentials.
A registrar containing the default Azure options.
An adapter for converting between Apache Beam and Google API client representations of backoffs.
A ReadableState cell containing a bag of values.
Basic implementation of BeamSqlTable.
A factory for creating JcsmpSessionService instances.
A class that manages REST calls to the Solace Element Management Protocol (SEMP) using basic authentication.
A factory for creating BasicAuthSempClient instances.
Class for Batch, Sink and Stream CDAP wrapper classes that use it to provide common details.
StateRequestHandler that uses a BatchSideInputHandlerFactory.SideInputGetter to access side inputs.
Returns the value for the side input with the given PCollection id from the runner.
Class for creating context object of different CDAP classes with batch sink type.
Class for creating context object of different CDAP classes with batch source type.
PTransformOverrideFactories that expand to correctly implement stateful ParDo using the window-unaware BatchViewOverrides.GroupByKeyAndSortValuesOnly to linearize processing per key.
A key-preserving DoFn that explodes an iterable that has been grouped by key and window.
Batch TransformTranslator interface.
This rule is essentially a wrapper around Calcite's AggregateProjectMergeRule.
BeamRelNode to replace an Aggregate node.
Rule to detect the window/trigger settings.
Aggregation rule that doesn't include projection.
This is a shell TSet environment which is used as a central driver model to fit what Beam expects.
The Twister2 worker that will execute the job logic once the job is submitted from the run method.
Built-in aggregation functions for COUNT/MAX/MIN/SUM/AVG/VAR_POP/VAR_SAMP.
Built-in Analytic Functions for the aggregation analytics functionality.
BeamBuiltinFunctionClass interface.
BeamBuiltinMethods.
Adapter from TableProvider to Schema.
Adapter from BeamSqlTable to a calcite Table.
Planner rule to merge a BeamCalcRel with a BeamCalcRel.
BeamRelNode to replace Project and Filter node.
WrappedList translates List on access.
WrappedMap translates Map on access.
WrappedRow translates Row on access.
A RelOptRule that converts a LogicalCalc into a chain of AbstractBeamCalcRel nodes via CalcRelSplitter.
BeamCodegenUtils.
A BeamJoinRel which does CoGBK Join.
Rule to convert LogicalJoin node to BeamCoGBKJoinRel node.
VolcanoCost represents the cost of a plan node.
Implementation of RelOptCostFactory that creates BeamCostModels.
BeamRelNode to replace an Enumerable node.
An adapter class that allows one to apply Apache Beam PTransforms directly to Flink DataSets.
An adapter class that allows one to apply Apache Beam PTransforms directly to Flink DataStreams.
A gRPC multiplexer for a specific Endpoints.ApiServiceDescriptor.
An outbound data buffering aggregator with size-based buffer and time-based buffer if corresponding options are set.
A Beam BoundedSource for Impulse Source.
BeamRelNode to replace an Intersect node.
ConverterRule to replace Intersect with BeamIntersectRel.
BeamRelNode to replace a TableModify node.
BeamRelNode to replace a TableScan node.
Customized data type in Beam.
This is very similar to JoinAssociateRule.
This is exactly similar to JoinPushThroughJoinRule.
An abstract BeamRelNode to implement Join Rels.
Collections of PTransform and DoFn used to perform JOIN operation.
Transform to execute Join as Lookup.
A Kafka topic that saves records as CSV format.
BeamKafkaTable represents a Kafka topic, as source or target.
Convention for Beam SQL.
BeamRelNode to replace a Match node.
ConverterRule to replace Match with BeamMatchRel.
BeamRelNode to replace a Minus node.
ConverterRule to replace Minus with BeamMinusRel.
BeamPCollectionTable converts a PCollection<Row> as a virtual table, then a downstream query can query directly.
Customized data type in Beam.
A RelNode that can also give a PTransform that implements the expression.
Utility methods for converting Beam Row objects to dynamic protocol message, for use with the Storage write API.
RuleSet used in BeamQueryPlanner.
Delegate for Set operators: BeamUnionRel, BeamIntersectRel and BeamMinusRel.
Set operator type.
Collections of PTransform and DoFn used to perform Set operations.
Transform a BeamSqlRow to a KV<BeamSqlRow, BeamSqlRow>.
Filter function used for Set operators.
A BeamJoinRel which does sideinput Join.
Rule to convert LogicalJoin node to BeamSideInputJoinRel node.
A BeamJoinRel which does Lookup Join.
Rule to convert LogicalJoin node to BeamSideInputLookupJoinRel node.
BeamRelNode to replace a Sort node.
ConverterRule to replace Sort with BeamSortRel.
BeamSqlCli provides methods to execute Beam SQL with an interactive client.
Example pipeline that uses Google Cloud Data Catalog to retrieve the table metadata.
Pipeline options to specify the query and the output for the example.
Contains the metadata of tables/UDF functions, and exposes APIs to query/validate/optimize/translate SQL statements.
BeamSqlEnv's Builder.
A test PTransform to display output in console.
Options used to configure BeamSQL.
AutoService registrar for BeamSqlPipelineOptions.
Utilities for BeamRelNode.
A seekable table converts a JOIN operator to an inline lookup.
This interface defines a Beam Sql Table.
This interface defines Beam SQL Table Filter.
Interface to create a UDF in Beam SQL.
Custom StoppableFunction for backward compatibility.
BeamRelNode to replace TableFunctionScan.
This is the converter rule that converts a Calcite TableFunctionScan to Beam TableFunctionScanRel.
This class stores row count statistics.
Utility methods for working with BeamTable.
BeamRelNode to implement an uncorrelated Uncollect, aka UNNEST.
BeamRelNode to replace a Union.
BeamRelNode to replace a Values node.
ConverterRule to replace Values with BeamValuesRel.
BeamRelNode to replace a Window node.
A Fn Status service which can collect run-time status information from SDK harnesses for debugging purposes.
Planner rule to merge a BeamZetaSqlCalcRel with a BeamZetaSqlCalcRel.
BeamRelNode to replace Project and Filter node based on the ZetaSQL expression evaluator.
A BeamCalcSplittingRule that converts a LogicalCalc to a chain of BeamZetaSqlCalcRel and/or BeamCalcRel via CalcRelSplitter.
Catalog for registering tables and functions.
BeamRelNode to implement an uncorrelated ZetaSqlUnnest, aka UNNEST.
A BigDecimalCoder encodes a BigDecimal as an integer scale encoded with VarIntCoder and a BigInteger encoded using BigIntegerCoder.
Provides converters from BigDecimal to other numeric types based on the input Schema.TypeName.
A BigEndianIntegerCoder encodes Integers in 4 bytes, big-endian.
A BigEndianLongCoder encodes Longs in 8 bytes, big-endian.
A BigEndianShortCoder encodes Shorts in 2 bytes, big-endian.
A BigIntegerCoder encodes a BigInteger as a byte array containing the big endian two's-complement representation, encoded via ByteArrayCoder.
A wrapper class for BigQuery API calls.
A CoderProviderRegistrar for standard types used with BigQueryIO.
An implementation of TypedSchemaTransformProvider for BigQuery Storage Read API jobs configured via BigQueryDirectReadSchemaTransformProvider.BigQueryDirectReadSchemaTransformConfiguration.
A SchemaTransform for BigQuery Storage Read API, configured with BigQueryDirectReadSchemaTransformProvider.BigQueryDirectReadSchemaTransformConfiguration and instantiated by BigQueryDirectReadSchemaTransformProvider.
Configuration for reading from BigQuery with Storage Read API.
Configuration for reading from BigQuery.
An implementation of TypedSchemaTransformProvider for BigQuery read jobs configured using BigQueryExportReadSchemaTransformConfiguration.
An implementation of SchemaTransform for BigQuery read jobs configured using BigQueryExportReadSchemaTransformConfiguration.
An implementation of TypedSchemaTransformProvider for BigQuery write jobs configured using BigQueryWriteConfiguration.
A set of helper functions and classes used by BigQueryIO.
Model definition for BigQueryInsertError.
A Coder that encodes BigQuery BigQueryInsertError objects.
PTransforms for reading and writing BigQuery tables.
Implementation of BigQueryIO.read().
Implementation of BigQueryIO.read(SerializableFunction).
Determines the method used to read data from BigQuery.
An enumeration type for the priority of a query.
Implementation of BigQueryIO.write().
An enumeration type for the BigQuery create disposition strings.
Determines the method used to insert data in BigQuery.
An enumeration type for the BigQuery schema update options strings.
An enumeration type for the BigQuery write disposition strings.
A matcher to verify data in BigQuery by processing given query and comparing with content's
checksum.
Properties needed when using Google BigQuery with the Apache Beam SDK.
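As a usage illustration for the BigQueryIO entries above, a minimal write sketch follows. The BigQueryIO calls (writeTableRows, withSchema, create/write dispositions) are the real API; the project, dataset, table name, field names and the enclosing BigQueryWriteSketch class are hypothetical.

```java
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import java.util.Arrays;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.values.PCollection;

class BigQueryWriteSketch {
  // Appends TableRow elements to a (hypothetical) table, creating it if needed.
  static void writeRows(PCollection<TableRow> rows) {
    TableSchema schema =
        new TableSchema()
            .setFields(
                Arrays.asList(
                    new TableFieldSchema().setName("word").setType("STRING"),
                    new TableFieldSchema().setName("count").setType("INTEGER")));
    rows.apply(
        BigQueryIO.writeTableRows()
            .to("my-project:my_dataset.my_table")
            .withSchema(schema)
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
  }
}
```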
An implementation of SchemaIOProvider for reading and writing to BigQuery with BigQueryIO.
Exception to signal that BigQuery schema retrieval failed.
An interface for real, mock, or fake implementations of Cloud BigQuery services.
Container for reading data from streaming endpoints.
An interface to get, create and delete Cloud BigQuery datasets and tables.
An interface for the Cloud BigQuery load service.
An interface representing a client object for making calls to the BigQuery Storage API.
An interface for appending records to a Storage API write stream.
An interface to get, create and flush Cloud BigQuery STORAGE API write streams.
An implementation of BigQueryServices that actually communicates with the cloud BigQuery service.
Helper class to create per-worker metrics for BigQuery Sink stages.
A Source representing reading from a table.
An implementation of TypedSchemaTransformProvider for BigQuery Storage Write API jobs configured via BigQueryWriteConfiguration.
A SchemaTransform for BigQuery Storage Write API, configured with BigQueryWriteConfiguration and instantiated by BigQueryStorageWriteApiSchemaTransformProvider.
BigQuery table provider.
Utility methods for BigQuery related operations.
Options for how to convert BigQuery data to Beam data.
Builder for BigQueryUtils.ConversionOptions.
Controls whether to truncate timestamps to millisecond precision lossily, or to crash when truncation would result.
Options for how to convert BigQuery schemas to Beam schemas.
Builder for BigQueryUtils.SchemaConversionOptions.
Configuration for writing to BigQuery with SchemaTransforms.
Builder for BigQueryWriteConfiguration.
A BigQuery Write SchemaTransformProvider that routes to either BigQueryFileLoadsSchemaTransformProvider or BigQueryStorageWriteApiSchemaTransformProvider.
This is probably a temporary solution to what is a bigger migration from cloud-bigtable-client-core to java-bigtable.
Override the configuration of Cloud Bigtable data and admin client.
Configuration for a Cloud Bigtable client.
Transforms for reading from and writing to Google Cloud Bigtable.
Overwrite options to determine what to do if change stream name is being reused and there exists metadata of the same change stream name.
A PTransform that reads from Google Cloud Bigtable.
A PTransform that writes to Google Cloud Bigtable.
A PTransform that writes to Google Cloud Bigtable and emits a BigtableWriteResult for each batch written.
An implementation of TypedSchemaTransformProvider for Bigtable Read jobs configured via BigtableReadSchemaTransformProvider.BigtableReadSchemaTransformConfiguration.
Configuration for reading from Bigtable.
The result of writing a batch of rows to Bigtable.
A coder for BigtableWriteResult.
An implementation of TypedSchemaTransformProvider for Bigtable Write jobs configured via BigtableWriteSchemaTransformProvider.BigtableWriteSchemaTransformConfiguration.
Configuration for writing to Bigtable.
Coder for BitSet.
Construct BlobServiceClientBuilder from Azure pipeline options.
Options used to configure Microsoft Azure Blob Storage.
A BlockBasedSource is a FileBasedSource where a file consists of blocks of records.
A Block represents a block of records that can be read.
A Reader that reads records from a BlockBasedSource.
Holds an RDD or values for deferred conversion to an RDD if needed.
PTransform that reads a bounded amount of data from an UnboundedSource, specified as one or both of a maximum number of elements or a maximum period of time to read.
A Source that reads a finite amount of input and, because of that, supports some additional operations.
A Reader that reads a bounded amount of input and supports some additional operations, such as progress estimation and dynamic work rebalancing.
Jet Processor implementation for reading from a bounded Beam source.
Internal: For internal use only and not for public consumption.
Implementation of BoundedTrie.
Internal: For internal use only and not for public consumption.
A BoundedWindow represents window information assigned to data elements.
An interface for elements buffered during a checkpoint when using @RequiresStableInput.
Sorter that will use in memory sorting until the values can't fit into memory and will then fall back to external sorting.
Contains configuration for the sorter.
A DoFnRunner which buffers data for supporting DoFn.RequiresStableInput.
A thread safe StreamObserver which uses a bounded queue to pass elements to a processing thread responsible for interacting with the underlying CallStreamObserver.
Hash Functions.
BuiltinStringFunctions.
TrigonometricFunctions.
An immutable collection of elements which are part of a PCollection.
A handler which is invoked when the SDK returns BeamFnApi.DelayedBundleApplications as part of the bundle completion.
Utility methods for creating BundleCheckpointHandlers.
A BundleCheckpointHandler which uses TimerInternals.TimerData and ValueState to reschedule BeamFnApi.DelayedBundleApplication.
A handler for the runner when a finalization request has been received.
Utility methods for creating BundleFinalizationHandlers.
A handler for bundle progress messages, both during bundle execution and on its completion.
A handler which is invoked whenever an active bundle is split.
Serializable byte array.
A Coder for byte[].
Given a Java type, returns the Java type expected for use with Row.
Takes a StackManipulation that returns a value.
Row is going to call the setter with its internal Java type, however the user object being set might have a different type internally.
A naming strategy for ByteBuddy classes.
A class representing a key consisting of an array of bytes.
A class representing a range of ByteKeys.
An estimator to provide an estimate on the byte throughput of the outputted elements.
An estimator to provide an estimate on the throughput of the outputted elements.
A duplicate of ByteStringCoder that uses the Apache Beam vendored protobuf.
A Coder for ByteString objects based on their encoded Protocol Buffer form.
Benchmarks for ByteStringOutputStream.
These benchmarks below provide good details as to the cost of creating a new buffer vs copying a subset of the existing one and re-using the larger one.
Helper functions to evaluate the completeness of collection of ByteStringRanges.
ByteToWindow function.
ByteToWindow function.
ByteToWindow function.
Transforms for reading and writing request/response associations to a cache.
A simple POJO that holds both cache read and write PTransforms.
SideInputReader that caches results for costly Materializations.
SideInputReader that caches materialized views.
A wrapper around a Factory that assumes the schema parameter never changes.
Abstract wrapper for CalciteConnection to simplify extension.
Wrapper for CalciteFactory.
The core component to handle a SQL statement, from explaining the execution plan to generating a Beam pipeline.
Utility methods for Calcite related operations.
A LogicalType corresponding to TIME_WITH_LOCAL_TIME_ZONE.
CalcRelSplitter operates on a Calc with multiple RexCall sub-expressions that cannot all be implemented by a single concrete RelNode.
Type of relational expression.
A collection of WindowFns that windows values into calendar-based windows such as spans of days, months, or years.
A WindowFn that windows elements into periods measured by days.
A WindowFn that windows elements into periods measured by months.
A WindowFn that windows elements into periods measured by years.
Caller interfaces user custom code intended for API calls.
Informs whether a call to an API should backoff.
A simplified ThreadSafe blocking queue that can be cancelled freeing any blocked Threads and preventing future Threads from blocking.
The exception thrown when a CoderRegistry or CoderProvider cannot provide a Coder that has been requested.
Indicates the reason that Coder inference failed.
An IO to read and write from/to Apache Cassandra.
Specify the mutation type: either write or delete.
A PTransform to read data from Apache Cassandra.
A PTransform to read data from Apache Cassandra.
A PTransform to mutate into Apache Cassandra.
Set of utilities for casting rows between schemas.
Describes compatibility errors during casting.
Narrowing changes type without guarantee to preserve data.
Interface for statically validating casts.
Widening changes to type that can represent any possible value of the original type.
ZetaSQLCastFunctionImpl.
Represents a named and configurable container for managing tables.
Top-level authority that manages Catalogs.
Over-arching registrar to capture available Catalogs.
A CdapIO is a Transform for reading data from source or writing data to sink of a Cdap Plugin.
A PTransform to read from CDAP source.
A PTransform to write to CDAP sink.
A CEPCall instance represents an operation (node) that contains an operator and a list of operands.
A CEPFieldRef instance represents a node that points to a specified field in a Row.
CEPKind corresponds to Calcite's SqlKind.
CEPLiteral represents a literal node.
The CEPMeasure class represents the Measures clause and contains information about output columns.
CEPOperation is the base class for the evaluation operations defined in the DEFINE syntax of MATCH_RECOGNIZE.
The CEPOperator records the operators (i.e.
Core pattern class that stores the definition of a single pattern.
Some utility methods for transforming Calcite's constructs into our own Beam constructs (for
serialization purpose).
This class is responsible for processing individual ChangeStreamRecord.
Data access object to list and read stream partitions of a table.
Responsible for making change stream queries for a given partition.
Class to aggregate metrics related functionality.
Class to aggregate metrics related functionality.
Represents a Spanner Change Stream Record.
Holds internal execution metrics / metadata for the processed ChangeStreamRecord.
Decorator class over a ResultSet that provides telemetry for the streamed records.
Represents telemetry metadata gathered during the consumption of a change stream query.
Single place for defining the constants used in the Spanner.readChangeStreams() connector.
Checkpoint data to make it available in future pipeline runs.
Checkpoint dir tree.
Helpers for reporting checkpoint durations.
A child partition represents a new partition that should be queried.
Represents a ChildPartitionsRecord.
This class is part of the process for ReadChangeStreamPartitionDoFn SDF.
Encoder for TIME and DATETIME values, according to civil_time encoding.
A read-only FileSystem implementation looking up resources using a ClassLoader.
AutoService registrar for the ClassLoaderFileSystem.
An IO to write to ClickHouse.
A PTransform to write to ClickHouse.
Writes Rows and field values using ClickHousePipedOutputStream.
Factory to build and configure any AwsClientBuilder using a specific ClientConfiguration or the globally provided settings in AwsOptions as fallback.
Default implementation of ClientBuilderFactory.
Trust provider to skip certificate verification.
AWS client configuration.
Access to the current time.
A receiver of streamed data that can be closed.
An AutoCloseable that wraps a resource that needs to be cleaned up but does not implement AutoCloseable itself.
An exception that wraps errors thrown while a resource is being closed.
A function that knows how to clean up after a resource.
A ThrowingConsumer that can be closed.
A representation of an arbitrary Java object to be instantiated by Dataflow workers.
Utilities for converting an object to a CloudObject.
A translator that takes an object and creates a CloudObject which can be converted back to the original object.
A class providing transforms between Cloud Pub/Sub and Pub/Sub Lite message types.
Properties needed when using Google CloudResourceManager with the Apache Beam SDK.
Factory class for implementations of AnnotateImages.
Accepts ByteString (encoded image contents) with optional DoFn.SideInput with a Map of ImageContext to the image.
Accepts String (image URI on GCS) with optional DoFn.SideInput with a Map of ImageContext to the image.
A Sink for Spark's metric system reporting metrics (including Beam step metrics) to a CSV file.
A Sink for Spark's metric system reporting metrics (including Beam step metrics) to Graphite.
A Coder<T> defines how to encode and decode values of type T into byte streams.
Deprecated. To implement a coder, do not use any Coder.Context.
Exception thrown by Coder.verifyDeterministic() if the encoding is not deterministic, including details of why the encoding is not deterministic.
Coder authors have the ability to automatically have their Coder registered with the Dataflow Runner by creating a ServiceLoader entry and a concrete implementation of this interface.
An Exception thrown if there is a problem encoding or decoding a value.
Serialization utility class.
Serialization utility class.
A function for converting a byte array pair to a key-value pair.
Properties for use in Coder tests.
An ElementByteSizeObserver that records the observed element sizes for testing purposes.
A CoderProvider provides Coders.
Coder creators have the ability to automatically have their coders registered with this SDK by creating a ServiceLoader entry and a concrete implementation of this interface.
Static utility methods for creating and working with CoderProviders.
This class is used to estimate the size in bytes of a given element.
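To illustrate the Coder contract described in the entries above, here is a minimal sketch of a custom coder. CustomCoder and StringUtf8Coder are the real Beam classes; the Event element type and EventCoder name are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.beam.sdk.coders.CustomCoder;
import org.apache.beam.sdk.coders.StringUtf8Coder;

// Hypothetical element type used only for this illustration.
class Event {
  final String id;

  Event(String id) {
    this.id = id;
  }
}

// Encodes an Event by delegating its single field to StringUtf8Coder.
class EventCoder extends CustomCoder<Event> {
  private static final StringUtf8Coder STRING_CODER = StringUtf8Coder.of();

  @Override
  public void encode(Event value, OutputStream outStream) throws IOException {
    STRING_CODER.encode(value.id, outStream);
  }

  @Override
  public Event decode(InputStream inStream) throws IOException {
    return new Event(STRING_CODER.decode(inStream));
  }
}
```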
Flink TypeInformation for Beam Coders.
Flink TypeSerializer for Beam Coders.
A row result of a CoGroupByKey.
A Coder for CoGbkResults.
A schema for the results of a CoGroupByKey.
A transform that performs equijoins across multiple schema PCollections.
Defines the set of fields to extract for the join key, as well as other per-input join options.
A PTransform that calculates the cross-product join.
The implementing PTransform.
A PTransform that performs a CoGroupByKey on a tuple of tables.
Defines a column type from a Cloud Spanner table with the following information: column name, column type, flag indicating if column is primary key and column position in the table.
PTransforms for combining PCollection elements globally and per-key.
A CombineFn that uses a subclass of Combine.AccumulatingCombineFn.Accumulator as its accumulator type.
The type of mutable accumulator values used by this AccumulatingCombineFn.
An abstract subclass of Combine.CombineFn for implementing combiners that are more easily and efficiently expressed as binary operations on doubles.
An abstract subclass of Combine.CombineFn for implementing combiners that are more easily expressed as binary operations.
An abstract subclass of Combine.CombineFn for implementing combiners that are more easily and efficiently expressed as binary operations on ints.
An abstract subclass of Combine.CombineFn for implementing combiners that are more easily and efficiently expressed as binary operations on longs.
A CombineFn<InputT, AccumT, OutputT> specifies how to combine a collection of input values of type InputT into a single output value of type OutputT.
Combine.Globally<InputT, OutputT> takes a PCollection<InputT> and returns a PCollection<OutputT> whose elements are the result of combining all the elements in each window of the input PCollection, using a specified CombineFn<InputT, AccumT, OutputT>.
Combine.GloballyAsSingletonView<InputT, OutputT> takes a PCollection<InputT> and returns a PCollectionView<OutputT> whose elements are the result of combining all the elements in each window of the input PCollection, using a specified CombineFn<InputT, AccumT, OutputT>.
GroupedValues<K, InputT, OutputT> takes a PCollection<KV<K, Iterable<InputT>>>, such as the result of GroupByKey, applies a specified CombineFn<InputT, AccumT, OutputT> to each of the input KV<K, Iterable<InputT>> elements to produce a combined output KV<K, OutputT> element, and returns a PCollection<KV<K, OutputT>> containing all the combined output elements.
Holds a single value of type V which may or may not be present.
PerKey<K, InputT, OutputT> takes a PCollection<KV<K, InputT>>, groups it by key, applies a combining function to the InputT values associated with each key to produce a combined OutputT value, and returns a PCollection<KV<K, OutputT>> representing a map from each distinct key of the input PCollection to the corresponding combined value.
Like Combine.PerKey, but sharding the combining of hot keys.
Deprecated. For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
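A minimal sketch of a CombineFn<InputT, AccumT, OutputT> as described in the Combine entries above: it computes an average of Integer inputs. The AverageFn and Accum names are hypothetical; a Serializable accumulator is used here so Beam can fall back to a serialization-based coder. It would typically be applied with Combine.globally(new AverageFn()) or Combine.perKey(new AverageFn()).

```java
import java.io.Serializable;
import org.apache.beam.sdk.transforms.Combine;

// Computes the mean of Integer inputs: the accumulator holds a running sum and count.
class AverageFn extends Combine.CombineFn<Integer, AverageFn.Accum, Double> {

  static class Accum implements Serializable {
    long sum = 0;
    long count = 0;
  }

  @Override
  public Accum createAccumulator() {
    return new Accum();
  }

  @Override
  public Accum addInput(Accum accum, Integer input) {
    accum.sum += input;
    accum.count += 1;
    return accum;
  }

  @Override
  public Accum mergeAccumulators(Iterable<Accum> accums) {
    Accum merged = createAccumulator();
    for (Accum accum : accums) {
      merged.sum += accum.sum;
      merged.count += accum.count;
    }
    return merged;
  }

  @Override
  public Double extractOutput(Accum accum) {
    return accum.count == 0 ? 0.0 : ((double) accum.sum) / accum.count;
  }
}
```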
Static utility methods that create combine function instances.
A tuple of outputs produced by composed combine functions.
A builder class to construct a composed CombineFnBase.GlobalCombineFn.
A composed Combine.CombineFn that applies multiple CombineFns.
A composed CombineWithContext.CombineFnWithContext that applies multiple CombineFnWithContexts.
Utilities for testing CombineFns.
This class contains combine functions that have access to PipelineOptions and side inputs through CombineWithContext.Context.
A combine function that has access to PipelineOptions and side inputs through CombineWithContext.Context.
Information accessible to all methods in CombineFnWithContext and KeyedCombineFnWithContext.
An internal interface for signaling that a GloballyCombineFn or a PerKeyCombineFn needs to access CombineWithContext.Context.
A ReadableState cell defined by a Combine.CombineFn, accepting multiple input values, combining them as specified into accumulators, and producing a single output value.
A Source that reads from compressed files.
Reader for a CompressedSource.
Deprecated. Use Compression instead.
Factory interface for creating channels that decompress the content of an underlying channel.
Various compression types for reading/writing files.
Class for building PluginConfig object of the specific class.
A DeserializerProvider that uses Confluent Schema Registry to resolve a Deserializer and Coder given a subject.
Enumeration of debezium connectors.
Print to console.
Write to console.
PTransform writing PCollection to the console.
Pair of a bit of user code (a "closure") and the Requirements needed to run it.
A function from an input to an output that may additionally access Contextful.Fn.Context when computing the result.
An accessor for additional capabilities available in Contextful.Fn.apply(InputT, org.apache.beam.sdk.transforms.Contextful.Fn.Context).
PTransforms that read text files and collect contextual information of the elements in the input.
Implementation of ContextualTextIO.read().
Implementation of ContextualTextIO.readFiles().
A range of contiguous event sequences and the latest timestamp of the events in the range.
A pool of control clients that brokers incoming SDK harness connections (in the form of InstructionRequestHandlers).
A sink for InstructionRequestHandlers keyed by worker id.
A source of InstructionRequestHandlers.
Conversion context, some rules need this data to convert the nodes.
A set of utilities for converting between different objects supporting schemas.
Helper functions for converting between equivalent schema types.
Return value after converting a schema.
A BoundedSource reading from Cosmos.
Create a cosmos client from the pipeline options.
PTransforms to count the elements in a PCollection.
A metric that reports a single long value and can be incremented or decremented.
Implementation of Counter.
Returns the count of TRUE values for expression.
Pipeline visitor that fills a lookup table of PValue to number of consumers.
Most users should use GenerateSequence instead.
The checkpoint for an unbounded CountingSource is simply the last value produced.
A custom coder for CounterMark.
Combine.CombineFn for Covariance on Number types.
A PipelineRunner that applies no overrides and throws an exception on calls to Pipeline.run().
Create<T> takes a collection of elements of type T known when the pipeline is constructed and returns a PCollection<T> containing the elements.
A PTransform that creates a PCollection whose elements have associated timestamps.
A PTransform that creates a PCollection from a set of in-memory objects.
A PTransform that creates a PCollection whose elements have associated windowing metadata.
A DataflowRunner marker class for creating a PCollectionView.
Enum containing all supported dispositions for table.
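A minimal runnable sketch of the Create<T> transform described above, often the quickest way to build an in-memory PCollection for a test or example pipeline. The CreateSketch class name and the sample strings are hypothetical; the Pipeline, PipelineOptionsFactory and Create calls are the real API.

```java
import java.util.Arrays;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

public class CreateSketch {
  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    // Build a small in-memory PCollection from a Java collection.
    PCollection<String> words =
        pipeline.apply("CreateWords", Create.of(Arrays.asList("alpha", "beta", "gamma")));
    // Further transforms would be applied to `words` here.
    pipeline.run().waitUntilFinish();
  }
}
```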
An abstract class that contains common configuration options for creating resources.
An abstract builder for CreateOptions.
A standard configuration options with builder.
Builder for CreateOptions.StandardCreateOptions.
Create an input stream from Queue.
Spark streaming overrides for various view (side input) transforms.
Creates a primitive PCollectionView.
Creates any tables needed before performing streaming writes to the tables.
Construct an oauth credential to be used by the SDK and the SDK workers.
Parameters abstract class to expose the transforms to an external SDK.
PTransforms for reading and writing CSV files.
PTransform for writing CSV files.
PTransform for parsing CSV record strings into Schema-mapped target types.
CsvIOParseError is a data class to store errors from CSV record processing.
A Sink for Spark's metric system reporting metrics (including Beam step metrics) to a CSV file.
A FileWriteSchemaTransformFormatProvider for CSV format.
An implementation of TypedSchemaTransformProvider for CsvIO.write(java.lang.String, org.apache.commons.csv.CSVFormat).
Configuration for writing to BigQuery with Storage Write API.
Builder for CsvWriteTransformProvider.CsvWriteConfiguration.
An abstract base class that implements all methods of Coder except Coder.encode(T, java.io.OutputStream) and Coder.decode(java.io.InputStream).
Describes a customer.
An optional component to use with the RetryHttpRequestInitializer in order to provide custom errors for failing http calls.
A Builder which allows building immutable CustomHttpErrors object.
A simple Tuple class for creating a list of HttpResponseMatcher and HttpResponseCustomError to print for the responses.
A helper class for supporting sources defined as Source.
Interface that table providers can implement if they require custom table name resolution.
A policy for custom record timestamps where timestamps within a partition are expected to be roughly monotonically increasing with a cap on out of order event delays (say 1 minute).
A Custom X509TrustManager that trusts a user provided CA and default CA's.
Utility class for wiring up Jet DAGs based on Beam pipelines.
Listener that can be registered with a DAGBuilder in order to be notified when edges are being registered.
Factory class to create data access objects to perform change stream queries and access the metadata tables.
Pipeline options for Data Catalog table provider.
Uses DataCatalog to get the source type and schema for a table.
A data change record encodes modifications to Cloud Spanner rows.
This class is part of the process for ReadChangeStreamPartitionDoFn SDF.
Wrapper around the generated Dataflow client to provide common functionality.
Specialized implementation of GroupByKey for translating Redistribute transform into Dataflow service protos.
Registers DataflowGroupByKey.DataflowGroupByKeyTranslator.
An exception that is thrown if the unique job name constraint of the Dataflow service is broken because an existing job with the same job name is currently active.
An exception that is thrown if the existing job has already been updated within the Dataflow
service and is no longer able to be updated.
A RuntimeException that contains information about a DataflowPipelineJob.
Internal.
Returns the default Dataflow client built from the passed in PipelineOptions.
Creates a Stager object using the class specified in DataflowPipelineDebugOptions.getStagerClass().
Sets Integer value based on old, deprecated field (DataflowPipelineDebugOptions.getUnboundedReaderMaxReadTimeSec()).
A DataflowPipelineJob represents a job submitted to Dataflow using DataflowRunner.
Options that can be used to configure the DataflowRunner.
Set of available Flexible Resource Scheduling goals.
Returns a default staging location under GcpOptions.getGcpTempLocation().
Register the DataflowPipelineOptions.
Register the DataflowRunner.
DataflowPipelineTranslator knows how to translate Pipeline objects into Cloud Dataflow Service API Jobs.
The result of a job translation.
Options that are used to configure the Dataflow pipeline worker pool.
Type of autoscaling algorithm to use.
Options for controlling profiling of pipeline execution.
Configuration for the profiling agent.
A PipelineRunner that executes the operations in the pipeline by first translating them to the Dataflow representation using the DataflowPipelineTranslator and then submitting them to a Dataflow service for execution.
A marker DoFn for writing the contents of a PCollection to a streaming PCollectionView backend implementation.
An instance of this class can be passed to the DataflowRunner to add user defined hooks to be invoked at various times during pipeline execution.
Populates versioning and other information for DataflowRunner.
Signals there was an error retrieving information about a job from the Cloud Dataflow Service.
[Internal] Options for configuring StreamingDataflowWorker.
EnableStreamingEngine defaults to false unless one of the experiments is set.
Read global get config request period from system property
'windmill.global_config_refresh_period'.
Read counter reporting period from system property 'windmill.harness_update_reporting_period'.
Factory for creating local Windmill address.
Read 'MaxStackTraceToReport' from system property 'windmill.max_stack_trace_to_report' or
Integer.MAX_VALUE if unspecified.
Read 'PeriodicStatusPageOutputDirector' from system property
'windmill.periodic_status_page_directory' or null if unspecified.
Factory for setting value of WindmillServiceStreamingRpcBatchLimit based on environment.
A DataflowPipelineJob that is returned when --templateRunner is set.
Helpers for cloud communication.
Options that are used exclusively within the Dataflow worker harness.
Deprecated.
This interface will no longer be the source of truth for worker logging configuration
once jobs are executed using a dedicated SDK harness instead of user code being co-located
alongside Dataflow worker code.
The set of log levels that can be used on the Dataflow worker.
Defines a log level override for a specific class, package, or name.
Wrapper for invoking external Python DataframeTransform.
The main PTransform that encapsulates the data generation logic.
A stateful DoFn that converts a sequence of Longs into structured Rows.
Represents a 'datagen' table within a Beam SQL pipeline.
The service entry point for the 'datagen' table type.
Wrapper for DataInputView.
Wrapper for DataOutputView.
Holder for Spark RDD/DStream.
DatastoreIO provides an API for reading from and writing to Google Cloud Datastore over different versions of the Cloud Datastore Client libraries.
DatastoreV1 provides an API to Read, Write and Delete PCollections of Google Cloud Datastore version v1 Entity objects.
A PTransform that deletes Entities from Cloud Datastore.
A PTransform that deletes Entities from Cloud Datastore and returns DatastoreV1.WriteSuccessSummary for each successful write.
A PTransform that deletes Entities associated with the given Keys from Cloud Datastore and returns DatastoreV1.WriteSuccessSummary for each successful delete.
A PTransform that reads the result rows of a Cloud Datastore query as Entity objects.
A PTransform that writes Entity objects to Cloud Datastore.
Summary object produced when a number of writes are successfully written to Datastore in a single Mutation.
A PTransform that writes Entity objects to Cloud Datastore and returns DatastoreV1.WriteSuccessSummary for each successful write.
An implementation of SchemaIOProvider for reading and writing payloads with DatastoreIO.
An abstraction to create schema aware IOs.
TableProvider for DatastoreIO for consumption by Beam SQL.
DataStreams.DataStreamDecoder treats multiple ByteStrings as a single input stream decoding values with the supplied iterator.
An adapter which converts an InputStream to a PrefetchableIterator of T values using the specified Coder.
An adapter which wraps an DataStreams.OutputChunkConsumer as an OutputStream.
A callback which is invoked whenever the DataStreams.outbound(org.apache.beam.sdk.fn.stream.DataStreams.OutputChunkConsumer<org.apache.beam.vendor.grpc.v1p69p0.com.google.protobuf.ByteString>) OutputStream becomes full.
A date without a time-zone.
DateFunctions.
A datetime without a time-zone.
DateTimeUtils.
Utility class which exposes an implementation DebeziumIO.read() and a Debezium configuration.
A POJO describing a Debezium configuration.
Implementation of DebeziumIO.read().
A schema-aware transform provider for DebeziumIO.
Exposes DebeziumIO.Read as an external transform for cross-language usage.
A receiver of encoded data, decoding it and passing it onto a downstream consumer.
Remove values with duplicate ids.
A set of PTransforms which deduplicate input records over a time domain and threshold.
Deduplicates keyed values using the key over a specified time domain and threshold.
Deduplicates values over a specified time domain and threshold.
A PTransform that uses a SerializableFunction to obtain a representative value for each input element used for deduplication.
Default represents a set of annotations that can be used to annotate getter properties on PipelineOptions with information representing the default value to be returned if no value is specified.
This represents that the default of the option is the specified boolean primitive value.
This represents that the default of the option is the specified byte primitive value.
This represents that the default of the option is the specified char primitive value.
This represents that the default of the option is the specified Class value.
This represents that the default of the option is the specified double primitive value.
This represents that the default of the option is the specified enum.
This represents that the default of the option is the specified float primitive value.
Value must be of type DefaultValueFactory and have a default constructor.
This represents that the default of the option is the specified int primitive value.
This represents that the default of the option is the specified long primitive value.
This represents that the default of the option is the specified short primitive value.
This represents that the default of the option is the specified String value.
Default implementation of AutoScaler.
Construct BlobServiceClientBuilder with given values of Azure client properties.
The DefaultCoder annotation specifies a Coder class to handle encoding and decoding instances of the annotated class.
A CoderProviderRegistrar that registers a CoderProvider which can use the @DefaultCoder annotation to provide coder providers that create Coders.
A CoderProvider that uses the @DefaultCoder annotation to provide coder providers that create Coders.
The CoderCloudObjectTranslatorRegistrar containing the default collection of Coder Cloud Object Translators.
Implementation of an ExecutableStageContext.
A default FileBasedSink.FilenamePolicy for windowed and unwindowed files.
Encapsulates constructor parameters to DefaultFilenamePolicy.
A Coder for DefaultFilenamePolicy.Params.
Factory for a default value for Google Cloud region according to https://cloud.google.com/compute/docs/gcloud-compute/#default-properties.
The default way to construct a GoogleAdsClient.
A JobBundleFactory for which the implementation can specify a custom EnvironmentFactory for environment management.
A container for EnvironmentFactory and its corresponding Grpc servers.
Holder for an SdkHarnessClient along with its associated state and data servers.
A PipelineOptionsRegistrar containing the PipelineOptions subclasses available by default.
Construct S3ClientBuilder with default values of S3 client properties like path style access, accelerated mode, etc.
Registers the "s3" uri schema to be handled by S3FileSystem.
The DefaultSchema annotation specifies a SchemaProvider class to handle obtaining a schema and row for the specified class.
SchemaProvider for default schemas.
Registrar for default schemas.
Default global sequence combiner.
This default implementation of BeamSqlTableFilter interface.
A trigger that is equivalent to Repeatedly.forever(AfterWatermark.pastEndOfWindow()).
An interface used with the Default.InstanceFactory annotation to specify the class that will be an instance factory to produce default values for a given getter on PipelineOptions.
A DelegateCoder<T, IntermediateT> wraps a Coder for IntermediateT and encodes/decodes values of type T by converting to/from IntermediateT and then encoding/decoding using the underlying Coder<IntermediateT>.
A CodingFunction<InputT, OutputT> is a serializable function from InputT to OutputT that may throw any Exception.
Implementation of Counter that delegates to the instance for the current context.
Implementation of Distribution that delegates to the instance for the current context.
Implementation of Gauge that delegates to the instance for the current context.
Implementation of Histogram that delegates to the instance for the current context.
Descriptions are used to generate human readable output when the --help command is specified.
Provides a configured Deserializer instance and its associated Coder.
This class processes DetectNewPartitionsDoFn.
This class is responsible for scheduling partitions.
A SplittableDoFn (SDF) that is responsible for scheduling partitions to be queried.
This restriction tracker delegates most of its behavior to an internal TimestampRangeTracker.
Metadata of the progress of DetectNewPartitionsDoFn from the metadata table.
The DicomIO connector allows Beam pipelines to make calls to the Dicom API of the Google Cloud Healthcare API (https://cloud.google.com/healthcare/docs/how-tos#dicom-guide).
This class makes a call to the retrieve metadata endpoint
(https://cloud.google.com/healthcare/docs/how-tos/dicomweb#retrieving_metadata).
Options that can be used to configure the DirectRunner.
A DefaultValueFactory that returns the result of Runtime.availableProcessors() from the DirectOptions.AvailableParallelismFactory.create(PipelineOptions) method.
Registers the DirectOptions.
Registers the DirectRunner.
The result of running a Pipeline with the DirectRunner.
A StreamObserver which uses synchronization on the underlying CallStreamObserver to provide thread safety.
Internal-only options for tweaking the behavior of the PipelineOptions.DirectRunner in ways that users should never do.
Static display data associated with a pipeline component.
Utility to build up display data from a component and its included subcomponents.
Unique identifier for a display data item within a component.
Items are the unit of display data.
Specifies a DisplayData.Item to register as display data.
Structured path of registered display data within a component hierarchy.
Display data type.
Distinct<T> takes a PCollection<T> and returns a PCollection<T> that has all distinct elements of the input.
A Distinct PTransform that uses a SerializableFunction to obtain a representative value for each input element.
A metric that reports information about the distribution of reported values.
Implementation of Distribution.
The result of a Distribution metric.
A PTransform connecting to Cloud DLP (https://cloud.google.com/dlp/docs/libraries) and deidentifying text according to provided settings.
A PTransform connecting to Cloud DLP (https://cloud.google.com/dlp/docs/libraries) and inspecting text for identifying data according to provided settings.
A PTransform connecting to Cloud DLP (https://cloud.google.com/dlp/docs/libraries) and inspecting text for identifying data according to provided settings.
An EnvironmentFactory that creates docker containers by shelling out to docker.
Provider for DockerEnvironmentFactory.
The argument to ParDo providing the code to use to process elements of the input PCollection.
Annotation for declaring that a state parameter is always fetched.
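A minimal sketch of the DoFn/ParDo pair described above. The DoFn, @ProcessElement, @Element and OutputReceiver pieces are the real API; the ExtractLengthFn name and the `words` PCollection in the usage comment are hypothetical.

```java
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

// Emits the length of each input string; applied with ParDo.of(...).
class ExtractLengthFn extends DoFn<String, Integer> {
  @ProcessElement
  public void processElement(@Element String word, OutputReceiver<Integer> out) {
    out.output(word.length());
  }
}

// Usage (the `words` PCollection is hypothetical):
// PCollection<Integer> lengths = words.apply(ParDo.of(new ExtractLengthFn()));
```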
Annotation on a splittable DoFn specifying that the DoFn performs a bounded amount of work per input element, so applying it to a bounded PCollection will also produce a bounded PCollection.
A parameter that is accessible during @StartBundle, @ProcessElement and @FinishBundle that allows the caller to register a callback that will be invoked after the bundle has been successfully completed and the runner has committed the output.
An instance of a function that will be invoked after bundle finalization.
Parameter annotation for the input element for DoFn.ProcessElement, DoFn.GetInitialRestriction, DoFn.GetSize, DoFn.SplitRestriction, DoFn.GetInitialWatermarkEstimatorState, DoFn.NewWatermarkEstimator, and DoFn.NewTracker methods.
Annotation for specifying specific fields that are accessed in a Schema PCollection.
Annotation for the method to use to finish processing a batch of elements.
Annotation for the method that maps an element to an initial restriction for a splittable DoFn.
Annotation for the method that maps an element and restriction to initial watermark estimator state for a splittable DoFn.
Annotation for the method that returns the coder to use for the restriction of a splittable DoFn.
Annotation for the method that returns the corresponding size for an element and restriction pair.
Annotation for the method that returns the coder to use for the watermark estimator state of a splittable DoFn.
Parameter annotation for dereferencing input element key in KV pair.
Receives tagged output for a multi-output function.
Annotation for the method that creates a new RestrictionTracker for the restriction of a splittable DoFn.
Annotation for the method that creates a new WatermarkEstimator for the watermark state of a splittable DoFn.
Annotation for registering a callback for a timer.
Annotation for registering a callback for a timerFamily.
Annotation for the method to use for performing actions on window expiration.
Receives values of the given type.
When used as a return value of DoFn.ProcessElement, indicates whether there is more work to be done for the current element.
Annotation for the method to use for processing elements.
Annotation that may be added to a DoFn.ProcessElement, DoFn.OnTimer, or DoFn.OnWindowExpiration method to indicate that the runner must ensure that the observable contents of the input PCollection or mutable state must be stable upon retries.
Annotation that may be added to a DoFn.ProcessElement method to indicate that the runner must ensure that the observable contents of the input PCollection is sorted by time, in ascending order.
Parameter annotation for the restriction for DoFn.GetSize, DoFn.SplitRestriction, DoFn.GetInitialWatermarkEstimatorState, DoFn.NewWatermarkEstimator, and DoFn.NewTracker methods.
Annotation for the method to use to prepare an instance for processing bundles of elements.
Parameter annotation for the SideInput for a DoFn.ProcessElement method.
Annotation for the method that splits restriction of a splittable DoFn into multiple parts to be processed in parallel.
Annotation for the method to use to prepare an instance for processing a batch of elements.
Annotation for declaring and dereferencing state cells.
Annotation for the method to use to clean up this instance before it is discarded.
Parameter annotation for the TimerMap for a DoFn.ProcessElement method.
Annotation for declaring and dereferencing timers.
Parameter annotation for the input element timestamp for DoFn.ProcessElement, DoFn.GetInitialRestriction, DoFn.GetSize, DoFn.SplitRestriction, DoFn.GetInitialWatermarkEstimatorState, DoFn.NewWatermarkEstimator, and DoFn.NewTracker methods.
Annotation for the method that truncates the restriction of a splittable DoFn into a bounded one.
Annotation on a splittable DoFn specifying that the DoFn performs an unbounded amount of work per input element, so applying it to a bounded PCollection will produce an unbounded PCollection.
Parameter annotation for the watermark estimator state for the DoFn.NewWatermarkEstimator method.
DoFn function.
Flink operator for executing DoFns.
A WindowedValueReceiver that can buffer its outputs.
Implementation of DoFnOperator.OutputManagerFactory that creates an DoFnOperator.BufferedOutputManager that can write to multiple logical outputs by Flink side output.
Common DoFn.OutputReceiver and DoFn.MultiOutputReceiver classes.
DoFnRunner decorator which registers MetricsContainerImpl.
DoFnRunner decorator which registers MetricsContainerImpl.
Represents information about how a DoFn extracts schemas.
The builder object.
Deprecated. Use TestPipeline with the DirectRunner.
Deprecated. Use TestPipeline with the DirectRunner.
A DoubleCoder encodes Double values in 8 bytes using Java serialization.
A transform to drop fields from a schema.
Implementation class for DropFields.
A specialization of
FileBasedSink.DynamicDestinations
for AvroIO
.This class provides the most general way of specifying dynamic BigQuery table destinations.
Some helper classes that derive from
FileBasedSink.DynamicDestinations
.A
Coder
using Google Protocol Buffers binary format.IO to read from and write to DynamoDB tables.
Read data from DynamoDB using
DynamoDBIO.Read.getScanRequestFn()
and emit an element of type DynamoDBIO.Read
for each ScanResponse
using the mapping function DynamoDBIO.Read.getScanResponseMapperFn()
.Write a PCollection of data into DynamoDB.
Transforms for reading and writing data from/to Elasticsearch.
A
BoundedSource
reading from Elasticsearch.A
PTransform
writing Bulk API entities created by ElasticsearchIO.DocToBulk
to
an Elasticsearch cluster.A POJO describing a connection configuration to Elasticsearch.
A
PTransform
converting docs to their Bulk API counterparts.A
PTransform
reading data from Elasticsearch.A POJO encapsulating a configuration for retry behavior when issuing requests to ES.
A
PTransform
writing data to Elasticsearch.Manipulates test data used by the
ElasticsearchIO
integration tests.Pipeline options for elasticsearch tests.
Map to tuple function.
An
EnvironmentFactory
that communicates to a FnHarness
which is executing in the
same process.Provider of EmbeddedEnvironmentFactory.
Passing null values to Spark's Java API may cause problems because of Guava preconditions.
Options for allowing or disallowing filepatterns that match no resources in
FileSystems.match(java.util.List<java.lang.String>)
.A wrapper around a
Throwable
for use with coders.An encoded
BoundedWindow
used within Runners to track window information without needing
to decode the window.Flink
TypeComparator
for Beam values that have been
encoded to byte data by a Coder
.TypeSerializer
for values that were encoded using a Coder
.Flink
TypeInformation
for Beam values that have been encoded to byte data by a Coder
.Encoders
utility class.Encoder / expression utils that are called from generated code.
Represents an error during encoding (serializing) a class.
Represents an error during encoding (serializing) a class.
This
Schema.LogicalType
represent an enumeration over a fixed set of values.This class represents a single enum value.
Creates
environments
which communicate to an SdkHarnessClient
.Provider for a
EnvironmentFactory
and ServerFactory
for the environment.ErrorContainer interface.
An Error Handler is a utility object used for plumbing error PCollections to a configured sink.
Error Handlers must be closed before a pipeline is run to properly pipe error collections to the
sink, and the pipeline will be rejected if any handlers aren't closed.
A default, placeholder error handler that exists to allow usage of .addErrorCollection()
without effects.
The
EvaluationContext
is the result of a pipeline translation
and can be used to evaluate / run the pipeline.The EvaluationContext allows us to define pipeline instructions and translate between
PObject<T>
s or PCollection<T>
s and Ts or DStreams/RDDs of Ts.Classes extending this interface will be called by
OrderedEventProcessor
to examine every
incoming event.The interface that enables querying of a graph of independently executable stages and the inputs
and outputs of those stages.
The context required in order to execute
stages
.Creates
ExecutableStageContext
instances.This operator is the streaming equivalent of the
FlinkExecutableStageFunction
.Drives the execution of a
Pipeline
by scheduling work.The state of the driver.
Options for configuring the
ScheduledExecutorService
used throughout the Java runtime.Returns the default
ScheduledExecutorService
to use within the Apache Beam SDK.A
gRPC Server
for an ExpansionService.A service that allows a pipeline to expand transforms using a remote SDK.
A registrar that creates
TransformProvider
instances from RunnerApi.FunctionSpec
s.Exposes Java transforms via
ExternalTransformRegistrar
.Options used to configure the
ExpansionService
.Loads the ExpansionService config.
Loads the allow list from
ExpansionServiceOptions.getJavaClassLookupAllowlistFile()
, defaulting to an empty
JavaClassLookupTransformProvider.AllowList
.Apache Beam provides a number of experimental features that can be enabled with this flag.
Extracts expressions (function calls, field accesses) from the resolve query nodes, converts them
to RexNodes.
An
EnvironmentFactory
which requests workers via the given URL in the Environment.Provider of ExternalEnvironmentFactory.
Exposes
PubsubIO.Read
as an external transform for cross-language usage.Parameters class to expose the transform to an external SDK.
Does an external sort of the provided values.
ExternalSorter.Options
contains configuration of the sorter.Sorter type.
Provides mechanism for acquiring locks related to the job.
An interface for building a transform from an externally provided configuration.
A registrar which contains a mapping from URNs to available
ExternalTransformBuilder
s.Exposes
PubsubIO.Write
as an external transform for cross-language usage.Parameters class to expose the transform to an external SDK.
A Factory interface for schema-related objects for a specific Java type.
Alternative implementation of
PipelineResult
used to avoid throwing Exceptions in certain
situations.An immutable tuple of value, timestamp, window, and pane.
A coder for
FailsafeValueInSingleWindow
.A generic failure of an SQL transform.
Class FailureCollectorWrapper is a class for collecting ValidationFailure.
A fake implementation of BigQuery's query service.
An implementation of
BigQueryServices.BigQueryServerStream
which takes a List
as the Iterable
to simulate a server stream.A fake dataset service that can be serialized, for use in testReadFromTable.
A fake implementation of BigQuery's job service.
FhirBundleParameter represents a FHIR bundle in JSON format to be executed on a FHIR store.
FhirIO
provides an API for reading and writing resources to Google Cloud Healthcare Fhir API.Deidentify FHIR resources from a FHIR store to a destination FHIR store.
A function that schedules a deidentify operation and monitors the status.
The type Execute bundles.
ExecuteBundlesResult contains both successfully executed bundles and information to help debug failed executions (e.g. metadata & error messages).
Export FHIR resources from a FHIR store to new line delimited json files on GCS or BigQuery.
A function that schedules an export operation and monitors the status.
Writes each bundle of elements to a new-line delimited JSON file on GCS and issues a
fhirStores.import Request for that file.
The enum Content structure.
The type Read.
The type Result.
The type Search.
The type Write.
The type Result.
The enum Write method.
The type FhirIOPatientEverything for querying a FHIR Patient resource's compartment.
PatientEverythingParameter defines required attributes for a FHIR GetPatientEverything request
in
FhirIOPatientEverything
.The Result for a
FhirIOPatientEverything
request.FhirSearchParameter represents the query parameters for a FHIR search request, used as a
parameter for
FhirIO.Search
.FhirSearchParameterCoder is the coder for
FhirSearchParameter
, which takes a coder for
type T.Used inside of a
DoFn
to describe which fields in a schema
type need to be accessed for processing.Description of a single field.
Builder class.
Qualifier for a list selector.
Qualifier for a map selector.
OneOf union for a collection selector.
The kind of qualifier.
Parser for textual field-access selector.
This class provides an empty implementation of
FieldSpecifierNotationListener
,
which can be extended to create a listener which only needs to handle a subset
of the available methods.This class provides an empty implementation of
FieldSpecifierNotationVisitor
,
which can be extended to create a visitor which only needs to handle a subset
of the available methods.This interface defines a complete listener for a parse tree produced by
FieldSpecifierNotationParser
.This interface defines a complete generic visitor for a parse tree produced
by
FieldSpecifierNotationParser
.Utilities for converting between
Schema
field types and TypeDescriptor
s that
define Java objects which can represent these field types.For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
Represents type information for a Java type that will be used to infer a Schema type.
A naming policy for schema fields.
Abstract class for file-based output.
Deprecated.
use
Compression
.A class that allows value-dependent writes in
FileBasedSink
.A naming policy for output files.
Result of a single bundle write.
A coder for
FileBasedSink.FileResult
objects.Provides hints about how to generate output files, such as a suggested filename suffix (e.g.
Implementations create instances of
WritableByteChannel
used by FileBasedSink
and related classes to allow decorating, or otherwise transforming, the raw data that
would normally be written directly to the WritableByteChannel
passed into FileBasedSink.WritableByteChannelFactory.create(WritableByteChannel)
.Abstract operation that manages the process of writing to
FileBasedSink
.Abstract writer that writes a bundle to a
FileBasedSink
.A common base class for all file-based
Source
s.A
reader
that implements code common to readers of
FileBasedSource
s.A given
FileBasedSource
represents a file resource of one of these types.Matcher to verify checksum of the contents of an
ShardedFile
in E2E test.General-purpose transforms for working with files: listing files (matching), reading and writing.
Implementation of
FileIO.match()
.Implementation of
FileIO.matchAll()
.Describes configuration for matching filepatterns, such as
EmptyMatchTreatment
and
continuous watching for matching files.A utility class for accessing a potentially compressed file.
Implementation of
FileIO.readMatches()
.Enum to control how directories are handled.
Specifies how to write elements to individual files in
FileIO.write()
and FileIO.writeDynamic()
.Implementation of
FileIO.write()
and FileIO.writeDynamic()
.A policy for generating names for shard files.
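To show how the FileIO.match(), FileIO.readMatches(), and FileIO.write() entries above fit together, here is a hedged sketch; the file patterns and paths are placeholders, and a Pipeline p plus TextIO from the Beam Java SDK are assumed:

    // Match text files, read their lines, and write the lines back out.
    PCollection<String> lines = p
        .apply(FileIO.match().filepattern("/tmp/input/*.txt"))
        .apply(FileIO.readMatches())
        .apply(TextIO.readFiles());

    lines.apply(FileIO.<String>write()
        .via(TextIO.sink())
        .to("/tmp/output")
        .withSuffix(".txt"));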
Interface that provides a
PTransform
that reads in a PCollection
of FileIO.ReadableFile
s and outputs the data represented as a PCollection
of Row
s.Flink
metrics reporter
for writing
metrics to a file specified via the "metrics.reporter.file.path" config key (assuming an alias of
"file" for this reporter in the "metrics.reporters" setting).File staging related options.
File system interface in Beam.
A registrar that creates
FileSystem
instances from PipelineOptions
.Clients facing
FileSystem
utility.The configuration for building file writing transforms using
SchemaTransform
and SchemaTransformProvider
.Configures extra details related to writing CSV formatted files.
Configures extra details related to writing Parquet formatted files.
Configures extra details related to writing XML formatted files.
Provides a
PTransform
that writes a PCollection
of Row
s and outputs a
PCollection
of the file names according to a registered AutoService
FileWriteSchemaTransformFormatProvider
implementation.FileWriteSchemaTransformFormatProviders
contains FileWriteSchemaTransformFormatProvider
implementations.A
TypedSchemaTransformProvider
implementation for writing a Row
PCollection
to file systems, driven by a FileWriteSchemaTransformConfiguration
.Fill gaps in timeseries.
Argument to withInterpolateFunction function.
A
PTransform
for filtering a collection of schema types.PTransform
s for filtering from a PCollection
the elements satisfying a predicate,
or satisfying an inequality with a given value based on the elements' natural ordering.Implementation of the filter.
Utilities that convert between a SQL filter expression and an Iceberg
Expression
.Builds a MongoDB FindQuery object.
FirestoreIO
provides an API for reading from and writing to Google Cloud
Firestore.FirestoreV1
provides an API which provides lifecycle managed PTransform
s for Cloud Firestore
v1 API.Concrete class representing a
PTransform
<
PCollection
<
BatchGetDocumentsRequest
>,
PTransform
<
BatchGetDocumentsResponse
>>
which will read from Firestore.A type safe builder for
FirestoreV1.BatchGetDocuments
allowing configuration and instantiation.Concrete class representing a
PTransform
<
PCollection
<
Write
>,
PCollection
<
FirestoreV1.WriteFailure
>
which will write to Firestore.A type safe builder for
FirestoreV1.BatchWriteWithDeadLetterQueue
allowing configuration and
instantiation.A type safe builder for
FirestoreV1.BatchWriteWithSummary
allowing configuration and
instantiation.Exception that is thrown if one or more
Write
s is unsuccessful
with a non-retryable status code.Concrete class representing a
PTransform
<
PCollection
<
ListCollectionIdsRequest
>,
PTransform
<
ListCollectionIdsResponse
>>
which will read from Firestore.A type safe builder for
FirestoreV1.ListCollectionIds
allowing configuration and instantiation.Concrete class representing a
PTransform
<
PCollection
<
ListDocumentsRequest
>,
PTransform
<
ListDocumentsResponse
>>
which will read from Firestore.A type safe builder for
FirestoreV1.ListDocuments
allowing configuration and instantiation.Concrete class representing a
PTransform
<
PCollection
<
PartitionQueryRequest
>,
PTransform
<
RunQueryRequest
>>
which will read from Firestore.A type safe builder for
FirestoreV1.PartitionQuery
allowing configuration and instantiation.Type safe builder factory for read operations.
Concrete class representing a
PTransform
<
PCollection
<
RunQueryRequest
>,
PTransform
<
RunQueryResponse
>>
which
will read from Firestore.A type safe builder for
FirestoreV1.RunQuery
allowing configuration and instantiation.Type safe builder factory for write operations.
Failure details for an attempted
Write
.Summary object produced when a number of writes are successfully written to Firestore in a
single BatchWrite.
A LogicalType representing a fixed-length byte array.
Fixed precision numeric types used to represent jdbc NUMERIC and DECIMAL types.
A LogicalType representing a fixed-length string.
A
WindowFn
that windows values into fixed-size timestamp-based windows.PTransform
s for mapping a simple function that returns iterables over the elements of a
PCollection
and merging the results.A
PTransform
that adds exception handling to FlatMapElements
.Flatten<T>
takes multiple PCollection<T>
s bundled into a
PCollectionList<T>
and returns a single PCollection<T>
containing all the elements in
all the input PCollection
s.FlattenIterables<T>
takes a PCollection<Iterable<T>>
and returns a
PCollection<T>
that contains all the elements from each iterable.A
PTransform
that flattens a PCollectionList
into a PCollection
containing all the elements of all the PCollection
s in its input.Jet
Processor
implementation for Beam's Flatten primitive.Jet
Processor
supplier that will provide instances of FlattenP
.An implementation of
TypedSchemaTransformProvider
for Flatten.Flatten translator.
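A minimal sketch of the Flatten entries above, assuming a Pipeline p and two existing PCollection<String>s named first and second (both hypothetical):

    // Merge two PCollections of the same element type into one.
    PCollectionList<String> inputs = PCollectionList.of(first).and(second);
    PCollection<String> merged = inputs.apply(Flatten.pCollections());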
Category tag for tests that use a
Flatten
where the input PCollectionList
contains PCollections with
heterogeneous coders
.Flink
FlatMapFunction
for implementing Window.Assign
.A translator that translates bounded portable pipelines into executable Flink pipelines.
Batch translation context.
Predicate to determine whether a URN is a Flink native transform.
Transform translation interface.
A Flink
Source
implementation that wraps a
Beam BoundedSource
.A Flink
SourceReader
implementation
that reads from the assigned FlinkSourceSplits
by using Beam BoundedReaders
.StateInternals
that uses a Flink OperatorStateBackend
to manage the broadcast
state.Result of a detached execution of a
Pipeline
with Flink.Encapsulates a
DoFn
inside a Flink RichMapPartitionFunction
.Singleton class that contains one
ExecutableStageContext.Factory
per job.Flink operator that passes its input DataSet through an SDK-executed
ExecutableStage
.A Flink function that demultiplexes output from a
FlinkExecutableStageFunction
.Utilities for Flink execution environments.
Explode
WindowedValue
that belongs to multiple windows into multiple "single window"
values
, so we can safely group elements by (K, W) tuples.A map function that outputs the input element without any change.
Job Invoker for the
FlinkRunner
.Driver program that starts a job server for the Flink runner.
Flink runner-specific Configuration for the jobServer.
Utility functions for dealing with key encoding.
Special version of
FlinkReduceFunction
that supports merging windows.Helper class for holding a
MetricsContainerImpl
and forwarding Beam metrics to Flink
accumulators and metrics.The base helper class for holding a
MetricsContainerImpl
and forwarding
Beam metrics to Flink accumulators and metrics.Entry point for starting an embedded Flink cluster.
A
FlatMapFunction
function that filters out those elements that don't belong in this
output.Reduce function for non-merging GBK implementation.
A
StepContext
for Flink Batch Runner execution.This is the first step for executing a
Combine.PerKey
on
Flink.Options which can be used to configure the Flink Runner.
Maximum bundle size factory.
Maximum bundle time factory.
Runs a Pipeline on Flink via
FlinkRunner
.Flink job entry point to launch a Beam pipeline by executing an external SDK driver program.
Interface for portable Flink translators.
A handle used to execute a translated pipeline.
The context used for pipeline translation.
Result of executing a portable
Pipeline
with Flink.Various utilies related to portability.
This is the second part for executing a
Combine.PerKey
on
Flink, the second part is FlinkReduceFunction
.A
PipelineRunner
that executes the operations in the pipeline by first translating them
to a Flink Plan and then executing them either locally or on a Flink cluster, depending on the
configuration.AutoService registrar - will register FlinkRunner and FlinkOptions as possible pipeline runner
services.
Pipeline options registrar.
Pipeline runner registrar.
Result of executing a
Pipeline
with Flink.A
SideInputReader
for the Flink Batch Runner.The base class for
FlinkBoundedSource
and FlinkUnboundedSource
.An abstract implementation of
SourceReader
which encapsulates Beam Sources
for data reading.A Flink
SourceSplit
implementation that encapsulates a Beam Source
.Constructs a StateBackend to use from flink pipeline options.
A
RichGroupReduceFunction
for stateful ParDo
in Flink Batch Runner.StateInternals
that uses a Flink KeyedStateBackend
to manage state.Eagerly create user state to work around https://jira.apache.org/jira/browse/FLINK-12653.
Serializer configuration snapshot for compatibility and format evolution.
Translate an unbounded portable pipeline representation into a Flink pipeline representation.
Predicate to determine whether a URN is a Flink native transform.
Streaming translation context.
A Flink
Source
implementation that wraps a
Beam UnboundedSource
.A Flink
SourceReader
implementation
that reads from the assigned FlinkSourceSplits
by using Beam UnboundedReaders
.A
FloatCoder
encodes Float
values in 4 bytes using Java serialization.A client for the control plane of an SDK harness, which can issue requests to it over the Fn API.
A Fn API control service which adds incoming SDK harness connections to a sink.
A receiver of streamed data.
The
FnDataService
is able to forward inbound elements to a consumer and is also a
consumer of outbound elements.An interface sharing common behavior with services used during execution of user Fns.
A
ClientResponseObserver
which delegates all StreamObserver
calls.Base class for table providers that look up table metadata using full table names, instead of
querying it by parts of the name separately.
A metric that reports the latest value out of reported values.
Implementation of
Gauge
.The result of a
Gauge
metric.Empty
GaugeResult
, representing no values reported.Construct an oauth credential to be used by the SDK and the SDK workers.
A registrar containing the default GCP options.
Options used to configure Google Cloud Platform specific options such as the project and
credentials.
Attempts to infer the default project based upon the environment this application is executing
within.
EnableStreamingEngine defaults to false unless one of the two experiments is set.
Returns the default set of OAuth scopes.
Returns
PipelineOptions.getTempLocation()
as the default GCP temp location.Attempts to load the GCP credentials.
A registrar containing the default GCP options.
This class implements a
SessionServiceFactory
that retrieve the basic authentication
credentials from a Google Cloud Secret Manager secret.An abstract class that contains common configuration options for creating resources.
A builder for
GcsCreateOptions
.AutoService
registrar for the GcsFileSystem
.Options used to configure Google Cloud Storage.
Returns the default
ExecutorService
to use within the Apache Beam SDK.Creates a
GcsOptions.GcsCustomAuditEntries
that holds key-value pairs to be stored as custom information
in GCS audit logs.Creates a
PathValidator
object using the class specified in GcsOptions.getPathValidatorClass()
.Implements the Java NIO
Path
API for Google Cloud Storage paths.GCP implementation of
PathValidator
.ResourceId
implementation for Google Cloud Storage.Utility class for staging files to GCS.
Provides operations on GCS.
This is a
DefaultValueFactory
able to create a GcsUtil
using any transport
flags specified on the PipelineOptions
.A class that holds either a
StorageObject
or an IOException
.Class to generate first set of outputs for
DetectNewPartitionsDoFn
.A
PTransform
that produces longs starting from the given value, and either up to the
given limit or until Long.MAX_VALUE
/ until the given time elapses.Exposes GenerateSequence as an external transform for cross-language usage.
Parameters class to expose the transform to an external SDK.
Sequence generator table provider.
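A minimal sketch of the GenerateSequence transform described above, assuming a Pipeline p:

    // Produces the bounded sequence 0..999.
    PCollection<Long> numbers = p.apply(GenerateSequence.from(0).to(1000));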
Helper to generate a DLQ transform to write PCollection to an external system.
A Provider for generic DLQ transforms that handle deserialization failures.
Deprecated.
new implementations should extend the
GetterBasedSchemaProviderV2
class'
methods which receive TypeDescriptor
s instead of ordinary Class
es as
arguments, which permits supporting generic type signatures during schema inference.Benchmarks for
GetterBasedSchemaProvider
on reading / writing fields based on toRowFunction
/ fromRowFunction
.A newer version of
GetterBasedSchemaProvider
, which works with TypeDescriptor
s,
and which by default delegates the old, Class
based methods, to the new ones.A store to hold the global watermarks for a micro-batch.
A
GlobalWatermarkHolder.SparkWatermarks
holds the watermarks and batch time relevant to a micro-batch input
from a specific source.Advance the WMs onBatchCompleted event.
The default window into which all data is placed (via
GlobalWindows
).GlobalWindow.Coder
for encoding and decoding GlobalWindow
s.A
WindowFn
that assigns all data to the same window.A OIDC web identity token provider implementation that uses the application default credentials
set by the runtime (container, GCE instance, local environment, etc.).
Defines how to construct a
GoogleAdsClient
.GoogleAdsIO
provides an API for reading from the Google Ads API over supported
versions of the Google Ads client libraries.This interface can be used to implement custom client-side rate limiting policies.
Implement this interface to create a
GoogleAdsIO.RateLimitPolicy
.Options used to configure Google Ads API specific options.
Attempts to load the Google Ads credentials.
Constructs and returns
Credentials
to be used by Google Ads API calls.GoogleAdsV19
provides an API to read Google Ads API v19 reports.A
PTransform
that reads the results of a Google Ads query as GoogleAdsRow
objects.A
PTransform
that reads the results of many SearchGoogleAdsStreamRequest
objects as GoogleAdsRow
objects.This rate limit policy wraps a
RateLimiter
and can be used in low volume and
development use cases as a client-side rate limiting policy.These options configure debug settings for Google API clients created within the Apache Beam SDK.
A
GoogleClientRequestInitializer
that adds the trace destination to Google API calls.A
Sink
for Spark's
metric system reporting metrics (including Beam step metrics) to Graphite.A generic grouping transform for schema
PCollection
s.a
PTransform
that does a combine using an aggregation built up by calls to
aggregateField and aggregateFields.a
PTransform
that groups schema elements based on the given fields.a
PTransform
that does a per-key combine using an aggregation built up by calls to
aggregateField and aggregateFields.a
PTransform
that does a global combine using an aggregation built up by calls to
aggregateField and aggregateFields.a
PTransform
that does a global combine using a provided Combine.CombineFn
.A
PTransform
for doing global aggregations on schema PCollections.A FlatMap function that groups by windows in batch mode using
ReduceFnRunner
.GroupByKey<K, V>
takes a PCollection<KV<K, V>>
, groups the values by key and
windows, and returns a PCollection<KV<K, Iterable<V>>>
representing a map from each
distinct key and window of the input PCollection
to an Iterable
over all the
values associated with that key in the input per window.GroupByKey translator.
Traverses the pipeline to populate the candidates for group by key.
GroupBy window function.
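A small sketch of the GroupByKey transform described above; the example data is invented:

    Pipeline p = Pipeline.create();
    PCollection<KV<String, Integer>> scores =
        p.apply(Create.of(KV.of("a", 1), KV.of("a", 2), KV.of("b", 3)));
    PCollection<KV<String, Iterable<Integer>>> grouped =
        scores.apply(GroupByKey.create());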
A set of group/combine functions to apply to Spark
RDD
s.A
ReadableState
cell that combines multiple input values and outputs a single value of a
different type.A
PTransform
that batches inputs to a desired batch size.Wrapper class for batching parameters supplied by users.
Functions for GroupByKey with Non-Merging windows translations to Spark.
An
OffsetRangeTracker
for tracking a growable offset range.Provides the estimated end offset of the range.
A HeaderAccessorProvider which intercepts the header in a GRPC request and exposes the relevant
fields.
A
FnDataService
implemented via gRPC.A
gRPC Server
which manages a single FnService
.An implementation of the Beam Fn Logging Service over gRPC.
An implementation of the Beam Fn State service.
A
DefaultValueFactory
which locates a Hadoop Configuration
.AutoService
registrar for HadoopFileSystemOptions
.AutoService
registrar for the HadoopFileSystem
.A
HadoopFormatIO
is a Transform for reading data from any source or writing data to any
sink which implements Hadoop InputFormat
or OutputFormat
.Bounded source implementation for
HadoopFormatIO
.A
PTransform
that reads from any data source which implements Hadoop InputFormat.A wrapper to allow Hadoop
InputSplit
to be serialized using
Java's standard serialization mechanisms.A
PTransform
that writes to any data sink which implements Hadoop OutputFormat.Builder for External Synchronization defining.
Builder for partitioning determining.
Main builder of Write transformation.
Interface for restrictions for which a default implementation of
DoFn.NewTracker
is available, depending only on the restriction
itself.Interface for watermark estimator state for which a default implementation of
DoFn.NewWatermarkEstimator
is available, depending only on the watermark estimator state itself.Marker interface for
PTransforms
and components to specify display data used
within UIs and diagnostic tools.A Flink combine runner that builds a map of merged windows and produces output after seeing all
input.
Interface for any Spark
Receiver
that supports
reading from and to some offset.A
CoderProviderRegistrar
for standard types used with HBaseIO
.A bounded source and sink for HBase.
A
PTransform
that reads from HBase.Implementation of
HBaseIO.readAll()
.A
PTransform
that writes to HBase.Transformation that writes RowMutation objects to a Hbase table.
Adapter from HCatalog table schema to Beam
Schema
.IO to read and write data using HCatalog.
A
PTransform
to read data using HCatalog.A
PTransform
to write to a HCatalog managed source.Beam SQL table that wraps
HCatalogIO
.Utility classes to enable meta store conf/client creation.
Utilities to convert
HCatRecords
to Rows
.Implementation of
ExternalSynchronization
which registers locks in the HDFS.Interface to access headers in the client request.
Defines a client to communicate with the GCP HCLS API (version v1).
Class for capturing errors on IO operations on Google Cloud Healthcare APIs resources.
Convenience transform to write dead-letter
HealthcareIOError
s to BigQuery TableRow
s.A heartbeat record serves as a notification that the change stream query has returned all changes
for the partition less than or equal to the record timestamp.
This class is part of the process for
ReadChangeStreamPartitionDoFn
SDF.Methods and/or interfaces annotated with
@Hidden
will be suppressed from being output
when --help
is specified on the command-line.A metric that reports information about the histogram of reported values.
HL7v2IO
provides an API for reading from and writing to Google Cloud Healthcare HL7v2 API.The type Read that reads HL7v2 message contents given a PCollection of
HL7v2ReadParameter
.PTransform
to fetch a message from a Google Cloud Healthcare HL7v2 store based on
msgID.DoFn for fetching messages from the HL7v2 store with error handling.
The type Result includes
PCollection
of HL7v2ReadResponse
objects for
successfully read results and PCollection
of HealthcareIOError
objects for
failed reads.List HL7v2 messages in HL7v2 Stores with optional filter.
The type Read that reads HL7v2 message contents given a PCollection of message IDs strings.
PTransform
to fetch a message from a Google Cloud Healthcare HL7v2 store based on
msgID.DoFn for fetching messages from the HL7v2 store with error handling.
The type Result includes
PCollection
of HL7v2Message
objects for successfully
read results and PCollection
of HealthcareIOError
objects for failed reads.The type Write that writes the given PCollection of HL7v2 messages.
The enum Write method.
The type HL7v2 message to wrap the
Message
model.HL7v2ReadParameter represents the read parameters for a HL7v2 read request, used as the input
type for
HL7v2IO.HL7v2Read
.HL7v2ReadResponse represents the response format for a HL7v2 read request, used as the output
type of
HL7v2IO.HL7v2Read
.Coder for
HL7v2ReadResponse
.PTransform
s to compute HyperLogLogPlusPlus (HLL++) sketches on data streams based on the
ZetaSketch implementation.Provides
PTransform
s to extract the estimated count of distinct elements (as
Long
s) from each HLL++ sketch.Provides
PTransform
s to aggregate inputs into HLL++ sketches.Builder for the
HllCount.Init
combining PTransform
.Provides
PTransform
s to merge HLL++ sketches into a new sketch.HTTP client configuration for both sync and async AWS clients.
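The HllCount entries above compose as Init, optionally MergePartial, then Extract. A hedged sketch, assuming a hypothetical PCollection<Long> named userIds:

    // Build HLL++ sketches and extract the estimated distinct count.
    PCollection<byte[]> sketches = userIds.apply(HllCount.Init.forLongs().globally());
    PCollection<Long> approxDistinct = sketches.apply(HllCount.Extract.globally());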
A client that talks to the Cloud Healthcare API through HTTP requests.
The type FhirResourcePagesIterator for methods which return paged output.
Wraps
HttpResponse
in an exception with a statusCode field for use with HealthcareIOError
.The type Hl7v2 message id pages iterator.
SchemaTransform implementation for
IcebergIO.readRows(org.apache.beam.sdk.io.iceberg.IcebergCatalogConfig)
.A connector that reads and writes to Apache Iceberg
tables.
SchemaTransform implementation for
IcebergIO.readRows(org.apache.beam.sdk.io.iceberg.IcebergCatalogConfig)
.A table provider for Iceberg tables.
Utilities for converting between Beam and Iceberg types, made public for user's convenience.
SchemaTransform implementation for
IcebergIO.writeRows(org.apache.beam.sdk.io.iceberg.IcebergCatalogConfig)
.A generator of unique IDs.
Common
IdGenerator
implementations.For internal use only; no backwards-compatibility guarantees.
Flink input format that implements impulses.
Jet
Processor
implementation for Beam's Impulse primitive.A
SourceFunc
which executes the impulse transform contract.Source function which sends a single global impulse to a downstream operator.
Impulse translator.
Exception thrown by
WindowFn.verifyCompatibility(WindowFn)
if two compared WindowFns are
not compatible, including the explanation of incompatibility.A
ProcessFunction
which is not a functional interface.IO to read and write from InfluxDB.
A POJO describing a DataSourceConfiguration such as URL, userName and password.
A
PTransform
to read from InfluxDB metric or data related to query.A
PTransform
to write to an InfluxDB datasource.A DoFn responsible for initializing the metadata table and preparing it for managing the state of the
pipeline.
A DoFn responsible for initializing the change stream Connector.
Utility class to determine initial partition constants and methods.
States to initialize a pipeline outputted by
InitializeDoFn
.Holds user state in memory.
A InMemoryJobService that prepares and runs jobs on behalf of a client using a
JobInvoker
.A
MetaStore
which stores the meta info in memory.A
InMemoryMetaTableProvider
is an abstract TableProvider
for in-memory types.A retry policy for streaming BigQuery inserts.
Contains information about a failed insert.
Kafka
Deserializer
for Instant
.Kafka
Serializer
for Instant
.Interface for any function that can handle a Fn API
BeamFnApi.InstructionRequest
.Signifies that a publicly accessible API (public class, method or field) is intended for internal
use only and not for public consumption.
An implementation of
BoundedWindow
that represents an interval from IntervalWindow.start
(inclusive) to IntervalWindow.end
(exclusive).Encodes an
IntervalWindow
as a pair of its upper bound and duration.Exception thrown when the configuration for a
SchemaIO
is invalid.Exception thrown when the configuration for a
SchemaIO
is invalid.Exception thrown when the schema for a
SchemaIO
is invalid.Exception thrown when the request for a table is invalid, such as invalid metadata.
IS_INF(X)
An Ism file is a prefix encoded composite key value file broken into shards.
The footer stores the relevant information required to locate the index and bloom filter.
A
Coder
for IsmFormat.Footer
.A record containing a composite key and either a value or metadata.
A
Coder
for IsmFormat.IsmRecord
s.A shard descriptor containing shard id, the data block offset, and the index offset for the
given shard.
A coder for
IsmFormat.IsmShard
s.The prefix used before each key which contains the number of shared and unshared bytes from the
previous key that was read.
A
Coder
for IsmFormat.KeyPrefix
.A coder for metadata key component.
IS_NAN(X)
An abstract base class with functionality for assembling a
Coder
for a class that
implements Iterable
.A
SchemaProvider
for Java Bean objects.FieldValueTypeSupplier
that's based on getter methods.FieldValueTypeSupplier
that's based on setter methods.A set of utilities to generate getter and setter classes for JavaBean objects.
An implementation of
TypedSchemaTransformProvider
for Explode.A
SchemaTransform
for Explode.A
SchemaProvider
for Java POJO objects.FieldValueTypeSupplier
that's based on public fields.An implementation of
TypedSchemaTransformProvider
for Filter for the java language.A
SchemaTransform
for Filter-java.An implementation of
TypedSchemaTransformProvider
for MapToFields for the java language.A
SchemaTransform
for MapToFields-java.Loads
UdfProvider
implementations from user-provided jars.A coder for JAXB annotated objects.
A class that manages a connection to a Solace broker using basic authentication.
Beam JDBC Connection.
Calcite JDBC driver with Beam defaults.
IO to read and write data on JDBC.
A POJO describing a
DataSource
, either providing directly a DataSource
or all
properties allowing to create a DataSource
.Wraps a
JdbcIO.DataSourceConfiguration
to provide a DataSource
.This is the default
Predicate
we use to detect DeadLock.Wraps a
JdbcIO.DataSourceConfiguration
to provide a PoolingDataSource
.An interface used by the JdbcIO Write to set the parameters of the
PreparedStatement
used to setParameters into the database.Implementation of
JdbcIO.read()
.Implementation of
JdbcIO.readAll()
.Implementation of
JdbcIO.readRows()
.Builder used to help with retry configuration for
JdbcIO
.An interface used to control if we retry the statements when a
SQLException
occurs.An interface used by
JdbcIO.Read
for converting each row of the ResultSet
into
an element of the resulting PCollection
.An interface used by the JdbcIO Write to set the parameters of the
PreparedStatement
used to setParameters into the database.This class is used as the default return value of
JdbcIO.write()
.A
PTransform
to write to a JDBC datasource.A
PTransform
to write to a JDBC datasource.An implementation of
SchemaTransformProvider
for
reading from JDBC connections using JdbcIO
.A helper for
JdbcIO.ReadWithPartitions
that handles range calculations.An implementation of
SchemaIOProvider
for reading and writing JSON payloads with JdbcIO
.Provides utility functions for working with
JdbcIO
.The result of writing a row to JDBC datasource.
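A hedged sketch of the JdbcIO.read() entry above; the driver class, connection URL, and query are placeholders, and a Pipeline p is assumed:

    PCollection<KV<Integer, String>> users = p.apply(
        JdbcIO.<KV<Integer, String>>read()
            .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
                "org.postgresql.Driver", "jdbc:postgresql://localhost:5432/mydb")) // placeholder
            .withQuery("SELECT id, name FROM users")                               // placeholder
            .withCoder(KvCoder.of(VarIntCoder.of(), StringUtf8Coder.of()))
            .withRowMapper(rs -> KV.of(rs.getInt("id"), rs.getString("name"))));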
An implementation of
SchemaTransformProvider
for
writing to JDBC connections using JdbcIO
.Jet specific
MetricResults
.Jet specific implementation of
MetricsContainer
.Pipeline options specific to the Jet runner.
Jet specific implementation of
PipelineResult
.Jet specific implementation of Beam's
PipelineRunner
.Registers the
JetPipelineOptions
.Registers the
JetRunner
.An unbounded source for JMS destinations (queues or topics).
An interface used by
JmsIO.Read
for converting each jms Message
into an element
of the resulting PCollection
.A
PTransform
to read from a JMS destination.A
PTransform
to write to a JMS queue.JmsRecord contains message payload of the record as well as metadata (JMS headers and
properties).
A factory that has all job-scoped information, and can be combined with stage-scoped information
to create a
StageBundleFactory
.A subset of
ProvisionApi.ProvisionInfo
that
specifies a unique job, while omitting fields that are not known to the runner operator.Internal representation of a Job which has been invoked (prepared and run) by a client.
Factory to create
JobInvocation
instances.A job that has been prepared, but not invoked.
Shared code for starting and serving an
InMemoryJobService
.Configuration for the jobServer.
Utility class with different versions of joins.
A transform that performs equijoins across two schema
PCollection
s.Predicate object to specify fields to compare when doing an equi-join.
Implementation class for FieldsEqual.
PTransform representing a full outer join of two collections of KV elements.
Implementation class.
PTransform representing an inner join of two collections of KV elements.
PTransform representing a left outer join of two collections of KV elements.
PTransform representing a right outer join of two collections of KV elements.
This is a class to catch the built join and check if it is a legal join before passing it to the
actual RelOptRuleCall.
This is a function gets the output relation and checks if it is a legal relational node.
PTransform
s for reading and writing JSON files.PTransform
for writing JSON files.Matcher to compare a string or byte[] representing a JSON Object, independent of field order.
A
FileReadSchemaTransformFormatProvider
that reads newline-delimited JSONs.The result of a
JsonToRow.withExceptionReporting(Schema)
transform.Utils to convert JSON records to Beam
Row
.A
FileWriteSchemaTransformFormatProvider
for JSON format.An implementation of
TypedSchemaTransformProvider
for JsonIO.write(java.lang.String)
.Configuration for writing to BigQuery with Storage Write API.
Builder for
JsonWriteTransformProvider.JsonWriteConfiguration
.A service interface for defining one-time initialization of the JVM during pipeline execution.
Helpers for executing
JvmInitializer
implementations.Checkpoint for a
KafkaUnboundedReader
.A tuple to hold topic, partition, and offset that comprise the checkpoint for a single
partition.
A
PTransform
that commits offsets of KafkaRecord
.An unbounded source and a sink for Kafka topics.
A
PTransform
to read from Kafka topics.Exposes
KafkaIO.TypedWithoutMetadata
as an external transform for cross-language
usage.Parameters class to expose the Read transform to an external SDK.
A
PTransform
to read from KafkaSourceDescriptor
.A
PTransform
to read from Kafka topics.A
PTransform
to write to a Kafka topic with KVs.Exposes
KafkaIO.Write
as an external transform for cross-language usage.Parameters class to expose the Write transform to an external SDK.
A
PTransform
to write to a Kafka topic with ProducerRecords.Initialize KafkaIO feature flags on worker.
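A hedged sketch of the KafkaIO.Read entry above; the broker address and topic are placeholders, and a Pipeline p is assumed:

    PCollection<KV<Long, String>> records = p.apply(
        KafkaIO.<Long, String>read()
            .withBootstrapServers("broker-1:9092")            // placeholder broker
            .withTopic("events")                              // placeholder topic
            .withKeyDeserializer(LongDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            .withoutMetadata());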
Utility methods for translating
KafkaIO
transforms to and from RunnerApi
representations.Common utility functions and default configurations for
KafkaIO.Read
and KafkaIO.ReadSourceDescriptors
.Stores and exports metrics for a batch of Kafka Client RPCs.
Metrics of a batch of RPCs.
No-op implementation of
KafkaResults
.An interface for providing custom timestamp for elements written to Kafka.
Configuration for reading from a Kafka topic.
Builder for the
KafkaReadSchemaTransformConfiguration
.KafkaRecord contains key and value of the record as well as metadata for the record (topic name,
partition id, and offset).
Coder
for KafkaRecord
.Helper class to create per worker metrics for Kafka Sink stages.
Quick Overview
Represents a Kafka source description.
Kafka table provider.
This is a copy of Kafka's
TimestampType
.A keyed implementation of a
BufferingElementsHandler
.An immutable tuple of keyed
PCollections
with key type K.A utility class to help ensure coherence of tag and input PCollection types.
Keys<K>
takes a PCollection
of KV<K, V>
s and returns a
PCollection<K>
of the keys.Thrown when the Kinesis client was throttled due to rate limits.
IO to read from Kinesis streams.
Implementation of
KinesisIO.read()
.Configuration of Kinesis record aggregation.
Implementation of
KinesisIO.write()
.Result of
KinesisIO.write()
.PipelineOptions for
KinesisIO
.A registrar containing the default
KinesisIOOptions
.Kinesis interface for custom partitioner.
An explicit partitioner that always returns a
Nonnull
explicit hash key.KinesisClientRecord
enhanced with utility methods.Exposes
KinesisIO.Write
and KinesisIO.Read
as an external transform for cross-language
usage.A bounded source and sink for Kudu.
An interface used by the KuduIO Write to convert an input record into an Operation to apply as
a mutation in Kudu.
Implementation of
KuduIO.read()
.A
PTransform
that writes to Kudu.An immutable key/value pair.
A
Comparator
that orders KVs
by the natural ordering of their keys.A
Comparator
that orders KVs
by the natural ordering of their values.A
KvCoder
encodes KV
s.KvSwap<K, V>
takes a PCollection<KV<K, V>>
and returns a PCollection<KV<V,
K>>
, where all the keys and values have been swapped.KeySelector
that retrieves a key from a KV
.Util class for building/parsing labeled
MetricName
.Builder class for a labeled
MetricName
.Category tags for tests which validate that a Beam runner can handle keys up to a given size.
Tests if a runner supports 100KB keys.
Tests if a runner supports 100MB keys.
Tests if a runner supports 10KB keys.
Tests if a runner supports 10MB keys.
Tests if a runner supports 1MB keys.
HttpRequestInitializer for recording request to response latency of Http-based API calls.
Combine.CombineFn
that wraps an AggregateFn
.A
Coder
which is able to take any existing coder and wrap it such that it is only invoked
in the outer context
.Utilities for replacing or wrapping unknown coders with
LengthPrefixCoder
.Standard collection of metrics used to record source and sinks information for lineage tracking.
Lineage metrics resource types.
A
FileReadSchemaTransformFormatProvider
that reads lines as Strings.AutoService
registrar for the LocalFileSystem
.Helper functions for producing a
ResourceId
that references a local file or directory.An implementation of
TypedSchemaTransformProvider
for Logging.A
SchemaTransform
for logging.A logical endpoint is a pair of an instruction ID corresponding to the
BeamFnApi.ProcessBundleRequest
and the transform within the processing graph.A consumer of
Beam Log Entries
.Pipeline visitor that fills lookup table of
PTransform
to AppliedPTransform
for
usage in FlinkBatchPortablePipelineTranslator.BatchTranslationContext
.Top-level
PTransform
s that build and instantiate turnkey
transforms.A Factory which creates
ManagedChannel
instances.A ManagedFactory produces instances and tears down any produced instances when it is itself
closed.
Pipeline options to tune DockerEnvironment.
Register the
ManualDockerEnvironmentOptions
.A
WatermarkEstimator
which is controlled manually from within a DoFn
.A
ControlClientPool
backed by a client map.PTransform
s for mapping a simple function over the elements of a PCollection
.A
PTransform
that adds exception handling to MapElements
.MapKeys
maps a SerializableFunction<K1,K2>
over keys of a
PCollection<KV<K1,V>>
and returns a PCollection<KV<K2, V>>
.This interface allows you to implement a custom mapper to read and persist elements from/to
Cassandra.
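For the MapElements, MapKeys, and MapValues entries above, a minimal sketch using a SimpleFunction; the example data is invented:

    Pipeline p = Pipeline.create();
    PCollection<String> words = p.apply(Create.of("beam", "flink", "spark"));
    PCollection<Integer> lengths = words.apply(
        MapElements.via(new SimpleFunction<String, Integer>() {
          @Override
          public Integer apply(String word) {
            return word.length();
          }
        }));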
Factory class for creating instances that will map a struct to a connector model.
Util class for mapping plugins.
A
ReadableState
cell mapping keys to values.Map to tuple function.
MapValues
maps a SerializableFunction<V1,V2>
over values of a
PCollection<KV<K,V1>>
and returns a PCollection<KV<K, V2>>
.The result of
FileSystem.match(java.util.List<java.lang.String>)
.MatchResult.Metadata
of a matched file.Builder class for
MatchResult.Metadata
.Status of a
MatchResult
.For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
Represents the
PrimitiveViewT
supplied to the ViewFn
when it declares to use
the iterable materialization
.Represents the
PrimitiveViewT
supplied to the ViewFn
when it declares to use
the multimap materialization
.PTransform
s for computing the maximum of the elements in a PCollection
, or the
maximum of the values associated with each key in a PCollection
of KV
s.PTransform
s for computing the arithmetic mean (a.k.a.Options that are used to control the Memory Monitor.
For internal use only; no backwards compatibility guarantees.
Base class for publishing messages to a Solace broker.
Interface for receiving messages from a Solace broker.
A
Coder
for MatchResult.Metadata
.This class generates a SpannerConfig for the change stream metadata database by copying only the
necessary fields from the SpannerConfig of the primary database.
Data access object for creating and dropping the metadata table.
Data access object for managing the state of the metadata Bigtable table.
Helper methods that simplifies some conversion and extraction of metadata table content.
The interface to handle CRUD of
BeamSql
table metadata.Marker interface for all user-facing metrics.
Implements matching for metrics filters.
Metrics are keyed by the step name they are associated with and the name of the metric.
The name of a metric consists of a
MetricName.getNamespace()
and a MetricName.getName()
.The name of a metric.
The results of a query for metrics.
The results of a single current metric.
Methods for interacting with the metrics of a pipeline that has been executed.
Helper for pretty-printing
Flink metrics
.The
Metrics
is a utility class for producing various kinds of metrics for reporting
properties of an executing pipeline.Accumulator of
MetricsContainerStepMap
.For resilience,
Accumulators
are required to be wrapped in a Singleton.AccumulatorV2
for Beam metrics captured in MetricsContainerStepMap
.Spark Listener which checkpoints
MetricsContainerStepMap
values for fault-tolerance.Holds the metrics for a single step.
AccumulatorV2
implementation for MetricsContainerStepMap
.Manages and provides the metrics container associated with each thread.
Set the
MetricsContainer
for the associated MetricsEnvironment
.Simple POJO representing a filter for querying metrics.
Builder for creating a
MetricsFilter
.Extension of
PipelineOptions
that defines MetricsSink
specific options.A
DefaultValueFactory
that obtains the class of the NoOpMetricsSink
if it
exists on the classpath, and throws an exception otherwise.Interface for all metric sinks.
A
Source
that accommodates Spark's micro-batch oriented nature and wraps an UnboundedSource
.A timestamp represented as microseconds since the epoch.
PTransform
s for computing the minimum of the elements in a PCollection
, or the
minimum of the values associated with each key in a PCollection
of KV
s.Represents a modification in a table emitted within a
DataChangeRecord
.Represents the type of modification applied in the
DataChangeRecord
.IO to read and write data on MongoDB GridFS.
Encapsulate the MongoDB GridFS connection logic.
Interface for the parser that is used to parse the GridFSDBFile into the appropriate types.
Callback for the parser to use to submit data.
A
PTransform
to read data from MongoDB GridFS.A
BoundedSource
for MongoDB GridFS.A
PTransform
to write data to MongoDB GridFS.Function that is called to write the data to the give GridFS OutputStream.
IO to read and write data on MongoDB.
A
PTransform
to read data from MongoDB.A
PTransform
to write to a MongoDB database.Configures
Metric
s throughout various features of RequestResponseIO
.A helper class for monitoring jobs submitted to the service.
An interface that can be used for defining callbacks to receive a list of JobMessages
containing monitoring information.
A handler that logs monitoring messages.
Comparator for sorting rows in increasing order based on timestamp.
An object that configures
FileSystems.copy(java.util.List<org.apache.beam.sdk.io.fs.ResourceId>, java.util.List<org.apache.beam.sdk.io.fs.ResourceId>, org.apache.beam.sdk.io.fs.MoveOptions...)
, FileSystems.rename(java.util.List<org.apache.beam.sdk.io.fs.ResourceId>, java.util.List<org.apache.beam.sdk.io.fs.ResourceId>, org.apache.beam.sdk.io.fs.MoveOptions...)
, and FileSystems.delete(java.util.Collection<org.apache.beam.sdk.io.fs.ResourceId>, org.apache.beam.sdk.io.fs.MoveOptions...)
.Defines the standard
MoveOptions
.An unbounded source for MQTT broker.
A POJO describing an MQTT connection.
A
PTransform
to read from an MQTT broker.A
PTransform
to write and send a message to an MQTT server.A container class for MQTT message metadata, including the topic name and payload.
DoFunctions ignore outputs that are not the main output.
A
ReadableState
cell mapping keys to bags of values.Mutable state mutates when events apply to it.
A bundle of mutations that must be submitted atomically.
A duration represented in nanoseconds.
A timestamp represented as nanoseconds since the epoch.
Category for integration tests that require Docker.
Category tag for validation tests which utilize
TestPipeline
for execution and expect to
be executed by a PipelineRunner
.This is a Beam IO to read from, and write data to, Neo4j.
This describes all the information needed to create a Neo4j
Session
.Wraps a
Neo4jIO.DriverConfiguration
to provide a Driver
.This is the class which handles the work behind the
Neo4jIO.readAll()
method.An interface used by
Neo4jIO.ReadAll
for converting each row of a Neo4j Result
record
Record
into an element of the resulting PCollection
.This is the class which handles the work behind the
Neo4jIO.writeUnwind()
method.A
Trigger
which never fires.The actual trigger class for
Never
triggers.Represent new partition as a result of splits and merges.
NFA
is an implementation of non-deterministic finite automata.This is a utility class to represent rowCount, rate and window.
This is a metadata used for row count and rate estimation.
Handler API.
A non-keyed implementation of a
BufferingElementsHandler
.Abstract base class for
WindowFns
that do not merge windows.A no-op implementation of Counter.
Construct an oauth credential to be used by the SDK and the SDK workers.
A no-op implementation of Histogram.
For internal use only; no backwards compatibility guarantees.
A
StepContext
for Spark Batch Runner execution.doc.
Synchronously compute the earliest partition watermark by delegating the call to PartitionMetadataDao#getUnfinishedMinWatermark().
Indicates that we are missing a schema for a type.
A
NullableCoder
encodes nullable values of type T
using a nested Coder<T>
that does not tolerate null
values.A
HttpRequestInitializer
for requests that don't have credentials.NoOp implementation of a size estimator.
NoOp implementation of a throughput estimator.
Reference counting object pool to easily share & destroy objects.
Client pool to easily share AWS clients per configuration.
A
BoundedSource
that uses offsets to define starting and ending positions.A
Source.Reader
that implements code common to readers of all OffsetBasedSource
s.A restriction represented by a range of integers [from, to).
A coder for
OffsetRange
s.A
RangeTracker
for non-negative positions of type long
.A
RestrictionTracker
for claiming offsets in an OffsetRange
in a monotonically
increasing fashion.A logical type representing a union of fields.
Represents a single OneOf value.
Describes an order.
Transform for processing ordered events.
The result of the ordered processing.
A
ReadableState
cell containing a list of values sorted by timestamp.Parent class for Ordered Processing configuration handlers.
Parent class for Ordered Processing configuration handlers to handle processing of the events
where global sequence is used.
Indicates the status of ordered processing for a particular key.
The
OrderKey
class stores the information to sort a column.A
Trigger
that executes according to its main trigger until its "finally" trigger fires.Creates factories which determine an underlying
StreamObserver
implementation to use in
to interact with fn execution APIs.Creates an outbound observer for the given inbound observer.
A factory that can create output receivers during an executable stage.
A representation used by
Step
s to reference the
output of other Step
s.Output tag filter.
Helper routines for packages.
Provides information about the pane an element belongs to.
A Coder for encoding PaneInfo instances.
Enumerates the possibilities for the timing of this pane firing related to the input and output
watermarks for its computation.
ParDo
is the core element-wise transform in Apache Beam, invoking a user-specified
function on each of the elements of the input PCollection
to produce zero or more output
elements, all of which are collected into the output PCollection
.A
PTransform
that, when applied to a PCollection<InputT>
, invokes a
user-specified DoFn<InputT, OutputT>
on all its elements, which can emit elements to
any of the PTransform
's output PCollection
s, which are bundled into a result
PCollectionTuple
.A
PTransform
that, when applied to a PCollection<InputT>
, invokes a
user-specified DoFn<InputT, OutputT>
on all its elements, with all its outputs
collected into an output PCollection<OutputT>
.ParDo translator.
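The ParDo.MultiOutput entry above is typically used together with TupleTags. A hedged sketch; the tag names, the empty-string condition, and the input PCollection<String> named input are invented:

    final TupleTag<String> validTag = new TupleTag<String>() {};
    final TupleTag<String> invalidTag = new TupleTag<String>() {};

    PCollectionTuple results = input.apply(
        ParDo.of(new DoFn<String, String>() {
          @ProcessElement
          public void process(@Element String value, MultiOutputReceiver out) {
            if (value.isEmpty()) {
              out.get(invalidTag).output(value);   // route empty values to the side output
            } else {
              out.get(validTag).output(value);     // main output
            }
          }
        }).withOutputTags(validTag, TupleTagList.of(invalidTag)));

    PCollection<String> valid = results.get(validTag);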
A
PTransformOverrideFactory
that provides overrides for applications of a ParDo
in the direct runner.Jet
Processor
implementation for Beam's ParDo primitive (when no
user-state is being used).Jet
Processor
supplier that will provide instances of ParDoP
.A function to handle stateful processing in Apache Beam's SparkRunner.
An iterator implementation that processes timers from
SparkTimerInternals
.IO to read and write Parquet files.
Implementation of
ParquetIO.parseGenericRecords(SerializableFunction)
.Implementation of
ParquetIO.parseFilesGenericRecords(SerializableFunction)
.Implementation of
ParquetIO.read(Schema)
.Implementation of
ParquetIO.readFiles(Schema)
.Implementation of
ParquetIO.sink(org.apache.avro.Schema)
.TableProvider
for ParquetIO
for consumption by Beam SQL.A
FileWriteSchemaTransformFormatProvider
for Parquet format.Exception thrown when Beam SQL is unable to parse the statement.
PTransform
for parsing JSON Strings
.The result of parsing a single file with Tika: contains the file's location, metadata, extracted
text, and optionally an error.
Partition
takes a PCollection<T>
and a PartitionFn
, uses the
PartitionFn
to split the elements of the input PCollection
into N
partitions,
and returns a PCollectionList<T>
that bundles N
PCollection<T>
s
containing the split elements.A function object that chooses an output partition for an element.
A function object that chooses an output partition for an element.
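A small sketch of the Partition transform and its PartitionFn described above; the modulo scheme and the input PCollection<Integer> named numbers are invented:

    // Split into 3 partitions by value modulo 3.
    PCollectionList<Integer> parts = numbers.apply(
        Partition.of(3, new Partition.PartitionFn<Integer>() {
          @Override
          public int partitionFor(Integer elem, int numPartitions) {
            return elem % numPartitions;
          }
        }));
    PCollection<Integer> remainderZero = parts.get(0);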
A partition end record serves as a notification that the client should stop reading the
partition.
This class is part of the process for
ReadChangeStreamPartitionDoFn
SDF.A partition event record describes key range changes for a change stream partition.
This class is part of the process for
ReadChangeStreamPartitionDoFn
SDF.A
WindowFn
that places each value into exactly one window based on its timestamp and
never merges windows.Model for the partition metadata database table used in the Connector.
Partition metadata builder for better user experience.
The state at which a partition can be in the system:
CREATED: the partition has been created, but no query has been done against it yet.
Data access object for creating and dropping the partition metadata table.
Data access object for the Connector metadata tables.
Represents the execution of a read / write transaction in Cloud Spanner.
Represents a result from executing a Cloud Spanner read / write transaction.
This class is responsible for transforming a
Struct
to a PartitionMetadata
.Configuration for a partition metadata table.
There can be a race when many splits and merges happen to a single partition in quick succession.
Output result of
DetectNewPartitionsDoFn
containing
information required to stream a partition.A partition start record serves as a notification that the client should schedule the partitions
to be queried.
This class is part of the process for ReadChangeStreamPartitionDoFn SDF.
An assertion on the contents of a
PCollection
incorporated into the pipeline.Default transform to check that a PAssert was successful.
A transform that applies an assertion-checking function over iterables of
ActualT
to
the entirety of the contents of its input.A transform that applies an assertion-checking function to the sole element of a
PCollection
.Builder interface for assertions applicable to iterables and PCollection contents.
Check that the passed-in matchers match the existing data.
An assertion checker that takes a single
PCollectionView<ActualT>
and an assertion over ActualT
, and checks it within a Beam pipeline.Track the place where an assertion is defined.
An
PAssert.IterableAssert
about the contents of a PCollection
.An assert about the contents of each
PCollection
in the given PCollectionList
.Builder interface for assertions applicable to a single value.
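A hedged sketch of how the PAssert assertions described above are typically used with TestPipeline in a JUnit test; the sample elements are illustrative.

```java
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;
import org.junit.Rule;
import org.junit.Test;

public class PAssertSketch {
  @Rule public final transient TestPipeline p = TestPipeline.create();

  @Test
  public void containsExpectedElements() {
    PCollection<String> output = p.apply(Create.of("a", "b", "c"));
    // The assertion is itself part of the pipeline and is checked when the pipeline runs.
    PAssert.that(output).containsInAnyOrder("c", "b", "a");
    p.run().waitUntilFinish();
  }
}
```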
A base class for LogicalTypes that use the same Java type as the underlying base type.
For internal use only; no backwards compatibility guarantees.
PatternCondition
stores the function to decide whether a row is a match of a single
pattern.A
PCollection<T>
is an immutable collection of values of type
T
.The enumeration of cases for whether a
PCollection
is bounded.A
PCollectionList<T>
is an immutable list of homogeneously typed
PCollection<T>s
.A
PCollectionRowTuple
is an immutable tuple of PCollections
,
"keyed" by a string tag.A
PCollectionTuple
is an immutable tuple of heterogeneously-typed PCollections
, "keyed" by TupleTags
.A
PCollectionView<T>
is an immutable view of a PCollection
as a value of type T
that can be accessed as a side input to a ParDo
transform.For internal use only; no backwards compatibility guarantees.
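A hedged sketch of using a PCollectionView as a side input to a ParDo, per the PCollectionView entry above; the max-length filter and method names are assumptions for illustration only.

```java
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.Max;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionView;

public class SideInputSketch {
  // Keeps only words no longer than the globally computed maximum length.
  static PCollection<String> keepShortWords(
      PCollection<String> words, PCollection<Integer> lengths) {
    // Materialize the maximum length as a singleton view usable as a side input.
    PCollectionView<Integer> maxLenView = lengths.apply(Max.integersGlobally().asSingletonView());
    return words.apply(
        ParDo.of(
                new DoFn<String, String>() {
                  @ProcessElement
                  public void processElement(ProcessContext c) {
                    if (c.element().length() <= c.sideInput(maxLenView)) {
                      c.output(c.element());
                    }
                  }
                })
            .withSideInputs(maxLenView));
  }
}
```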
Implementation which is able to adapt a multimap materialization to an in-memory
List<T>
.Implementation which is able to adapt an iterable materialization to an in-memory
List<T>
.Implementation which is able to adapt a multimap materialization to an in-memory
Map<K,
V>
.Implementation which is able to adapt an iterable materialization to an in-memory
Map<K,
V>
.Implementation which is able to adapt a multimap materialization to an in-memory
Map<K,
Iterable<V>>
.Implementation which is able to adapt an iterable materialization to an in-memory
Map<K,
Iterable<V>>
.Implementation which is able to adapt an iterable materialization to a
List<T>
.Deprecated.
Implementation which is able to adapt an iterable materialization to a
Iterable<T>
.Deprecated.
Implementation which is able to adapt a multimap materialization to a
List<T>
.Deprecated.
Implementation which is able to adapt a multimap materialization to a
Map<K, V>
.Deprecated.
Implementation which is able to adapt a multimap materialization to a
Map<K,
Iterable<V>>
.A class for
PCollectionView
implementations, with additional type parameters that are
not visible at pipeline assembly time when the view is used as a side input.Deprecated.
Implementation which is able to adapt an iterable materialization to a
T
.Stores values or metadata about values.
A coder for
PCollectionViews.ValueOrMetadata
.PCollectionView translator.
A
PTransform
which produces a sequence of elements at fixed runtime intervals.A
PTransform
which generates a sequence of timestamped elements at given runtime
intervals.The interface for things that might be input to a
PTransform
.A
Pipeline
manages a directed acyclic graph of PTransforms
, and the
PCollections
that the PTransforms
consume and produce.For internal use only; no backwards-compatibility guarantees.
Control enum for indicating whether or not a traversal should process the contents of a
composite transform or not.
Default no-op
Pipeline.PipelineVisitor
that enters all composite transforms.Handles failures in the form of exceptions.
PipelineOptions are used to configure Pipelines.
DefaultValueFactory
which supplies an ID that is guaranteed to be unique within the
given process.Enumeration of the possible states for a given check.
A
DefaultValueFactory
that obtains the class of the DirectRunner
if it exists
on the classpath, and throws an exception otherwise.Returns a normalized job name constructed from
ApplicationNameOptions.getAppName()
, the
local system user name (if available), the current time, and a random integer.Returns a user agent string constructed from
ReleaseInfo.getName()
and ReleaseInfo.getVersion()
, in the format [name]/[version]
.Constructs a
PipelineOptions
or any derived interface that is composable to any other
derived interface of PipelineOptions
via the PipelineOptions.as(java.lang.Class<T>)
method.A fluent
PipelineOptions
builder.PipelineOptions
creators have the ability to automatically have their PipelineOptions
registered with this SDK by creating a ServiceLoader
entry and a
concrete implementation of this interface.Validates that the
PipelineOptions
conforms to all the Validation
criteria.Result of
Pipeline.run()
.Possible job states, for both completed and ongoing jobs.
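A hedged sketch of the usual construction path through PipelineOptionsFactory, Pipeline, and the PipelineResult returned by Pipeline.run(), all described in the entries above:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class OptionsSketch {
  public static void main(String[] args) {
    // Parse command-line flags into PipelineOptions, validating registered options.
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
    Pipeline pipeline = Pipeline.create(options);
    // ... apply transforms here ...
    PipelineResult result = pipeline.run();
    result.waitUntilFinish(); // blocks until the job reaches a terminal state
  }
}
```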
A
PipelineRunner
runs a Pipeline
.The pipeline translator translates a Beam
Pipeline
into a Spark correspondence, that can
then be evaluated.Shared, mutable state during the translation of a pipeline and omitted afterwards.
Unresolved translation, allowing optimization of the generated Spark DAG.
PipelineTranslator
for executing a Pipeline
in Spark in batch mode.Utilities for pipeline translation.
Class wrapper for a CDAP plugin.
Builder class for a
Plugin
.Class for getting any filled
PluginConfig
configuration object.Class for CDAP plugin constants.
Format types.
Format provider types.
Hadoop types.
Plugin types.
A set of utilities to generate getter and setter classes for POJOs.
PortablePipelineRunner
that bundles the input pipeline along with all dependencies,
artifacts, etc.Contains common code for writing and reading portable pipeline jars.
Pipeline options common to all portable runners.
Result of a portable
PortablePipelineRunner.run(RunnerApi.Pipeline, JobInfo)
.Runs a portable Beam pipeline on some execution engine.
Registrar for the portable runner.
A DoFn class to gather metrics about the emitted
DataChangeRecord
s.The interface for things that might be output from a
PTransform
.An
Iterable
that returns PrefetchableIterator
s.This class contains static utility functions that operate on or return objects of type
PrefetchableIterable
.A default implementation that caches an iterator to be returned when
PrefetchableIterables.Default.prefetch()
is
invoked.Iterator
that supports prefetching the next set of records.Prepare an input
PCollection
for writing to BigQuery.A
PTransformOverrideFactory
that produces PrimitiveParDoSingleFactory.ParDoSingle
instances from ParDo.SingleOutput
instances.A single-output primitive
ParDo
.A translator for
PrimitiveParDoSingleFactory.ParDoSingle
.Registers the
PrismPipelineOptions
and TestPrismPipelineOptions
.A
PipelineRunner
executed on Prism.Utility methods for creating
BeamFnApi.ProcessBundleDescriptor
instances.A container type storing references to the key, value, and window
Coder
used when
handling bag user state requests.A container type storing references to the value, and window
Coder
used when handling
side input state requests.A container type storing references to the key, timer and payload coders and the remote input
destination used when handling timer requests.
Environment for process-based execution.
An
EnvironmentFactory
which forks processes based on the parameters in the Environment.Provider of ProcessEnvironmentFactory.
A function that computes an output value of type
OutputT
from an input value of type
InputT
and is Serializable
.A simple process manager which forks processes and kills them if necessary.
Coder
for ProducerRecord
.A
ProjectionConsumer
is a Schema
-aware operation (such as a DoFn
or PTransform
) that
has a FieldAccessDescriptor
describing which fields the
operation accesses.A factory for operations that execute a projection on a
Schema
-aware PCollection
.Constant property names used by the SDK in CloudWorkflow specifications.
A
CoderProviderRegistrar
for standard types used with Google Protobuf.Utility class for working with Protocol Buffer (Proto) data.
A
Coder
using Google Protocol Buffers binary format.ProtoDomain is a container class for Protobuf descriptors.
A set of
Schema.LogicalType
classes to represent protocol buffer types.A Fixed32 type.
A Fixed64 type.
A SFixed32 type.
An SFixed64 type.
A SInt32 type.
A SInt64 type.
A UInt32 type.
A UInt64 type.
Helpers for implementing the "Provider" pattern.
Options needed for a Pub/Sub Lite Publisher.
This class is required to handle callbacks from Solace, to find out if messages were actually
published or whether any kind of error occurred.
An (abstract) helper class for talking to Pubsub via an underlying transport.
A message received from Pubsub.
A message to be sent to Pubsub.
Path representing a cloud project id.
Factory for creating clients.
Path representing a Pubsub schema.
Path representing a Pubsub subscription.
Path representing a Pubsub topic.
A
CoderProviderRegistrar
for standard types used with PubsubIO
.A helper class for talking to Pubsub via grpc.
Read and Write
PTransform
s for Cloud Pub/Sub streams.Class representing a Cloud Pub/Sub Subscription.
Class representing a Cloud Pub/Sub Topic.
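A hedged sketch of the PubsubIO read and write transforms described above; the project, subscription, and topic names are placeholders, not real resources.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.values.PCollection;

public class PubsubSketch {
  static void readAndForward(Pipeline p) {
    // Placeholder resource names; substitute real project/subscription/topic paths.
    PCollection<String> messages =
        p.apply(
            PubsubIO.readStrings()
                .fromSubscription("projects/example-project/subscriptions/example-sub"));
    messages.apply(PubsubIO.writeStrings().to("projects/example-project/topics/example-topic"));
  }
}
```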
Implementation of read methods.
Implementation of write methods.
A Pubsub client using JSON transport.
I/O transforms for reading from Google Pub/Sub Lite.
A sink which publishes messages to Pub/Sub Lite.
Pub/Sub Lite table provider.
Class representing a Pub/Sub message.
A coder for PubsubMessage treating the raw bytes being decoded as the message's payload.
Common util functions for converting between PubsubMessage proto and
PubsubMessage
.Provides a
SchemaCoder
for PubsubMessage
, including the topic and all fields of a
PubSub message from server.A coder for PubsubMessage including all fields of a PubSub message from server.
A coder for PubsubMessage including attributes and the message id from the PubSub server.
A coder for PubsubMessage including attributes.
A coder for PubsubMessage treating the raw bytes being decoded as the message's payload, with the
message id from the PubSub server.
A coder for PubsubMessage including the topic from the PubSub server.
Properties that can be set when using Google Cloud Pub/Sub with the Apache Beam SDK.
Configuration for reading from Pub/Sub.
An implementation of
TypedSchemaTransformProvider
for Pub/Sub reads configured using
PubsubReadSchemaTransformConfiguration
.An implementation of
SchemaIOProvider
for reading and writing JSON/AVRO payloads with
PubsubIO
.TableProvider
for PubsubIO
for consumption by Beam SQL.A (partial) implementation of
PubsubClient
for use by unit tests.Closing the factory will validate all expected messages were processed.
A PTransform which streams messages to Pubsub.
Users should use PubsubIO#read instead.
Configuration for writing to Pub/Sub.
An implementation of
TypedSchemaTransformProvider
for Pub/Sub reads configured using
PubsubWriteSchemaTransformConfiguration
.Class for reading and writing from Apache Pulsar.
Class representing a Pulsar Message record.
For internal use.
For internal use.
For internal use.
A logical type for PythonCallableSource objects.
Wrapper for invoking external Python transforms.
Pipeline options for
PythonExternalTransform
.A registrar for
PythonExternalTransformOptions
.Wrapper for invoking external Python
Map
transforms.Utility to bootstrap and start a Beam Python service.
The
Quantifier
class is intended for storing the information of the quantifier for a
pattern variable.Main action class for querying a partition change stream.
An interface that planners should implement to convert sql statement to
BeamRelNode
or
SqlNode
.Converts a resolved Zeta SQL query represented by a tree to corresponding Calcite representation.
QueryTrait.
A IO to publish or consume messages with a RabbitMQ broker.
A
PTransform
to consume messages from RabbitMQ server.A
PTransform
to publish messages to a RabbitMQ server.It contains the message payload, and additional metadata like routing key or attributes.
An implementation of a client-side throttler that enforces a gradual ramp-up, broadly in line
with Datastore best practices.
An elastic-sized byte array which allows you to manipulate it as a stream, or access it directly.
A
Coder
which encodes the valid parts of this stream.A
Comparator
that compares two byte arrays lexicographically.A
RangeTracker
is a thread-safe helper object for implementing dynamic work rebalancing
in position-based BoundedSource.BoundedReader
subclasses.Implement this interface to create a
RateLimitPolicy
.Default rate limiter that throttles reading from a shard using an exponential backoff if the
response is empty or if the consumer is throttled by AWS.
This corresponds to an integer union tag and value.
A
PTransform
for reading from a Source
.PTransform
that reads from a BoundedSource
.Helper class for building
Read
transforms.PTransform
that reads from a UnboundedSource
.A
Coder
for FileIO.ReadableFile
.A
State
that can be read via ReadableState.read()
.For internal use only; no backwards-compatibility guarantees.
Reads each file in the input
PCollection
of FileIO.ReadableFile
using given parameters
for splitting files into offset ranges and for creating a FileBasedSource
for a file.A class to handle errors which occur during file reads.
Reads each file of the input
PCollection
and outputs each element as the value of a
KV
, where the key is the filename from which that value came.Parameters class to expose the transform to an external SDK.
This class is part of
ReadChangeStreamPartitionDoFn
SDF.A SDF (Splittable DoFn) class which is responsible for performing a change stream query for a
given partition.
RestrictionTracker used by
ReadChangeStreamPartitionDoFn
to keep
track of the progress of the stream and to split the restriction for runner initiated
checkpoints.This restriction tracker delegates most of its behavior to an internal
TimestampRangeTracker
.Util for invoking
Source.Reader
methods that might require a MetricsContainerImpl
to be active.Transform for reading from Apache Pulsar.
A
ReadOnlyTableProvider
provides in-memory read only set of BeamSqlTable
BeamSqlTables
.Encapsulates a spanner read operation.
Source translator.
doc.
This
DoFn
reads Cloud Spanner 'information_schema.*' tables to build the SpannerSchema
.Helper class for source operations.
Class for building an instance for
Receiver
that uses Apache Beam mechanisms instead of
Spark environment.A
PTransform
using the Recommendations AI API (https://cloud.google.com/recommendations).A
PTransform
connecting to the Recommendations AI API
(https://cloud.google.com/recommendations) and creating CatalogItem
s.A
PTransform
connecting to the Recommendations AI API
(https://cloud.google.com/recommendations) and creating UserEvent
s.The RecommendationAIIO class acts as a wrapper around the
s that interact with
the Recommendation AI API (https://cloud.google.com/recommendations).
invalid reference
PTransform
A
PTransform
using the Recommendations AI API (https://cloud.google.com/recommendations).A
PTransform
using the Recommendations AI API (https://cloud.google.com/recommendations).This class just transforms to PublishResult to be able to capture the windowing with the right
strategy.
Helper Class based on
Row
, it provides Metadata associated with each Record when reading
from file(s) using ContextualTextIO
.RedisConnectionConfiguration
describes and wraps a connectionConfiguration to Redis
server or cluster.An IO to manipulate Redis key/value database.
Implementation of
RedisIO.read()
.Implementation of
RedisIO.readKeyPatterns()
A
PTransform
to write to a Redis server.Determines the method used to insert data in Redis.
A
PTransform
to write stream key pairs (https://redis.io/topics/streams-intro) to a
Redis server.A family of
PTransforms
that returns a PCollection
equivalent to its
input but functions as an operational hint to a runner that redistributing the data in some way
is likely useful.Noop transform that hints to the runner to try to redistribute the work evenly, or via whatever
clever strategy the runner comes up with.
Registers translators for the Redistribute family of transforms.
ExecutableStageContext.Factory
which counts ExecutableStageContext reference for book
keeping.Interface for creator which extends Serializable.
A set of reflection helper methods.
Represents a class and a schema.
Represents a type descriptor and a schema.
PTransform
s to use Regular Expressions to process elements in a PCollection
.Regex.MatchesName<String>
takes a PCollection<String>
and returns a
PCollection<List<String>>
representing the value extracted from all the Regex groups of the
input PCollection
to the number of times that element occurs in the input.Regex.Find<String>
takes a PCollection<String>
and returns a
PCollection<String>
representing the value extracted from the Regex groups of the input
PCollection
to the number of times that element occurs in the input.Regex.Find<String>
takes a PCollection<String>
and returns a
PCollection<List<String>>
representing the value extracted from the Regex groups of the input
PCollection
to the number of times that element occurs in the input.Regex.MatchesKV<KV<String, String>>
takes a PCollection<String>
and returns a
PCollection<KV<String, String>>
representing the key and value extracted from the Regex
groups of the input PCollection
to the number of times that element occurs in the
input.Regex.Find<String>
takes a PCollection<String>
and returns a
PCollection<String>
representing the value extracted from the Regex groups of the input
PCollection
to the number of times that element occurs in the input.Regex.MatchesKV<KV<String, String>>
takes a PCollection<String>
and returns a
PCollection<KV<String, String>>
representing the key and value extracted from the Regex
groups of the input PCollection
to the number of times that element occurs in the
input.Regex.Matches<String>
takes a PCollection<String>
and returns a
PCollection<String>
representing the value extracted from the Regex groups of the input
PCollection
to the number of times that element occurs in the input.Regex.MatchesKV<KV<String, String>>
takes a PCollection<String>
and returns a
PCollection<KV<String, String>>
representing the key and value extracted from the Regex
groups of the input PCollection
to the number of times that element occurs in the
input.Regex.MatchesName<String>
takes a PCollection<String>
and returns a
PCollection<String>
representing the value extracted from the Regex groups of the input
PCollection
to the number of times that element occurs in the input.Regex.MatchesNameKV<KV<String, String>>
takes a PCollection<String>
and returns
a PCollection<KV<String, String>>
representing the key and value extracted from the
Regex groups of the input PCollection
to the number of times that element occurs in the
input.Regex.ReplaceAll<String>
takes a PCollection<String>
and returns a
PCollection<String>
with all Strings that matched the Regex being replaced with the
replacement string.Regex.ReplaceFirst<String>
takes a PCollection<String>
and returns a
PCollection<String>
with the first Strings that matched the Regex being replaced with the
replacement string.Regex.Split<String>
takes a PCollection<String>
and returns a
PCollection<String>
with the input string split into individual items in a list.Hamcrest matcher to assert a string matches a pattern.
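A hedged sketch of two of the Regex transforms described above; the pattern strings and the assumed PCollection<String> of lines are illustrative only.

```java
import org.apache.beam.sdk.transforms.Regex;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

public class RegexSketch {
  static void examples(PCollection<String> lines) {
    // Keep only lines that are entirely alphabetic.
    PCollection<String> words = lines.apply(Regex.matches("[A-Za-z]+"));
    // Extract "key:value" pairs using capture groups 1 and 2.
    PCollection<KV<String, String>> pairs = lines.apply(Regex.findKV("(\\w+):(\\w+)", 1, 2));
  }
}
```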
PTransforms
for converting between explicit and implicit form of various Beam
values.This transforms turns a side input into a singleton PCollection that can be used as the main
input for another transform.
Simple
Function
to bring the windowing information into the value from the implicit
background representation of the PCollection
.This is the implementation of NodeStatsMetadata.
A bundle capable of handling input data elements for a
bundle descriptor
by
forwarding them to a remote environment for processing.A handle to an available remote
RunnerApi.Environment
.A
RemoteEnvironment
which uses the default RemoteEnvironment.close()
behavior.Options that are used to control configuration of the remote environment.
Register the
RemoteEnvironmentOptions
.An execution-time only
RunnerApi.PTransform
which represents an SDK harness reading from a BeamFnApi.RemoteGrpcPort
.An execution-time only
RunnerApi.PTransform
which represents a write from within an SDK harness to
a BeamFnApi.RemoteGrpcPort
A pair of Coder and BeamFnApi.Target which specifies the arguments to a FnDataService to send data to a remote harness.A pair of
Coder
and FnDataReceiver
which can be registered to receive elements
for a LogicalEndpoint
.A transform for renaming fields inside an existing schema.
The class implementing the actual PTransform.
A
Trigger
that fires according to its subtrigger forever.PTransform
for reading from and writing to Web APIs.Describes the run-time requirements of a
Contextful
, such as access to side inputs.For internal use only; no backwards compatibility guarantees.
Implementation of
Reshuffle.viaRandomKey()
.For internal use only; no backwards compatibility guarantees.
An object that configures
ResourceId.resolve(java.lang.String, org.apache.beam.sdk.io.fs.ResolveOptions)
.Defines the standard resolve options.
Provides a definition of a resource hint known to the SDK.
Pipeline authors can use resource hints to provide additional information to runners about the
desired aspects of the execution environment.
Options that are used to control configuration of the remote environment.
Register the
ResourceHintsOptions
.An identifier which represents a file-like resource.
A
Coder
for ResourceId
.A utility to test
ResourceId
implementations.An interrupter for restriction tracker of type T.
Manages access to the restriction and keeps track of its claimed part for a splittable
DoFn
.All
RestrictionTracker
s SHOULD implement this interface to improve auto-scaling and
splitting performance.A representation for the amount of known completed and remaining work.
A representation of the truncate result.
Support utilities for interacting with
RestrictionTrackers
.Interface allowing a runner to observe the calls to
RestrictionTracker.tryClaim(PositionT)
.A class that manages retrying of callables based on the exceptions they throw.
Configuration of the retry behavior for AWS SDK clients.
Implements a request initializer that adds retry handlers to all HttpRequests.
Models a Cassandra token range.
Row
is an immutable tuple-like schema to represent one element in a PCollection
.Builder for
Row
.Builder for
Row
that bases a row on another row.Bundle of rows according to the configured
Factory
as input for benchmarks.A sub-class of SchemaCoder that can only encode
Row
instances.Translator for row coders.
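A hedged sketch of building a Schema and a matching Row, per the Row and Row.Builder entries above; the field names and values are assumptions.

```java
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.values.Row;

public class RowSketch {
  // Illustrative schema; fields are not taken from this index.
  static final Schema PERSON_SCHEMA =
      Schema.builder().addStringField("name").addInt32Field("age").build();

  // A row whose values line up positionally with the schema's fields.
  static final Row PERSON =
      Row.withSchema(PERSON_SCHEMA).addValues("Ada", 36).build();
}
```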
A convenience class for applying row updates to BigQuery using
BigQueryIO.applyRowMutations()
.This class indicates how to apply a row update to BigQuery.
A selector interface for extracting fields from a row.
A Concrete subclass of
Row
that delegates to a set of provided FieldValueGetter
s.Concrete subclass of
Row
that explicitly stores all fields of the row.Quality of Service manager options for Firestore RPCs.
Mutable Builder class for creating instances of
RpcQosOptions
.Wrapper for invoking external Python
RunInference
.Construct S3ClientBuilder from S3 pipeline options.
Object used to configure
S3FileSystem
.AutoService
registrar for the S3FileSystem
.A registrar that creates
S3FileSystemConfiguration
instances from PipelineOptions
.Options used to configure Amazon Web Services S3.
Provide the default s3 upload buffer size in bytes: 64MB if more than 512MB in RAM are
available and 5MB otherwise.
PTransform
s for taking samples of the elements in a PCollection
, or samples of
the values associated with each key in a PCollection
of KV
s.CombineFn
that computes a fixed-size sample of a collection of values.Classes that represent various SBE semantic types.
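A hedged sketch of the Sample transforms described just above; the sample sizes and the assumed input PCollections are arbitrary.

```java
import org.apache.beam.sdk.transforms.Sample;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

public class SampleSketch {
  static void examples(PCollection<String> words, PCollection<KV<String, Long>> counts) {
    // A single fixed-size sample of up to 10 elements from the whole collection.
    PCollection<Iterable<String>> globalSample = words.apply(Sample.fixedSizeGlobally(10));
    // Up to 3 sampled values per key.
    PCollection<KV<String, Iterable<Long>>> perKeySample = counts.apply(Sample.fixedSizePerKey(3));
  }
}
```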
Representation of SBE's LocalMktDate.
Represents SBE's TimeOnly composite type.
Represents SBE's TZTimestamp composite type.
Represents SBE's uint16 type.
Represents SBE's uint32 type.
Represents SBE's uint64 type.
Represents SBE's uint8 type.
Representation of SBE's UTCDateOnly.
Represents SBE's UTCTimeOnly composite type.
Represents SBE's UTCTimestamp composite type.
Represents an SBE schema.
Options for configuring schema generation from an
Ir
.Builder for
SbeSchema.IrOptions
.Utilities for easier interoperability with the Spark Scala API.
A scalar function that can be executed as part of a SQL query.
Annotates the single method in a
ScalarFn
implementation that is to be applied to SQL
function arguments.Reflection-based implementation logic for
ScalarFn
.Beam-customized version from
ScalarFunctionImpl
, to
address BEAM-5921.Builder class for building
Schema
objects.Control whether nullable is included in equivalence check.
Field of a row.
Builder for
Schema.Field
.A descriptor of a single field type.
A LogicalType allows users to define a custom schema type.
An enumerated list of type constructors.
A wrapper for a
GenericRecord
and the TableSchema
representing the schema of the
table (or query) it was generated from.Each IO in Beam has one table schema, by extending
SchemaBaseBeamTable
.SchemaCoder
is used as the coder for types that have schemas registered.Translator for Schema coders.
Can be put on a constructor or a static method, in which case that constructor or method will be
used to create instances of the class by Beam's schema code.
When used on a POJO field or a JavaBean getter, that field or getter is ignored from the inferred
schema.
Provides an instance of
ConvertHelpers.ConvertedSchemaInformation
.An abstraction to create schema capable and aware IOs.
Provider to create
SchemaIO
instances for use in Beam SQL and other SDKs.A general
TableProvider
for IOs for consumption by Beam SQL.A schema represented as a serialized proto bytes.
Concrete implementations of this class allow creation of schema service objects that vend a
Schema
for a specific type.SchemaProvider
creators have the ability to automatically have their schemaProvider
registered with this SDK by creating a ServiceLoader
entry
and a concrete implementation of this interface.An abstraction representing schema capable and aware transforms.
Provider to create
SchemaTransform
instances for use in Beam SQL and other SDKs.A
PTransformTranslation.TransformPayloadTranslator
implementation that translates between a Java SchemaTransform
and a protobuf payload for that transform.Utility methods for translating schemas.
A creator interface for user types that have schemas.
Provides utility functions for working with Beam
Schema
types.A
JdbcIO.RowMapper
implementation that converts JDBC
results into Beam Row
objects.A set of utility functions for schemas.
Visitor that zips schemas, and accepts pairs of fields and their types.
Context referring to a current position in a schema.
KeySelector
that retrieves a key from a KV<KV<element, KV<restriction,
watermarkState>>, size>
.A high-level client for an SDK harness.
Options that are used to control configuration of the SDK harness.
The default implementation which detects how much memory to use for a process wide cache.
A
DefaultValueFactory
which constructs an instance of the class specified by maxCacheMemoryUsageMbClass
to compute the maximum amount of
memory to allocate to the process wide cache within an SDK harness instance.The set of log levels that can be used in the SDK harness.
Specifies the maximum amount of memory to use within the current SDK harness instance.
Defines a log level override for a specific class, package, or name.
A
PTransform
for selecting a subset of fields from a schema type.A
PTransform
representing a flattened schema.Helper methods to select subrows out of rows.
A class to execute requests to SEMP v2 with Basic Auth authentication.
This interface defines methods for interacting with a Solace message broker using the Solace
Element Management Protocol (SEMP).
This interface serves as a blueprint for creating SempClient objects, which are used to interact
with a Solace message broker using the Solace Element Management Protocol (SEMP).
Default accumulator used to combine sequence ranges.
Util methods to help with serialization / deserialization.
A union of the
BiConsumer
and Serializable
interfaces.A union of the
BiFunction
and Serializable
interfaces.A
Coder
for Java classes that implement Serializable
.A
CoderProviderRegistrar
which registers a CoderProvider
which can handle
serializable types.A
Comparator
that is also Serializable
.A wrapper to allow Hadoop
Configuration
s to be serialized using Java's standard
serialization mechanisms.A function that computes an output value of type
OutputT
from an input value of type
InputT
, is Serializable
, and does not allow checked exceptions to be declared.Useful
SerializableFunction
overrides.A wrapper around
Ir
that fulfils Java's Serializable
contract.A
Matcher
that is also Serializable
.Static class for building and using
SerializableMatcher
instances.SerializableRexFieldAccess.
SerializableRexInputRef.
SerializableRexNode.
SerializableRexNode.Builder.
A
gRPC server
factory.Creates a
gRPC Server
using the default server factory.Factory that constructs client-accessible URLs from a local server address and port.
A
WindowFn
that windows values into sessions separated by periods with no input for at
least the duration specified by Sessions.getGapDuration()
.The SessionService interface provides a set of methods for managing a session with the Solace
messaging system.
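A hedged sketch of applying the Sessions WindowFn described a few entries above; the 10-minute gap and the keyed input type are arbitrary choices.

```java
import org.apache.beam.sdk.transforms.windowing.Sessions;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

public class SessionsSketch {
  // Groups each key's elements into sessions separated by at least 10 minutes of inactivity.
  static PCollection<KV<String, Integer>> sessionize(PCollection<KV<String, Integer>> events) {
    return events.apply(
        Window.<KV<String, Integer>>into(Sessions.withGapDuration(Duration.standardMinutes(10))));
  }
}
```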
This abstract class serves as a blueprint for creating `SessionServiceFactory` objects.
The
PTransform
s that allow computing different set functions across PCollection
s.A
ReadableState
cell containing a set of elements.Provided by user and called within
DoFn.Setup
and DoFn.Teardown lifecycle methods of Call
's DoFn
.A key and a shard number.
Function for assigning
ShardedKey
s to input elements for sharded WriteFiles
.Standard shard naming templates.
Broadcast helper for side inputs.
BroadcastVariableInitializer
that initializes the broadcast input as a Map
from
window to side input.Metadata class for side inputs in Spark runner.
Utility class for creating and managing side input readers in the Spark runner.
SideInputValues
serves as a Kryo serializable container that contains a materialized view
of side inputs.General
SideInputValues
for BoundedWindows
in two possible
states.Specialized
SideInputValues
for use with the GlobalWindow
in two possible
states.Factory function for load
SideInputValues
from a Dataset
.A
SerializableFunction
which is not a functional interface.A specialized
ConstantInputDStream
that emits its RDD exactly once.Deprecated.
replace with a
DefaultJobBundleFactory
when appropriate if the EnvironmentFactory
is a DockerEnvironmentFactory
, or create an
InProcessJobBundleFactory
and inline the creation of the environment if appropriate.IO to read and write data on SingleStoreDB.
A POJO describing a SingleStoreDB
DataSource
by providing all properties needed to
create it.A
PTransform
for reading data from SingleStoreDB.A
PTransform
for reading data from SingleStoreDB.An interface used by
SingleStoreIO.Read
and SingleStoreIO.ReadWithPartitions
for converting each row of the
ResultSet
into an element of the resulting PCollection
.A RowMapper that provides a Coder for resulting PCollection.
A RowMapper that requires initialization.
An interface used by the SingleStoreIO
SingleStoreIO.Read
to set the parameters of the PreparedStatement
.An interface used by the SingleStoreIO
SingleStoreIO.Write
to map a data from each element of PCollection
to a List of Strings.A
PTransform
for writing data to SingleStoreDB.Configuration for reading from SingleStoreDB.
An implementation of
TypedSchemaTransformProvider
for SingleStoreDB read jobs configured
using SingleStoreSchemaTransformReadConfiguration
.Configuration for writing to SingleStoreDB.
An implementation of
TypedSchemaTransformProvider
for SingleStoreDB write jobs configured
using SingleStoreSchemaTransformWriteConfiguration
.Singleton keyed word item.
Singleton keyed work item coder.
A Flink combine runner takes elements pre-grouped by window and produces output after seeing all
input.
Standard Sink Metrics.
This class is used to estimate the size in bytes of a given element.
PTransform
s to compute the estimate frequency of each element in a stream.Implements the
Combine.CombineFn
of SketchFrequencies
transforms.Implementation of
SketchFrequencies.globally()
.Implementation of
SketchFrequencies.perKey()
Wraps StreamLib's Count-Min Sketch to support counting all user types by hashing the encoded
user type using the supplied deterministic coder.
A
LogWriter
which uses an SLF4J Logger
as the underlying log backend.A
WindowFn
that windows values into possibly overlapping fixed-size timestamp-based
windows.Wraps an existing coder with Snappy compression.
This is an AutoValue representation of an Iceberg
Snapshot
.Class for preparing configuration for batch write and read.
Implementation of
SnowflakeServices.BatchService
used in production.POJO describing single Column within Snowflake Table.
Interface for data types to provide SQLs for themselves.
IO to read and write data on Snowflake.
Combines list of
String
to provide one String
with paths where files were
staged for write.Interface for user-defined function mapping parts of CSV line into T.
A POJO describing a
DataSource
, providing all properties needed to create a DataSource
.Wraps
SnowflakeIO.DataSourceConfiguration
to provide DataSource.Implementation of
SnowflakeIO.read()
.Removes temporary staged files after reading.
Parses
String
from incoming data in PCollection
to have proper format for CSV
files.Interface for user-defined function mapping T into array of Objects.
Implementation of
SnowflakeIO.write()
.Interface which defines common methods for interacting with Snowflake.
Class for preparing configuration for streaming write.
Implementation of
SnowflakeServices.StreamingService
used in production.POJO representing schema of Table in Snowflake.
Exposes
SnowflakeIO.Read
and SnowflakeIO.Write
as an external transform for
cross-language usage.IO to send notifications via SNS.
Implementation of
SnsIO.write()
.Creates a
SocketAddress
based upon a supplied string.Provides core data models and utilities for working with Solace messages in the context of Apache
Beam pipelines.
The correlation key is an object that is passed back to the client during the event broker ack
or nack.
Represents a Solace message destination (either a Topic or a Queue).
Represents a Solace destination type.
The result of writing a message to Solace.
Represents a Solace queue.
Represents a Solace message record with its associated metadata.
A utility class for mapping
BytesXMLMessage
instances to Solace.Record
objects.Represents a Solace topic.
Checkpoint for an unbounded Solace source.
A
PTransform
to read and write from/to Solace event
broker.The
SolaceIO.Write
transform's output return this type, containing the successful
publishes (SolaceOutput.getSuccessfulPublish()
).Transforms for reading and writing data from/to Solr.
A POJO describing a connection configuration to Solr.
A
PTransform
reading data from Solr.A POJO describing a replica of Solr.
A POJO encapsulating a configuration for retry behavior when issuing requests to Solr.
A
PTransform
writing data to Solr.A Flink combine runner that first sorts the elements by window and then does one pass that merges
windows and outputs results.
SortValues<PrimaryKeyT, SecondaryKeyT, ValueT>
takes a PCollection<KV<PrimaryKeyT,
Iterable<KV<SecondaryKeyT, ValueT>>>>
with elements consisting of a primary key and iterables
over <secondary key, value>
pairs, and returns a PCollection<KV<PrimaryKeyT,
Iterable<KV<SecondaryKeyT, ValueT>>>
of the same elements but with values sorted by a secondary
key.Base class for defining input formats and creating a
Source
for reading the input.The interface that readers of custom input sources must implement.
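A hedged sketch of the SortValues transform described several entries above; it assumes the sorter extension module is on the classpath and that the input has already been grouped by primary key.

```java
import org.apache.beam.sdk.extensions.sorter.BufferedExternalSorter;
import org.apache.beam.sdk.extensions.sorter.SortValues;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

public class SortValuesSketch {
  // Sorts each primary key's values by the secondary (String) key using an external sorter.
  static PCollection<KV<String, Iterable<KV<String, Integer>>>> sortBySecondaryKey(
      PCollection<KV<String, Iterable<KV<String, Integer>>>> grouped) {
    return grouped.apply(
        SortValues.<String, String, Integer>create(BufferedExternalSorter.options()));
  }
}
```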
Wrapper for executing a
Source
as a Flink InputFormat
.InputSplit
for SourceInputFormat
.Standard
Source
Metrics.Classes implementing Beam
Source
RDD
s.A
SourceRDD.Unbounded
is the implementation of a micro-batch in a SourceDStream
.This class can be used as a mapper for each
SourceRecord
retrieved.SourceRecordJson
implementation.Interface used to map a Kafka source record.
Helper functions and test harnesses for checking correctness of
Source
implementations.Expected outcome of
BoundedSource.BoundedReader.splitAtFraction(double)
.Manages lifecycle of
DatabaseClient
and Spanner
instances.Configuration for a Cloud Spanner client.
Builder for
SpannerConfig
.Reading from Cloud Spanner
A
PTransform
that create a transaction.A builder for
SpannerIO.CreateTransaction
.A failure handling strategy.
Implementation of
SpannerIO.read()
.Implementation of
SpannerIO.readAll()
.Interface to display the name of the metadata table on Dataflow UI.
A
PTransform
that writes Mutation
objects to Google Cloud Spanner.Same as
SpannerIO.Write
but supports grouped mutations.A provider for reading from Cloud Spanner using a Schema Transform Provider.
Encapsulates Cloud Spanner Schema.
Exception to signal that Spanner schema retrieval failed.
Exposes
SpannerIO.WriteRows
and SpannerIO.ReadRows
as an external transform for
cross-language usage.The results of a
SpannerIO.write()
transform.An implementation of
Window.Assign
for the Spark runner.Translates a bounded portable pipeline into a Spark job.
Predicate to determine whether a URN is a Spark native transform.
A Spark
Source
that is tailored to expose a SparkBeamMetric
, wrapping an
underlying MetricResults
instance.A Spark
Source
that is tailored to expose a SparkBeamMetric
, wrapping an
underlying MetricResults
instance.A
CombineFnBase.GlobalCombineFn
with a CombineWithContext.Context
for the SparkRunner.Accumulator of WindowedValues holding values for different windows.
Type of the accumulator.
Spark runner
PipelineOptions
handles Spark execution-related configurations, such as the
master address, and other user-related knobs.Returns Spark's default storage level for the Dataset or RDD API based on the respective
runner.
Returns the default checkpoint directory of /tmp/${job.name}.
A custom
PipelineOptions
to work with properties related to JavaSparkContext
.Returns an empty list, to avoid handling null.
Singleton class that contains one
ExecutableStageContext.Factory
per job.An implementation of
GroupByKeyViaGroupByKeyOnly.GroupAlsoByWindow
logic for grouping by windows and controlling
trigger firings and pane accumulation.Processes Spark's input data iterators using Beam's
DoFnRunner
.Creates a job invocation to manage the Spark runner's execution of a portable pipeline.
Driver program that starts a job server for the Spark runner.
Spark runner-specific Configuration for the jobServer.
Pipeline visitor for translating a Beam pipeline into equivalent Spark operations.
SparkPCollectionView is used to pass serialized views to lambdas.
Type of side input.
Spark runner
PipelineOptions
handles Spark execution-related configurations, such as the
master address, batch-interval, and other user-related knobs.Represents a Spark pipeline execution result.
Runs a portable pipeline on Apache Spark.
Translator to support translation between Beam transformations and Spark transformations.
Interface for portable Spark translators.
Pipeline options specific to the Spark portable runner running a streaming job.
Holds current processing context for
SparkInputDataProcessor
.Streaming sources for Spark
Receiver
.A
PTransform
to read from Spark Receiver
The SparkRunner translates operations defined on a pipeline to a representation executable by Spark, and then submits the job to Spark to be executed.
Evaluator on the pipeline.
Pipeline runner which translates a Beam pipeline into equivalent Spark operations, without
running them.
PipelineResult of running a
Pipeline
using SparkRunnerDebugger.
Use SparkRunnerDebugger.DebugSparkPipelineResult.getDebugString()
to get a String
representation of the Pipeline
translated into
Spark native operations.Custom
KryoRegistrator
s for Beam's Spark runner, registering classes used in Spark translation for better serialization performance.Registers the
SparkPipelineOptions
.Registers the
SparkRunner
.A
JavaStreamingContext
factory for resilience.KryoRegistrator
for Spark to serialize broadcast variables used for side-inputs.SideInputReader using broadcasted
SideInputValues
.A
SideInputReader
for the SparkRunner.An implementation of
StateInternals
for the SparkRunner.Translates an unbounded portable pipeline into a Spark job.
Translation context used to lazily store Spark datasets during streaming portable pipeline
translation and compute them after translation.
Spark runner
PipelineOptions
handles Spark execution-related configurations, such as the
master address, and other user-related knobs.A Spark runner build on top of Spark's SQL Engine (Structured
Streaming framework).
Contains the
PipelineRunnerRegistrar
and PipelineOptionsRegistrar
for the SparkStructuredStreamingRunner
.Registers the
SparkStructuredStreamingPipelineOptions
.Registers the
SparkStructuredStreamingRunner
.An implementation of
TimerInternals
for the SparkRunner.PTransform
overrides for Spark runner.Translation context used to lazily store Spark data sets during portable pipeline translation and
compute them after translation.
A "composite" InputDStream implementation for
UnboundedSource
s.A metadata holder for an input stream partition.
A representation of a split result.
Flink operator for executing splittable
DoFns
.A
SplunkEvent
describes a single payload sent to Splunk's Http Event Collector (HEC)
endpoint.A builder class for creating a
SplunkEvent
.A
Coder
for SplunkEvent
objects.An unbounded sink for Splunk's Http Event Collector (HEC).
Class
SplunkIO.Write
provides a PTransform
that allows writing SplunkEvent
records into a Splunk HTTP Event Collector end-point using HTTP POST requests.A class for capturing errors that occur while writing
SplunkEvent
to Splunk's Http Event
Collector (HEC) end point.A builder class for creating a
SplunkWriteError
.Adapter for
Analyzer
to simplify the API for parsing the query and resolving the AST.Parse tree for
UNIQUE
, PRIMARY KEY
constraints.Parse tree for column.
Exception thrown when BeamSQL cannot convert sql to BeamRelNode.
Parse tree for
CREATE EXTERNAL TABLE
statement.Parse tree for
CREATE FUNCTION
statement.Utilities concerning
SqlNode
for DDL.Parse tree for
DROP TABLE
statement.A separate SqlOperators table for those functions that do not exist or not compatible with
Calcite.
SQL parse tree node to represent
SET
and RESET
statements.SqlTransform
is the DSL interface of Beam SQL.Beam
Schema.LogicalType
s corresponding to SQL data types.IO to read (unbounded) from and write to SQS queues.
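A hedged sketch of SqlTransform, described just above; the query is illustrative and assumes the input rows have a "name" field.

```java
import org.apache.beam.sdk.extensions.sql.SqlTransform;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.Row;

public class SqlSketch {
  // The single input PCollection is visible to the query as the table named PCOLLECTION.
  static PCollection<Row> countByName(PCollection<Row> rows) {
    return rows.apply(
        SqlTransform.query("SELECT name, COUNT(*) AS cnt FROM PCOLLECTION GROUP BY name"));
  }
}
```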
A
PTransform
to read/receive messages from SQS.Deprecated.
superseded by
SqsIO.WriteBatches
A
PTransform
to send messages to SQS.Mapper to create a
SendMessageBatchRequestEntry
from a unique batch entry id and the
input T
.A more convenient
SqsIO.WriteBatches.EntryMapperFn
variant that already sets the entry id.Result of
SqsIO.writeBatches()
.Configuration class for reading data from an AWS SQS queue.
An implementation of
TypedSchemaTransformProvider
for jobs reading data from AWS SQS
queues and configured via SqsReadConfiguration
.Customer provided key for use with Amazon S3 server-side encryption.
A bundle factory scoped to a particular
ExecutableStage
, which has all of the resources it
needs to provide new RemoteBundles
.Interface for staging files needed for running a Dataflow pipeline.
A state cell, supporting a
State.clear()
operation.State and Timers wrapper.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
The
StateDelegator
is able to delegate BeamFnApi.StateRequest
s to a set of registered
handlers.Allows callers to deregister from receiving further state requests.
Jet
Processor
implementation for Beam's stateful ParDo primitive.Jet
Processor
supplier that will provide instances of StatefulParDoP
.A specialized evaluator for ParDo operations in Spark Streaming context that is invoked when
stateful streaming is detected in the DoFn.
Handler for
StateRequests
.A set of utility methods which construct
StateRequestHandler
s.A handler for bag user state.
A factory which constructs
StateRequestHandlers.BagUserStateHandler
s.A handler for iterable side inputs.
A handler for multimap side inputs.
Marker interface that denotes some type of side input handler.
A factory which constructs
StateRequestHandlers.MultimapSideInputHandler
s.A specification of a persistent state cell.
Cases for doing a "switch" on the type of
StateSpec
.A base class for a visitor with a default method for cases it is not interested in.
A class containing
StateSpec
mappingFunctions.Static methods for working with
StateSpecs
.A
provision service
that returns a static response to all calls.A
RemoteEnvironment
that connects to Dataflow runner harness.An
EnvironmentFactory
that creates StaticRemoteEnvironment used by a runner harness that
would like to use an existing InstructionRequestHandler.Provider for StaticRemoteEnvironmentFactory.
A set of utilities for inferring a Beam
Schema
from static Java types.Constants and variables for CDC support.
A transform that converts messages to protocol buffers in preparation for writing to BigQuery.
This DoFn flushes and optionally (if requested) finalizes Storage API streams.
This
PTransform
manages loads into BigQuery using the Storage API.Class used to wrap elements being sent to the Storage API sinks.
A transform to write sharded records to BigQuery using the Storage API.
A transform to write sharded records to BigQuery using the Storage API (Streaming).
Write records to the Storage API using a standard batch approach.
Deprecated.
Legacy non-portable source which can be replaced by a DoFn with timers.
PTransform that performs streaming BigQuery write.
Stores and exports metrics for a batch of Streaming Inserts RPCs.
No-op implementation of
StreamingInsertsResults
.Metrics of a batch of InsertAll RPCs.
Deprecated.
tests which use unbounded PCollections should be in the category
UsesUnboundedPCollections
.Options used to configure streaming.
StateRequestHandler
that uses SideInputHandler
to
access the broadcast state that represents side inputs.Class for creating context object of different CDAP classes with stream source type.
Supports translation between a Beam transform, and Spark's operations on DStreams.
Registers classes specialized by the Spark runner.
Translator matches Beam transformation with the appropriate evaluator.
This transform takes in key-value pairs of
TableRow
entries and the TableDestination
it should be written to.Position for
ReadChangeStreamPartitionProgressTracker
.Stream TransformTranslator interface.
Combine.CombineFn
s for aggregating strings or bytes with an optional delimiter (default comma).A
Combine.CombineFn
that aggregates bytes with a byte array as delimiter.A
Combine.CombineFn
that aggregates strings with a string as delimiter.A
Coder
that wraps a Coder<String>
and encodes/decodes values via string
representations.StringFunctions.
A metric that reports set of unique string values.
Implementation of
StringSet
.The result of a
StringSet
metric.Empty
StringSetResult
, representing no values reported and is immutable.A collection of static methods for manipulating datastructure representations transferred via the
Dataflow API.
A wrapper around a byte[] that uses structural, value-based equality rather than byte[]'s normal
object identity.
A (Key, Coder) pair that uses the structural value of the key (as provided by
Coder.structuralValue(Object)
) to perform equality and hashing.An abstract base class to implement a
Coder
that defines equality, hashing, and printing
via the class name and recursively using StructuredCoder.getComponents()
.An implementation of AwsCredentialsProvider that periodically sends an
AssumeRoleWithWebIdentityRequest
to the AWS Security Token Service to maintain short-lived
sessions to use for authentication.Builder class for
StsAssumeRoleForFederatedCredentialsProvider
.Output of
PAssert
.PTransform
s for computing the sum of the elements in a PCollection
, or the sum of
the values associated with each key in a PCollection
of KV
s.A
StreamObserver
which provides synchronous access to an underlying StreamObserver
.Represents the metadata of a
BeamSqlTable
.Builder class for
Table
.A wrapper for a
KuduTable
and the TableAndRecord
representing a typed record.Encapsulates a BigQuery table destination.
A coder for
TableDestination
objects.A
Coder
for TableDestination
that includes time partitioning information.A
Coder
for TableDestination
that includes time partitioning and clustering
information.Represents a parsed table name that is specified in a FROM clause (and other places).
Helper class to extract table identifiers from the query.
A
TableProvider
handles the metadata CRUD of a specified kind of tables.Utility methods to resolve a table, given a top-level Calcite schema and a table path.
Utility methods for converting JSON
TableRow
objects to dynamic protocol message, for use
with the Storage write API.A descriptor for ClickHouse table schema.
A column in ClickHouse table.
A descriptor for a column type.
An enumeration of possible kinds of default values in ClickHouse.
An enumeration of possible types in ClickHouse.
An updatable cache for table schemas.
Helper utilities for handling schema-update responses.
For internal use only; no backwards-compatibility guarantees.
PTransform
s for getting information about quantiles in a stream.Implementation of
TDigestQuantiles.globally()
.Implementation of
TDigestQuantiles.perKey()
.Implements the
Combine.CombineFn
of TDigestQuantiles
transforms.A PTransform that returns its input, but also applies its input to an auxiliary PTransform, akin
to the shell
tee
command, which is named after the T-splitter used in plumbing.Test rule which creates a new table with specified schema, with randomized name and exposes few
APIs to work with it.
Interface to implement a polling assertion.
Mocked table for bounded data sources.
A set of options used to configure the
TestPipeline
.TestDataflowRunner
is a pipeline runner that wraps a DataflowRunner
when running
tests against the TestPipeline
.A
TestRule
that validates that all submitted tasks finished and were completed.A union of the
ExecutorService
and TestRule
interfaces.Test Flink runner.
A JobService for tests.
A creator of test pipelines that can be used inside of tests that can be configured to run
locally or against a remote pipeline runner.
An exception thrown in case an abandoned
PTransform
is
detected, that is, a PTransform
that has not been run.An exception thrown in case a test finishes without invoking
Pipeline.run()
.Implementation detail of
TestPipeline.newProvider(T)
, do not use.TestPipelineOptions
is a set of options for test pipelines.Matcher which will always pass.
Factory for
PipelineResult
matchers which always pass.Options for
TestPortableRunner
.Factory for default config.
Register
TestPortablePipelineOptions
.TestPortableRunner
is a pipeline runner that wraps a PortableRunner
when running
tests against the TestPipeline
.PipelineOptions
for use with the TestPrismRunner
.Test rule which creates a new topic and subscription with randomized names and exposes the APIs
to work with them.
Test rule which observes elements of the
PCollection
and checks whether they match the
success criteria.A
SparkPipelineOptions
for tests.A factory to provide the default watermark to stop a pipeline that reads from an unbounded
source.
The SparkRunner translates operations defined on a pipeline to a representation executable by Spark, and then submits the job to Spark to be executed.
A testing input that generates an unbounded
PCollection
of elements, advancing the
watermark and processing time as elements are emitted.An incomplete
TestStream
.A
TestStream.Event
that produces elements.An event in a
TestStream
.The types of
TestStream.Event
that are supported by TestStream
.A
TestStream.Event
that advances the processing time clock.Coder for
TestStream
.A
TestStream.Event
that advances the watermark.Utility methods which enable testing of
StreamObserver
s.A builder for a test
CallStreamObserver
that performs various callbacks.Flink source for executing
TestStream
.Base class for mocked table.
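A hedged sketch of building the TestStream testing input described above; the element values and times are arbitrary.

```java
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.testing.TestStream;
import org.joda.time.Duration;
import org.joda.time.Instant;

public class TestStreamSketch {
  // Emits two elements, advances processing time and the watermark, then closes the stream.
  static final TestStream<String> EVENTS =
      TestStream.create(StringUtf8Coder.of())
          .addElements("open", "click")
          .advanceProcessingTime(Duration.standardMinutes(1))
          .advanceWatermarkTo(new Instant(0).plus(Duration.standardMinutes(5)))
          .advanceWatermarkToInfinity();
}
```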
Test in-memory table provider for use in tests.
TableWithRows.
Utility functions for mock classes.
A mocked unbounded table.
Register
TestUniversalRunner.Options
.Registrar for the portable runner.
PTransform
s for reading and writing text files.Deprecated.
Use
Compression
.Implementation of
TextIO.read()
.Deprecated.
See
TextIO.readAll()
for details.Implementation of
TextIO.readFiles()
.Implementation of
TextIO.sink()
.Implementation of
TextIO.write()
.This class is used as the default return value of
TextIO.write()
.TextJsonTable
is a BeamSqlTable
that reads text files and converts them according
to the JSON format.This returns a row count estimation for files associated with a file pattern.
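A hedged sketch of TextIO.read() and TextIO.write(), described above; the file patterns are placeholders.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.values.PCollection;

public class TextIOSketch {
  static void copyLines(Pipeline p) {
    // Placeholder paths; any FileSystems-supported scheme (local, gs://, ...) works.
    PCollection<String> lines = p.apply(TextIO.read().from("/tmp/input/*.txt"));
    lines.apply(TextIO.write().to("/tmp/output/part").withSuffix(".txt"));
  }
}
```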
Builder for
TextRowCountEstimator
.This strategy stops sampling if we sample enough number of bytes.
This strategy stops sampling when total number of sampled bytes are more than some threshold.
An exception that will be thrown if the estimator cannot get an estimation of the number of
lines.
This strategy samples all the files.
Sampling Strategy shows us when should we stop reading further files.
Implementation detail of
TextIO.Read
.TextTable
is a BeamSqlTable
that reads text files and converts them according to
the specified format.Text table provider.
Read-side converter for
TextTable
with format 'csv'
.Read-side converter for
TextTable
with format 'lines'
.Write-side converter for
TextTable
with format 'lines'
.A
Coder
that encodes Integers
as the ASCII bytes of their textual,
decimal, representation.PTransform
s for reading and writing TensorFlow TFRecord files.Deprecated.
Use
Compression
.Implementation of
TFRecordIO.read()
.Implementation of
TFRecordIO.readFiles()
.Implementation of
TFRecordIO.write()
.Configuration for reading from TFRecord.
Builder for
TFRecordReadSchemaTransformConfiguration
.Configuration for reading from TFRecord.
A
Coder
using a Thrift TProtocol
to
serialize/deserialize elements.PTransform
s for reading and writing files containing Thrift encoded data.Implementation of
ThriftIO.readFiles(java.lang.Class<T>)
.Implementation of
ThriftIO.sink(org.apache.thrift.protocol.TProtocolFactory)
.Writer to write Thrift object to
OutputStream
.Schema provider for generated thrift types.
An estimator to calculate the throughput of the outputted elements from a DoFn.
A
BiConsumer
which can throw Exception
s.A
BiFunction
which can throw Exception
s.Transforms for parsing arbitrary files using Apache Tika.
Implementation of
TikaIO.parse()
.Implementation of
TikaIO.parseFiles()
.A time without a time-zone.
TimeDomain
specifies whether an operation is based on timestamps of elements or current
"real-world" time as reported while processing.A timer for a specified time domain that can be set to register the desire for further processing
at particular time in its specified time domain.
A factory that passes timers to
TimerReceiverFactory.timerDataConsumer
.Interface for interacting with time.
A specification for a
Timer
.Static methods for working with
TimerSpecs
.Utility class for handling timers in the Spark runner.
A marker class used to identify timer keys and values in Spark transformations.
Policies for combining timestamps that occur within a window.
Convert between different Timestamp and Instant classes.
An immutable pair of a value and a timestamp.
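For the value/timestamp pair described above, a trivial sketch:

  import org.apache.beam.sdk.values.TimestampedValue;
  import org.joda.time.Instant;

  TimestampedValue<String> tv = TimestampedValue.of("click", Instant.now());
  String value = tv.getValue();          // "click"
  Instant timestamp = tv.getTimestamp(); // the Instant it was paired with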
A
Coder
for TimestampedValue
This encoder/decoder writes a com.google.cloud.Timestamp object as a pair of long and int to Avro
and reads a Timestamp object from the same pair.
TimestampFunctions.
A
WatermarkEstimator
that observes the timestamps of all records output from a DoFn
.A timestamp policy to assign event time for messages in a Kafka partition and watermark for it.
The context contains state maintained in the reader for the partition.
An extendable factory to create a
TimestampPolicy
for each partition at runtime by
KafkaIO reader.Assigns Kafka's log append time (server side ingestion time) to each record.
A simple policy that uses current time for event time and watermark.
Internal policy to support deprecated withTimestampFn API.
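A hedged sketch of plugging one of the timestamp policies above into a KafkaIO read; the broker, topic, and deserializers are placeholders, and p is an existing Pipeline:

  import org.apache.beam.sdk.io.kafka.KafkaIO;
  import org.apache.beam.sdk.io.kafka.TimestampPolicyFactory;
  import org.apache.kafka.common.serialization.StringDeserializer;

  p.apply(
      KafkaIO.<String, String>read()
          .withBootstrapServers("broker:9092")
          .withTopic("events")
          .withKeyDeserializer(StringDeserializer.class)
          .withValueDeserializer(StringDeserializer.class)
          // Assign Kafka's log-append time (server-side ingestion time) as the event time.
          .withTimestampPolicyFactory(TimestampPolicyFactory.withLogAppendTime()));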
A
TimestampPrefixingWindowCoder
wraps an arbitrary user-defined window coder.A restriction represented by a range of timestamps [from, to).
A
RestrictionTracker
for claiming positions in a TimestampRange
in a
monotonically increasing fashion.For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
Provides methods to convert a timestamp to its nanosecond representation and back.
A helper class for converting between Dataflow API and SDK time representations.
Time conversion utilities.
Creates a
PTransform
that serializes UTF-8 JSON objects from a Schema
-aware
PCollection (i.e.PTransform
s for finding the largest (or smallest) set of elements in a
PCollection
, or the largest (or smallest) set of values associated with each key in a
PCollection
of KV
s.Deprecated.
use
Top.Natural
insteadA
Serializable
Comparator
that uses the compared elements' natural
ordering.Serializable
Comparator
that uses the reverse of the compared elements'
natural ordering.Deprecated.
use
Top.Reversed
insteadCombineFn
for Top
transforms that combines a bunch of T
s into a single
count
-long List<T>
, using compareFn
to choose the largest T
s.The
Coder
for encoding and decoding TopicPartition
in Beam.PTransforms
for converting a PCollection<?>
, PCollection<KV<?,?>>
, or PCollection<Iterable<?>>
to a PCollection<String>
.A transaction object.
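A minimal sketch of the Top and ToString transforms described above, assuming an existing PCollection<Integer> named scores:

  import java.util.List;
  import org.apache.beam.sdk.transforms.ToString;
  import org.apache.beam.sdk.transforms.Top;
  import org.apache.beam.sdk.values.PCollection;

  // The three largest values by natural ordering, emitted as a single List<Integer>.
  PCollection<List<Integer>> top3 = scores.apply(Top.largest(3));
  // Each element rendered via its toString() form.
  PCollection<String> asText = scores.apply(ToString.elements());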
Describe a
PTransform
evaluator.A
Runnable
that will execute a PTransform
on some bundle of input.Provides a mapping of
RunnerApi.FunctionSpec
to a PTransform
, together with
mappings of its inputs and outputs to maps of PCollections.A utility that can be used to manage a Beam Transform Service.
A
TransformTranslator
knows how to translate a particular subclass of PTransform
for the Cloud Dataflow service.A
TransformTranslator
provides the capability to translate a specific primitive or
composite PTransform
into its Spark correspondence.Supports translation between a Beam transform, and Spark's operations on RDDs.
The interface for a
TransformTranslator
to build a Dataflow step.The interface provided to registered callbacks for interacting with the
DataflowRunner
,
including reading and writing the values of PCollection
s and side inputs.Translator matches Beam transformation with the appropriate evaluator.
A set of utilities to help translating Beam transformations into Spark transformations.
doc.
A SparkCombineFn function applied to grouped KVs.
A utility class to filter
TupleTag
s.Helpers for cloud communication.
Triggers control when the elements for a specific key and window are output.
For internal use only; no backwards-compatibility guarantees.
A
TupleTag
is a typed tag to use as the key of a heterogeneously typed tuple, like PCollectionTuple
.A
TupleTagList
is an immutable list of heterogeneously typed TupleTags
.TVFSlidingWindowFn assigns window based on input row's "window_start" and "window_end"
timestamps.
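To illustrate the TupleTag and TupleTagList classes described above, a hedged sketch of a multi-output ParDo (the tag names and the PCollection<String> named words are assumptions):

  import org.apache.beam.sdk.transforms.DoFn;
  import org.apache.beam.sdk.transforms.ParDo;
  import org.apache.beam.sdk.values.PCollection;
  import org.apache.beam.sdk.values.PCollectionTuple;
  import org.apache.beam.sdk.values.TupleTag;
  import org.apache.beam.sdk.values.TupleTagList;

  // Anonymous subclasses so the element type is captured for coder inference.
  final TupleTag<String> shortWords = new TupleTag<String>() {};
  final TupleTag<String> longWords = new TupleTag<String>() {};

  PCollectionTuple results =
      words.apply(
          ParDo.of(
                  new DoFn<String, String>() {
                    @ProcessElement
                    public void process(@Element String word, MultiOutputReceiver out) {
                      if (word.length() <= 4) {
                        out.get(shortWords).output(word);
                      } else {
                        out.get(longWords).output(word);
                      }
                    }
                  })
              .withOutputTags(shortWords, TupleTagList.of(longWords)));

  PCollection<String> shorts = results.get(shortWords);
  PCollection<String> longs = results.get(longWords);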
Provides static constants or utils for TVF streaming.
doc.
Twister pipeline translator for batch pipelines.
Twister2BatchTranslationContext.
Twister2 wrapper for Bounded Source.
Empty Source wrapper.
Twister2PipelineExecutionEnvironment.
Twister2PipelineOptions.
Represents a Twister2 pipeline execution result.
Twister2PipelineTranslator, both batch and streaming translators need to extend from this.
A
PipelineRunner
that executes the operations in the pipeline by first translating them
to a Twister2 Plan and then executing them either locally or on a Twister2 cluster, depending on
the configuration.AutoService registrar - will register Twister2Runner and Twister2Options as possible pipeline
runner services.
Pipeline options registrar.
Pipeline runner registrar.
Sink Function that collects results.
Twister pipeline translator for stream pipelines.
Twister2StreamingTranslationContext.
A
PipelineRunner
that executes the operations in the pipeline by first translating them
to a Twister2 Plan and then executing them either locally or on a Twister2 cluster, depending on
the configuration.Twister2TranslationContext.
Represents a type of a column within Cloud Spanner.
A
Combine.CombineFn
delegating all relevant calls to a given delegate.A description of a Java type, including actual generic parameters where possible.
A utility class for creating
TypeDescriptor
objects for different types, such as Java
primitive types, containers and KVs
of other TypeDescriptor
objects, and
extracting type variables of parameterized types (e.g.A helper interface for use with
TypeDescriptors.extractFromTypeParameters(Object, Class, TypeVariableExtractor)
.Like
SchemaTransformProvider
except uses a configuration object instead of Schema and
Row.Captures a free type variable that can be used in
TypeDescriptor.where(org.apache.beam.sdk.values.TypeParameter<X>, org.apache.beam.sdk.values.TypeDescriptor<X>)
.Implement
AggregateFunction
to take a Combine.CombineFn
as UDAF.Beam-customized version from
ReflectiveFunctionBase
, to address BEAM-5921.Helps build lists of
FunctionParameter
.Provider for user-defined functions written in Java.
Defines Java UDFs for use in tests.
Provider for UDF and UDAF.
This DoFn is responsible for writing to Solace in batch mode (holding up any messages) and
emitting the corresponding output (success or fail; only for persistent messages), so the
SolaceIO.Write connector can be composed with other subsequent transforms in the pipeline.
DStream holder. Can also create a DStream from a supplied queue of values, but mainly for testing.
This DoFn encapsulates common code used both for the
UnboundedBatchedSolaceWriter
and
UnboundedStreamingSolaceWriter
.A
Source
that reads an unbounded amount of input and, because of that, supports some
additional operations such as checkpointing, watermarks, and record ids.A marker representing the progress and state of an
UnboundedSource.UnboundedReader
.A checkpoint mark that does nothing when finalized.
A
Reader
that reads an unbounded amount of input.Jet
Processor
implementation for reading from an unbounded Beam
source.Wrapper for executing
UnboundedSources
as a Flink Source.This DoFn is responsible for writing to Solace in streaming mode (one message at a time, not
holding up any message) and emitting the corresponding output (success or fail; only for persistent
messages), so the SolaceIO.Write connector can be composed with other subsequent transforms in
the pipeline.
A UnionCoder encodes RawUnionValues.
Generate unique IDs that can be used to differentiate different jobs and partitions.
A base class for logical types that are not understood by the Java SDK.
Combines the source event which failed to process with the failure reason.
Options for controlling what to do with unsigned types, specifically whether to use a higher bit
count or, in the case of uint64, a string.
Defines the exact behavior for unsigned types.
Builder for
UnsignedOptions
.A legacy snapshot which does not care about schema compatibility.
Builds a MongoDB UpdateConfiguration object.
Update destination schema based on data that is about to be copied into it.
Implements a response interceptor that logs the upload id if the upload id header exists and it
is the first request (does not have upload_id parameter in the request).
Base
Exception
for signaling errors in user custom code.Extends
UserCodeQuotaException
to allow the user custom code to specifically signal a
Quota or API overuse related error.A
UserCodeExecutionException
that signals an error with a remote system.An extension of
UserCodeQuotaException
to specifically signal a user code timeout.Holds user defined function definitions.
Category tag for validation tests which utilize
Metrics
.Category tag for validation tests which utilize splittable
ParDo
with a DoFn.BoundedPerElement
DoFn
.Category tag for validation tests which utilize
BoundedTrie
.Category tag for validation tests which use
DoFn.BundleFinalizer
.Category tag for validation tests which utilize
Metrics
.Category tag for validation tests which utilize
Counter
.Category tag for validation tests which utilize custom window merging.
Category tag for validation tests which utilize
Distribution
Category tag for tests which rely on a pre-defined port, such as an expansion service or transform
service.
Category tag for tests which validate that the correct failure message is provided by a failed
pipeline.
Category tag for validation tests which utilize
Gauge
.Category for tests that use
Impulse
transformations.Category tag for tests which use the expansion service in Java.
Category tag for validation tests which use key.
Category tag for validation tests which utilize --tempRoot from
TestPipelineOptions
and
expect a default KMS key enabled for the bucket specified.Category tag for validation tests which utilize looping timers in
ParDo
.Category tag for validation tests which utilize
MapState
.Category tag for validation tests which utilize the metrics pusher feature.
Category tag for validation tests which utilize
MultimapState
.Category tag for validation tests which utilize
DoFn.OnWindowExpiration
.Category tag for validation tests which utilize
OrderedListState
.Category tag for the ParDoLifecycleTest for exclusion (BEAM-3241).
Category tag for validation tests which rely on a runner providing per-key ordering.
Category tag for validation tests which rely on a runner providing per-key ordering in between
transforms in the same ProcessBundleRequest.
Category tag for validation tests which utilize timers in
ParDo
.Category tag for tests which use the expansion service in Python.
Category tag for validation tests which utilize
DoFn.RequiresTimeSortedInput
in stateful
ParDo
.Category tag for validation tests which utilize schemas.
Category tag for tests which validate that the SDK harness executes in a well formed environment.
Category tag for validation tests which utilize
SetState
Category tag for validation tests which use side inputs.
Category tag for validation tests which use multiple side inputs with different coders.
Category tag for validation tests which utilize stateful
ParDo
.Category for tests that enforce strict event-time ordering of fired timers, even in situations
where multiple timers mutually set one another and the watermark hops arbitrarily far into the future.
Category tag for validation tests which utilize
StringSet
.Category tag for tests that use System metrics.
Category tag for tests that use
TestStream
, which is not a part of the Beam model but a
special feature currently only implemented by the direct runner and the Flink Runner (streaming).Subcategory for
UsesTestStream
tests which use TestStream
across multiple
stages.Category tag for validation tests which use outputTimestamp.
Subcategory for
UsesTestStream
tests which use the processing time feature of TestStream
.Category tag for validation tests which use timerMap.
Category tag for validation tests which utilize timers in
ParDo
Category tag for validation tests which use triggered side inputs.
Category tag for validation tests which utilize at least one unbounded
PCollection
.Category tag for validation tests which utilize splittable
ParDo
with a DoFn.UnboundedPerElement
DoFn
.Various common methods used by the Jet based runner.
A wrapper of
byte[]
that can be used as a hash-map key.A Uuid storable in a Pub/Sub Lite attribute.
A coder for a Uuid.
Options for deduplicating Pub/Sub Lite messages based on the UUID they were published with.
A transform for deduplicating Pub/Sub Lite messages based on the UUID they were published with.
Base class for types representing UUID as two long values.
Category tag for tests which validate that a Beam runner is correctly implemented.
Validation
represents a set of annotations that can be used to annotate getter properties
on PipelineOptions
with information representing the validation criteria to be used when
validating with the PipelineOptionsValidator
.This criterion specifies that the value must not be null.
Kryo serializer for
ValueAndCoderLazySerializable
.A holder object that lets you serialize an element with a Coder with minimal wasted space.
Represents the capture type of a change stream.
An immutable tuple of value, timestamp, window, and pane.
A coder for
ValueInSingleWindow
.A
ValueProvider
abstracts the notion of fetching a value that may or may not be currently
available.For internal use only; no backwards compatibility guarantees.
ValueProvider.NestedValueProvider
is an implementation of ValueProvider
that allows for
wrapping another ValueProvider
object.ValueProvider.RuntimeValueProvider
is an implementation of ValueProvider
that allows for a
value to be provided at execution time rather than at graph construction time.For internal use only; no backwards compatibility guarantees.
ValueProvider.StaticValueProvider
is an implementation of ValueProvider
that allows for a
static value to be provided.Utilities for working with the
ValueProvider
interface.Values<V>
takes a PCollection
of KV<K, V>
s and returns a
PCollection<V>
of the values.A
ReadableState
cell containing a single value.For internal use only; no backwards compatibility guarantees.
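A hedged sketch of a stateful DoFn using the ValueState cell described above to keep a per-key running count (the class and state names are illustrative):

  import org.apache.beam.sdk.coders.VarIntCoder;
  import org.apache.beam.sdk.state.StateSpec;
  import org.apache.beam.sdk.state.StateSpecs;
  import org.apache.beam.sdk.state.ValueState;
  import org.apache.beam.sdk.transforms.DoFn;
  import org.apache.beam.sdk.values.KV;

  class CountPerKeyFn extends DoFn<KV<String, String>, KV<String, Integer>> {
    // One Integer cell per key and window, holding the running count.
    @StateId("count")
    private final StateSpec<ValueState<Integer>> countSpec = StateSpecs.value(VarIntCoder.of());

    @ProcessElement
    public void process(
        @Element KV<String, String> element,
        @StateId("count") ValueState<Integer> count,
        OutputReceiver<KV<String, Integer>> out) {
      Integer stored = count.read();
      int updated = (stored == null ? 0 : stored) + 1;
      count.write(updated);
      out.output(KV.of(element.getKey(), updated));
    }
  }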
A LogicalType representing a variable-length byte array with specified maximum length.
A LogicalType representing a variable-length string with specified maximum length.
Combine.CombineFn
for Variance on Number
types.Benchmarks for
VarInt
and variants.Output to
Blackhole
.Input from randomly generated bytes.
Output to
ByteStringOutputStream
.Input from randomly generated longs.
Factory class for PTransforms integrating with Google Cloud AI - VideoIntelligence service.
A PTransform taking a PCollection of
ByteString
and an optional side input with a
context map and emitting lists of VideoAnnotationResults
for each element.A PTransform taking a PCollection of
KV
of ByteString
and VideoContext
and emitting lists of VideoAnnotationResults
for each element.A PTransform taking a PCollection of
String
and an optional side input with a context
map and emitting lists of VideoAnnotationResults
for each element.Transforms for creating
PCollectionViews
from PCollections
(to read them as side inputs).For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
For internal use only; no backwards-compatibility guarantees.
Provides an index to value mapping using a random starting index and also provides an offset
range for each window seen.
For internal use only; no backwards-compatibility guarantees.
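A hedged sketch of the View transforms described above used to feed a side input (words and maxLength are assumed PCollections of String and Integer, respectively):

  import org.apache.beam.sdk.transforms.DoFn;
  import org.apache.beam.sdk.transforms.ParDo;
  import org.apache.beam.sdk.transforms.View;
  import org.apache.beam.sdk.values.PCollection;
  import org.apache.beam.sdk.values.PCollectionView;

  // Materialize the single-element PCollection<Integer> as a side input view.
  final PCollectionView<Integer> maxLengthView = maxLength.apply(View.asSingleton());

  PCollection<String> trimmed =
      words.apply(
          ParDo.of(
                  new DoFn<String, String>() {
                    @ProcessElement
                    public void process(ProcessContext c) {
                      int max = c.sideInput(maxLengthView);
                      String w = c.element();
                      c.output(w.length() > max ? w.substring(0, max) : w);
                    }
                  })
              .withSideInputs(maxLengthView));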
Jet
Processor
implementation for Beam's side input producing
primitives.Delays processing of each window in a
PCollection
until signaled.Implementation of
Wait.on(org.apache.beam.sdk.values.PCollection<?>...)
.Given a "poll function" that produces a potentially growing set of outputs for an input, this
transform simultaneously continuously watches the growth of output sets of all inputs, until a
per-input termination condition is reached.
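A hedged sketch of the Wait transform described above: elements of main are held back, window by window, until the corresponding window of signal is complete (both PCollections are assumptions):

  import org.apache.beam.sdk.transforms.Wait;
  import org.apache.beam.sdk.values.PCollection;

  // gated contains the same elements as main, but a window is only emitted once
  // the matching window of signal has finished (e.g. a prior write there is done).
  PCollection<String> gated = main.apply(Wait.on(signal));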
A function that computes the current set of outputs for the given input, in the form of a
Watch.Growth.PollResult
.The result of a single invocation of a
Watch.Growth.PollFn
.A strategy for determining whether it is time to stop polling the current input regardless of
whether its output is complete or not.
A
WatermarkEstimator
which is used for estimating output watermarks of a splittable
DoFn
Support utilities for interacting with
WatermarkEstimator
s.A set of
WatermarkEstimator
s that users can use to advance the output watermark for their
associated splittable DoFn
s.Concrete implementation of a
ManualWatermarkEstimator
.A watermark estimator that observes timestamps of records output from a DoFn reporting the
timestamp of the last element seen as the current watermark.
A watermark estimator that tracks wall time.
Interface which allows for accessing the current watermark and watermark estimator state.
For internal use only; no backwards-compatibility guarantees.
WatermarkParameters
contains the parameters used for watermark computation.Implement this interface to define a custom watermark calculation heuristic.
Implement this interface to create a
WatermarkPolicy
.ArrivalTimeWatermarkPolicy uses
WatermarkPolicyFactory.CustomWatermarkPolicy
for watermark computation.CustomWatermarkPolicy uses parameters defined in
WatermarkParameters
to compute
watermarks.Watermark policy where the processing time is used as the event time.
Defines the behavior for a OIDC web identity token provider.
Window
logically divides up or groups the elements of a PCollection
into finite
windows according to a WindowFn
.A Primitive
PTransform
that assigns windows to elements based on a WindowFn
.Specifies the conditions under which a final pane will be created when a window is permanently
closed.
Specifies the conditions under which an on-time pane will be created when a window is closed.
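A hedged sketch combining the Window transform described above with a WindowFn, trigger, allowed lateness, and accumulation mode (the one-minute fixed windows and delays are illustrative; events is an assumed PCollection<String>):

  import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
  import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
  import org.apache.beam.sdk.transforms.windowing.FixedWindows;
  import org.apache.beam.sdk.transforms.windowing.Window;
  import org.apache.beam.sdk.values.PCollection;
  import org.joda.time.Duration;

  PCollection<String> windowed =
      events.apply(
          Window.<String>into(FixedWindows.of(Duration.standardMinutes(1)))
              .triggering(
                  AfterWatermark.pastEndOfWindow()
                      .withEarlyFirings(
                          AfterProcessingTime.pastFirstElementInPane()
                              .plusDelayOf(Duration.standardSeconds(30))))
              .withAllowedLateness(Duration.standardMinutes(5))
              .accumulatingFiredPanes());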
Flink operator for executing window
DoFns
.A value along with Beam's windowing information and all other metadata.
Implementations of
WindowedValue
and static utility methods.Coder for
WindowedValue
.A parameterized coder for
WindowedValue
.A
WindowedValues
which holds exactly single window per value.Deprecated.
Use ParamWindowedValueCoder instead; it is a general-purpose implementation of the
same concept but makes the timestamp, windows, and pane info configurable.
Abstract class for
WindowedValue
coder.The argument to the
Window
transform used to assign elements into windows and to
determine how windows are merged.A utility class for testing
WindowFn
s.Jet
Processor
implementation for Beam's GroupByKeyOnly +
GroupAlsoByWindow primitives.A
WindowingStrategy
describes the windowing behavior for a specific collection of values.The accumulation modes that can be used with windowing.
An implementation of
TypedSchemaTransformProvider
for WindowInto.A function that takes the windows of elements in a main input and maps them to the appropriate
window in a
PCollectionView
consumed as a side input
.Helpers to construct coders for gRPC port reads and writes.
A collection of utilities for writing transforms that can handle exceptions raised during
processing of elements.
A simple handler that extracts information from an exception to a
Map<String, String>
and returns a KV
where the key is the input element that failed processing, and the
value is the map of exception attributes.The value type passed as input to exception handlers.
An intermediate output type for PTransforms that allows an output collection to live alongside
a collection of elements that failed the transform.
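A hedged sketch of the exception-handling helpers described above, following the MapElements-with-failures pattern (the parsing logic and the PCollection<String> named strings are illustrative):

  import org.apache.beam.sdk.transforms.MapElements;
  import org.apache.beam.sdk.transforms.WithFailures;
  import org.apache.beam.sdk.values.PCollection;
  import org.apache.beam.sdk.values.TypeDescriptors;

  WithFailures.Result<PCollection<Integer>, String> result =
      strings.apply(
          MapElements.into(TypeDescriptors.integers())
              .via((String s) -> Integer.parseInt(s)) // may throw NumberFormatException
              // Route failing elements into a failure collection, keeping only the message.
              .exceptionsInto(TypeDescriptors.strings())
              .exceptionsVia(
                  (WithFailures.ExceptionElement<String> ee) -> ee.exception().getMessage()));

  PCollection<Integer> parsed = result.output();
  PCollection<String> failures = result.failures();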
WithKeys<K, V>
takes a PCollection<V>
, and either a constant key of type
K
or a function from V
to K
, and returns a PCollection<KV<K, V>>
, where
each of the values in the input PCollection
has been paired with either the constant key
or a key computed from the value.A
MetricRegistry decorator-like that supports AggregatorMetric and SparkBeamMetric as Gauges
.A
MetricRegistry
decorator-like that supports BeamMetricSet
s as Gauges
.A
PTransform
for assigning timestamps to all the elements of a PCollection
.Duplicated from beam-examples-java to avoid dependency.
A PTransform that converts a PCollection containing lines of text into a PCollection of
formatted word counts.
A SimpleFunction that converts a Word and Count into a printable string.
Options supported by
WordCount
.Workarounds for dealing with limitations of Flink or its libraries.
KeySelector
that retrieves a key from a KeyedWorkItem
.Wrapper class for
ReceiverSupervisor
that doesn't use Spark Environment.Parameters class to expose the transform to an external SDK.
Enum containing all supported dispositions during writing to table phase.
A
PTransform
that writes to a FileBasedSink
.The result of a
WriteFiles
transform.Return type of
JmsIO.Write
transform.The result of a
BigQueryIO.Write
transform.Transform for writing to Apache Pulsar.
Transforms for reading and writing XML files using JAXB mappers.
Implementation of
XmlIO.read()
.Deprecated.
Use
Compression
instead.Implementation of
XmlIO.readFiles()
.Implementation of
XmlIO.sink(java.lang.Class<T>)
.Implementation of
XmlIO.write()
.Implementation of
XmlIO.read()
.A
FileWriteSchemaTransformFormatProvider
for XML format.Allows one to invoke Beam YAML
transforms from Java.
Utility methods for ZetaSQL <=> Beam translation.
Utility methods for ZetaSQL <=> Calcite translation.
Exception to be thrown by the Beam ZetaSQL planner.
ZetaSQLQueryPlanner.
ZetaSQL-specific extension to
ScalarFunctionImpl
.This class is a copy of Uncollect.java in Calcite:
https://github.com/apache/calcite/blob/calcite-1.20.0/core/src/main/java/org/apache/calcite/rel/core/Uncollect.java
except that in deriveUncollectRowType() it does not unwrap array elements of struct type.
This is a class to indicate that a TVF is a ZetaSQL SQL native UDTVF.
Wraps an existing coder with Zstandard compression.
ApproximateCountDistinct
in the zetasketch
extension module, which makes use of the HllCount
implementation.
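Finally, a hedged sketch of the zetasketch-based approximate count distinct referenced in the last entry, assuming the beam-sdks-java-extensions-zetasketch module and a PCollection<Integer> named ids:

  import org.apache.beam.sdk.extensions.zetasketch.HllCount;
  import org.apache.beam.sdk.values.PCollection;

  // Build an HLL++ sketch over the whole collection, then extract the estimate.
  PCollection<byte[]> sketch = ids.apply(HllCount.Init.forIntegers().globally());
  PCollection<Long> approxDistinct = sketch.apply(HllCount.Extract.globally());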