See: Description
Package | Description |
---|---|
org.apache.beam.runners.apex |
Implementation of the Beam runner for Apache Apex.
|
org.apache.beam.runners.dataflow |
Provides a Beam runner that executes pipelines on the Google Cloud Dataflow service.
|
org.apache.beam.runners.dataflow.options |
Provides
PipelineOptions specific to Google Cloud Dataflow. |
org.apache.beam.runners.dataflow.util |
Provides miscellaneous internal utilities used by the Google Cloud Dataflow runner.
|
org.apache.beam.runners.direct |
Defines the
PipelineOptions.DirectRunner which executes both
Bounded and Unbounded Pipelines on the local machine. |
org.apache.beam.runners.direct.portable |
Defines the
PipelineOptions.DirectRunner which executes both
Bounded and Unbounded Pipelines on the local machine. |
org.apache.beam.runners.direct.portable.artifact |
Provides local implementations of the Artifact API services.
|
org.apache.beam.runners.direct.portable.job |
An execution engine for Beam Pipelines that uses the Java Runner harness and the Fn API to
execute.
|
org.apache.beam.runners.flink |
Internal implementation of the Beam runner for Apache Flink.
|
org.apache.beam.runners.flink.metrics |
Internal metrics implementation of the Beam runner for Apache Flink.
|
org.apache.beam.runners.fnexecution |
Utilities used by runners to interact with the fn execution components of the Beam Portability
Framework.
|
org.apache.beam.runners.fnexecution.artifact |
Pipeline execution-time artifact-management services, including abstract implementations of the
Artifact Retrieval Service.
|
org.apache.beam.runners.fnexecution.control |
Utilities for a Beam runner to interact with the Fn API
Control Service via java abstractions. |
org.apache.beam.runners.fnexecution.data |
Utilities for a Beam runner to interact with the Fn API
Data Service via java abstractions. |
org.apache.beam.runners.fnexecution.environment |
Classes used to instantiate and manage SDK harness environments.
|
org.apache.beam.runners.fnexecution.environment.testing |
Test utilities for the environment management package.
|
org.apache.beam.runners.fnexecution.jobsubmission |
Job management services for use in beam runners.
|
org.apache.beam.runners.fnexecution.logging |
Classes used to log informational messages over the
Beam Fn Logging Service . |
org.apache.beam.runners.fnexecution.provisioning |
Provision api services.
|
org.apache.beam.runners.fnexecution.splittabledofn |
Utilities for a Beam runner to interact with a remotely running splittable DoFn.
|
org.apache.beam.runners.fnexecution.state |
State API services.
|
org.apache.beam.runners.fnexecution.wire |
Wire coders for communications between runner and SDK harness.
|
org.apache.beam.runners.gearpump |
Internal implementation of the Beam runner for Apache Gearpump.
|
org.apache.beam.runners.gearpump.translators |
Gearpump specific translators.
|
org.apache.beam.runners.gearpump.translators.functions |
Gearpump specific wrappers for Beam DoFn.
|
org.apache.beam.runners.gearpump.translators.io |
Gearpump specific wrappers for Beam I/O.
|
org.apache.beam.runners.gearpump.translators.utils |
Utilities for translators.
|
org.apache.beam.runners.local |
Utilities useful when executing a pipeline on a single machine.
|
org.apache.beam.runners.reference |
Support for executing a pipeline locally over the Beam fn API.
|
org.apache.beam.runners.reference.testing |
Testing utilities for the reference runner.
|
org.apache.beam.runners.spark |
Internal implementation of the Beam runner for Apache Spark.
|
org.apache.beam.runners.spark.aggregators |
Provides internal utilities for implementing Beam aggregators using Spark accumulators.
|
org.apache.beam.runners.spark.coders |
Beam coders and coder-related utilities for running on Apache Spark.
|
org.apache.beam.runners.spark.io |
Spark-specific transforms for I/O.
|
org.apache.beam.runners.spark.metrics |
Provides internal utilities for implementing Beam metrics using Spark accumulators.
|
org.apache.beam.runners.spark.metrics.sink |
Spark sinks that supports beam metrics and aggregators.
|
org.apache.beam.runners.spark.stateful |
Spark-specific stateful operators.
|
org.apache.beam.runners.spark.util |
Internal utilities to translate Beam pipelines to Spark.
|
org.apache.beam.sdk |
Provides a simple, powerful model for building both batch and streaming parallel data processing
Pipeline s. |
org.apache.beam.sdk.annotations |
Defines annotations used across the SDK.
|
org.apache.beam.sdk.coders |
Defines
Coders to specify how data is encoded to and
decoded from byte strings. |
org.apache.beam.sdk.extensions.gcp.auth |
Defines classes related to interacting with
Credentials for pipeline
creation and execution containing Google Cloud Platform components. |
org.apache.beam.sdk.extensions.gcp.options |
Defines
PipelineOptions for configuring pipeline execution
for Google Cloud Platform components. |
org.apache.beam.sdk.extensions.gcp.storage |
Defines IO connectors for Google Cloud Storage.
|
org.apache.beam.sdk.extensions.jackson |
Utilities for parsing and creating JSON serialized objects.
|
org.apache.beam.sdk.extensions.joinlibrary |
Utilities for performing SQL-style joins of keyed
PCollections . |
org.apache.beam.sdk.extensions.protobuf |
Defines a
Coder for Protocol Buffers messages, ProtoCoder . |
org.apache.beam.sdk.extensions.sketching |
Utilities for computing statistical indicators using probabilistic sketches.
|
org.apache.beam.sdk.extensions.sorter |
Utility for performing local sort of potentially large sets of values.
|
org.apache.beam.sdk.extensions.sql |
BeamSQL provides a new interface to run a SQL statement with Beam.
|
org.apache.beam.sdk.extensions.sql.example.model |
Java classes used to for modeling the examples.
|
org.apache.beam.sdk.extensions.sql.impl |
Implementation classes of BeamSql.
|
org.apache.beam.sdk.extensions.sql.impl.parser |
Beam SQL parsing additions to Calcite SQL.
|
org.apache.beam.sdk.extensions.sql.impl.planner |
BeamQueryPlanner is the main interface. |
org.apache.beam.sdk.extensions.sql.impl.rel |
BeamSQL specified nodes, to replace
RelNode . |
org.apache.beam.sdk.extensions.sql.impl.rule |
RelOptRule to generate BeamRelNode . |
org.apache.beam.sdk.extensions.sql.impl.schema |
define table schema, to map with Beam IO components.
|
org.apache.beam.sdk.extensions.sql.impl.transform |
PTransform used in a BeamSql pipeline. |
org.apache.beam.sdk.extensions.sql.impl.transform.agg |
Implementation of standard SQL aggregation functions, e.g.
|
org.apache.beam.sdk.extensions.sql.impl.udf |
UDF classes.
|
org.apache.beam.sdk.extensions.sql.impl.utils |
Utility classes.
|
org.apache.beam.sdk.extensions.sql.meta |
Metadata related classes.
|
org.apache.beam.sdk.extensions.sql.meta.provider |
Table providers.
|
org.apache.beam.sdk.extensions.sql.meta.provider.bigquery |
Table schema for BigQuery.
|
org.apache.beam.sdk.extensions.sql.meta.provider.hcatalog |
Table schema for HCatalog.
|
org.apache.beam.sdk.extensions.sql.meta.provider.kafka |
Table schema for KafkaIO.
|
org.apache.beam.sdk.extensions.sql.meta.provider.pubsub |
Table schema for
PubsubIO . |
org.apache.beam.sdk.extensions.sql.meta.provider.test |
Table schema for in-memory test data.
|
org.apache.beam.sdk.extensions.sql.meta.provider.text |
Table schema for text files.
|
org.apache.beam.sdk.extensions.sql.meta.store |
Meta stores.
|
org.apache.beam.sdk.fn |
The top level package for the Fn Execution Java libraries.
|
org.apache.beam.sdk.fn.channel |
gRPC channel management.
|
org.apache.beam.sdk.fn.data |
Classes to interact with the portability framework data plane.
|
org.apache.beam.sdk.fn.function |
Java 8 functional interface extensions.
|
org.apache.beam.sdk.fn.stream |
gRPC stream management.
|
org.apache.beam.sdk.fn.test |
Utilities for testing use of this package.
|
org.apache.beam.sdk.fn.windowing |
Common utilities related to windowing during execution of a pipeline.
|
org.apache.beam.sdk.io | |
org.apache.beam.sdk.io.amqp |
Transforms for reading and writing using AMQP 1.0 protocol.
|
org.apache.beam.sdk.io.aws.options |
Defines
PipelineOptions for configuring pipeline execution
for Amazon Web Services components. |
org.apache.beam.sdk.io.aws.s3 |
Defines IO connectors for Amazon Web Services S3.
|
org.apache.beam.sdk.io.aws.sns |
Defines IO connectors for Amazon Web Services SNS.
|
org.apache.beam.sdk.io.aws.sqs |
Defines IO connectors for Amazon Web Services SQS.
|
org.apache.beam.sdk.io.cassandra |
Transforms for reading and writing from/to Apache Cassandra.
|
org.apache.beam.sdk.io.clickhouse |
Transform for writing to ClickHouse.
|
org.apache.beam.sdk.io.elasticsearch |
Transforms for reading and writing from Elasticsearch.
|
org.apache.beam.sdk.io.fs |
Apache Beam FileSystem interfaces and their default implementations.
|
org.apache.beam.sdk.io.gcp.bigquery |
Defines transforms for reading and writing from Google BigQuery.
|
org.apache.beam.sdk.io.gcp.bigtable |
Defines transforms for reading and writing from Google Cloud Bigtable.
|
org.apache.beam.sdk.io.gcp.common |
Defines common Google Cloud Platform IO support classes.
|
org.apache.beam.sdk.io.gcp.datastore |
Provides an API for reading from and writing to Google Cloud Datastore over different
versions of the Cloud Datastore Client libraries.
|
org.apache.beam.sdk.io.gcp.pubsub |
Defines transforms for reading and writing from Google
Cloud Pub/Sub.
|
org.apache.beam.sdk.io.gcp.spanner |
Provides an API for reading from and writing to Google Cloud Spanner.
|
org.apache.beam.sdk.io.gcp.testing |
Defines utilities for unit testing Google Cloud Platform components of Apache Beam pipelines.
|
org.apache.beam.sdk.io.hadoop |
Classes shared by Hadoop based IOs.
|
org.apache.beam.sdk.io.hadoop.format |
Defines transforms for writing to Data sinks that implement
HadoopFormatIO . |
org.apache.beam.sdk.io.hadoop.inputformat |
Defines transforms for reading from Data sources which implement Hadoop Input Format.
|
org.apache.beam.sdk.io.hbase |
Transforms for reading and writing from/to Apache HBase.
|
org.apache.beam.sdk.io.hcatalog |
Transforms for reading and writing using HCatalog.
|
org.apache.beam.sdk.io.hcatalog.test |
Test utilities for HCatalog IO.
|
org.apache.beam.sdk.io.hdfs |
FileSystem implementation for any Hadoop FileSystem . |
org.apache.beam.sdk.io.jdbc |
Transforms for reading and writing from JDBC.
|
org.apache.beam.sdk.io.jms |
Transforms for reading and writing from JMS (Java Messaging Service).
|
org.apache.beam.sdk.io.kafka |
Transforms for reading and writing from Apache Kafka.
|
org.apache.beam.sdk.io.kafka.serialization |
Kafka serializers and deserializers.
|
org.apache.beam.sdk.io.kinesis |
Transforms for reading and writing from Amazon Kinesis.
|
org.apache.beam.sdk.io.mongodb |
Transforms for reading and writing from MongoDB.
|
org.apache.beam.sdk.io.mqtt |
Transforms for reading and writing from MQTT.
|
org.apache.beam.sdk.io.parquet |
Transforms for reading and writing from Parquet.
|
org.apache.beam.sdk.io.range |
Provides thread-safe helpers for implementing dynamic work rebalancing in position-based bounded
sources.
|
org.apache.beam.sdk.io.redis |
Transforms for reading and writing from Redis.
|
org.apache.beam.sdk.io.solr |
Transforms for reading and writing from/to Solr.
|
org.apache.beam.sdk.io.tika |
Transform for reading and parsing files with Apache Tika.
|
org.apache.beam.sdk.io.xml |
Transforms for reading and writing Xml files.
|
org.apache.beam.sdk.metrics |
Metrics allow exporting information about the execution of a pipeline.
|
org.apache.beam.sdk.options |
Defines
PipelineOptions for configuring pipeline execution. |
org.apache.beam.sdk.schemas | |
org.apache.beam.sdk.schemas.annotations | |
org.apache.beam.sdk.schemas.parser |
Defines utilities for deailing with schemas.
|
org.apache.beam.sdk.schemas.parser.generated |
Defines utilities for deailing with schemas.
|
org.apache.beam.sdk.schemas.transforms |
Defines transforms that work on PCollections with schemas..
|
org.apache.beam.sdk.schemas.utils |
Defines utilities for deailing with schemas.
|
org.apache.beam.sdk.state |
Classes and interfaces for interacting with state.
|
org.apache.beam.sdk.testing |
Defines utilities for unit testing Apache Beam pipelines.
|
org.apache.beam.sdk.transforms |
Defines
PTransform s for transforming data in a pipeline. |
org.apache.beam.sdk.transforms.display |
Defines
HasDisplayData for annotating components
which provide display data used within
UIs and diagnostic tools. |
org.apache.beam.sdk.transforms.join |
Defines the
CoGroupByKey transform for joining
multiple PCollections. |
org.apache.beam.sdk.transforms.splittabledofn |
Defines utilities related to splittable
DoFn . |
org.apache.beam.sdk.transforms.windowing | |
org.apache.beam.sdk.values |
Defines
PCollection and other classes for representing data in
a Pipeline . |
The Apache Beam SDK for Java provides a simple and elegant programming model to express your data processing pipelines; see the Apache Beam website for more information and getting started instructions.
The easiest way to use the Apache Beam SDK for Java is via one of the released artifacts from the Maven Central Repository.
Version numbers use the form major.minor.incremental and are incremented as follows:
Please note that APIs marked
@Experimental
may change at any point and are not guaranteed to remain compatible across versions.