Package org.apache.beam.sdk.managed
Class Managed
java.lang.Object
org.apache.beam.sdk.managed.Managed
Top-level PTransforms that build and instantiate turnkey transforms.
Available transforms
This API currently supports two operations: read(java.lang.String) and write(java.lang.String).
Please check the Managed IO
configuration page to see available transforms and config options.
Building a Managed turnkey transform
Turnkey transforms are represented as SchemaTransforms, which means each one has a
defined configuration. A given transform can be built with a Map<String, Object> that
specifies its arguments, like so:
PCollection<Row> rows = pipeline.apply(
    Managed.read(ICEBERG)
        .withConfig(ImmutableMap.<String, Object>builder()
            .put("foo", "abc")
            .put("bar", 123)
            .build()))
    .getOutput();
Instead of specifying configuration arguments directly in the code, one can provide the
location of a YAML file that contains this information. Say we have the following
config.yaml file:
foo: "abc"
bar: 123
The file's path can be passed in to the Managed API like so:
PCollection<Row> inputRows = pipeline.apply(Create.of(...));
inputRows.apply(Managed.write(ICEBERG).withConfigUrl("path/to/config.yaml"));
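To make the correspondence between the two configuration styles concrete, the following is a minimal sketch that renders a flat Map<String, Object> as the YAML text shown above, using only the JDK. The class ConfigYamlSketch and its toYaml method are hypothetical helpers for illustration and are not part of the Managed API; the sketch assumes a flat map whose values are strings, numbers, or booleans.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Beam: renders a flat config map as YAML text.
final class ConfigYamlSketch {

  // Converts each entry to a "key: value" line. Strings are double-quoted,
  // numbers and booleans are written as-is. Nested values are not supported.
  static String toYaml(Map<String, Object> config) {
    StringBuilder yaml = new StringBuilder();
    for (Map.Entry<String, Object> entry : config.entrySet()) {
      Object value = entry.getValue();
      yaml.append(entry.getKey()).append(": ");
      if (value instanceof String) {
        // Escape backslashes first, then double quotes.
        String escaped = ((String) value).replace("\\", "\\\\").replace("\"", "\\\"");
        yaml.append('"').append(escaped).append('"');
      } else if (value instanceof Number || value instanceof Boolean) {
        yaml.append(value);
      } else {
        throw new IllegalArgumentException("Unsupported value for key " + entry.getKey());
      }
      yaml.append('\n');
    }
    return yaml.toString();
  }

  public static void main(String[] args) {
    // Same arguments as the withConfig example; insertion order is preserved.
    Map<String, Object> config = new LinkedHashMap<>();
    config.put("foo", "abc");
    config.put("bar", 123);
    System.out.print(toYaml(config));
  }
}
```

The resulting text could be saved as config.yaml and its path passed to withConfigUrl, so that the same arguments reach the transform either way.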
Runner specific features
Google Cloud Dataflow supports additional management features for Managed, including
automatically upgrading transforms to the latest supported version. For more details and
examples, please see Dataflow managed I/O.
Method Summary
Modifier and Type | Method | Description
static Managed.ManagedTransform | read(java.lang.String) | Instantiates a Managed.ManagedTransform transform for the specified source.
static Managed.ManagedTransform | write(java.lang.String) | Instantiates a Managed.ManagedTransform transform for the specified sink.
Field Details
ICEBERG
ICEBERG_CDC
KAFKA
BIGQUERY
READ_TRANSFORMS
WRITE_TRANSFORMS
Constructor Details
Managed
public Managed()
Method Details
read
Instantiates a Managed.ManagedTransform transform for the specified source. The supported managed sources are:
- ICEBERG: Read from Apache Iceberg tables using IcebergIO
- ICEBERG_CDC: CDC Read from Apache Iceberg tables using IcebergIO
- KAFKA: Read from Apache Kafka topics using KafkaIO
- BIGQUERY: Read from GCP BigQuery tables using BigQueryIO
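When the source name comes from user input or a configuration file, it can help to check it against the supported sources before building the transform. The sketch below is one way to do that with only the JDK. The class ManagedSourceSketch, the method connectorFor, and the lowercase string keys are assumptions made for illustration; real code should rely on the constants declared on Managed (ICEBERG, ICEBERG_CDC, KAFKA, BIGQUERY) rather than on these literals.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup table, not part of Beam: maps a source name to the
// connector named in the list above. The lowercase keys are assumed values.
final class ManagedSourceSketch {

  static final Map<String, String> SOURCES = new LinkedHashMap<>();

  static {
    SOURCES.put("iceberg", "IcebergIO");
    SOURCES.put("iceberg_cdc", "IcebergIO");
    SOURCES.put("kafka", "KafkaIO");
    SOURCES.put("bigquery", "BigQueryIO");
  }

  // Returns the connector for a source name, ignoring case,
  // or throws if the name is not in the table.
  static String connectorFor(String source) {
    String connector = SOURCES.get(source.toLowerCase(Locale.ROOT));
    if (connector == null) {
      throw new IllegalArgumentException(
          "Unsupported source '" + source + "'; expected one of " + SOURCES.keySet());
    }
    return connector;
  }

  public static void main(String[] args) {
    System.out.println(connectorFor("KAFKA"));
  }
}
```

Failing early with the list of accepted names gives a clearer error than one raised later during pipeline construction.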
write
Instantiates a Managed.ManagedTransform transform for the specified sink. The supported managed sinks are: