apache_beam.runners.interactive.dataproc.dataproc_cluster_manager module¶
-
class
apache_beam.runners.interactive.dataproc.dataproc_cluster_manager.
DataprocClusterManager
(cluster_metadata: apache_beam.runners.interactive.dataproc.types.ClusterMetadata)[source]¶ Bases:
object
Self-contained cluster manager that controls the lifecyle of a Dataproc cluster connected by one or more pipelines under Interactive Beam.
Initializes the DataprocClusterManager with properties required to interface with the Dataproc ClusterControllerClient.
-
create_cluster
(cluster: dict) → None[source]¶ Attempts to create a cluster using attributes that were initialized with the DataprocClusterManager instance.
Parameters: cluster – Dictionary representing Dataproc cluster. Read more about the schema for clusters here: https://cloud.google.com/python/docs/reference/dataproc/latest/google.cloud.dataproc_v1.types.Cluster
-
create_flink_cluster
() → None[source]¶ Calls _create_cluster with a configuration that enables FlinkRunner.
-
cleanup
() → None[source]¶ Deletes the cluster that uses the attributes initialized with the DataprocClusterManager instance.
-
get_cluster_details
() → google.cloud.dataproc_v1.types.clusters.Cluster[source]¶ Gets the Dataproc_v1 Cluster object for the current cluster manager.
-
parse_master_url_and_dashboard
(line: str) → Tuple[str, str][source]¶ Parses the master_url and YARN application_id of the Flink process from an input line. The line containing both the master_url and application id is always formatted as such: {text} Found Web Interface {master_url} of application ‘{application_id}’.n
Truncated example where ‘…’ represents additional text between segments: … google-dataproc-startup[000]: … activate-component-flink[0000]: …org.apache.flink.yarn.YarnClusterDescriptor… [] - Found Web Interface example-master-url:50000 of application ‘application_123456789000_0001’.
Returns the flink_master_url and dashboard link as a tuple.
-