apache_beam.runners.interactive.dataproc.dataproc_cluster_manager module¶
-
class
apache_beam.runners.interactive.dataproc.dataproc_cluster_manager.
DataprocClusterManager
(cluster_metadata: apache_beam.runners.interactive.dataproc.types.MasterURLIdentifier)[source]¶ Bases:
object
The DataprocClusterManager object simplifies the operations required for creating and deleting Dataproc clusters for use under Interactive Beam.
Initializes the DataprocClusterManager with properties required to interface with the Dataproc ClusterControllerClient.
-
create_cluster
(cluster: dict) → None[source]¶ Attempts to create a cluster using attributes that were initialized with the DataprocClusterManager instance.
Parameters: cluster – Dictionary representing Dataproc cluster. Read more about the schema for clusters here: https://cloud.google.com/python/docs/reference/dataproc/latest/google.cloud.dataproc_v1.types.Cluster
-
create_flink_cluster
() → None[source]¶ Calls _create_cluster with a configuration that enables FlinkRunner.
-
cleanup
() → None[source]¶ Deletes the cluster that uses the attributes initialized with the DataprocClusterManager instance.
-
get_cluster_details
(cluster_metadata: apache_beam.runners.interactive.dataproc.types.MasterURLIdentifier) → google.cloud.dataproc_v1.types.clusters.Cluster[source]¶ Gets the Dataproc_v1 Cluster object for the current cluster manager.
-
wait_for_cluster_to_provision
(cluster_metadata: apache_beam.runners.interactive.dataproc.types.MasterURLIdentifier) → None[source]¶
-
get_staging_location
(cluster_metadata: apache_beam.runners.interactive.dataproc.types.MasterURLIdentifier) → str[source]¶ Gets the staging bucket of an existing Dataproc cluster.
-
parse_master_url_and_dashboard
(cluster_metadata: apache_beam.runners.interactive.dataproc.types.MasterURLIdentifier, line: str) → Tuple[str, str][source]¶ Parses the master_url and YARN application_id of the Flink process from an input line. The line containing both the master_url and application id is always formatted as such: {text} Found Web Interface {master_url} of application ‘{application_id}’.n
Truncated example where ‘…’ represents additional text between segments: … google-dataproc-startup[000]: … activate-component-flink[0000]: …org.apache.flink.yarn.YarnClusterDescriptor… [] - Found Web Interface example-master-url:50000 of application ‘application_123456789000_0001’.
Returns the flink_master_url and dashboard link as a tuple.
-