apache_beam.io.gcp.datastore.v1.helper module

Cloud Datastore helper functions.

For internal use only; no backwards-compatibility guarantees.

apache_beam.io.gcp.datastore.v1.helper.key_comparator(k1, k2)[source]

A comparator for Datastore keys.

Comparison is only valid for keys in the same partition. The comparison here is between the list of paths for each key.

apache_beam.io.gcp.datastore.v1.helper.compare_path(p1, p2)[source]

A comparator for key path.

A path has either an id or a name field defined. The comparison works with the following rules:

1. If one path has id defined while the other doesn’t, then the one with id defined is considered smaller. 2. If both paths have id defined, then their ids are compared. 3. If no id is defined for both paths, then their names are compared.

apache_beam.io.gcp.datastore.v1.helper.get_datastore(project)[source]

Returns a Cloud Datastore client.

apache_beam.io.gcp.datastore.v1.helper.make_request(project, namespace, query)[source]

Make a Cloud Datastore request for the given query.

apache_beam.io.gcp.datastore.v1.helper.make_partition(project, namespace)[source]

Make a PartitionId for the given project and namespace.

apache_beam.io.gcp.datastore.v1.helper.retry_on_rpc_error(exception)[source]

A retry filter for Cloud Datastore RPCErrors.

apache_beam.io.gcp.datastore.v1.helper.fetch_entities(project, namespace, query, datastore)[source]

A helper method to fetch entities from Cloud Datastore.

Parameters:
  • project – Project ID
  • namespace – Cloud Datastore namespace
  • query – Query to be read from
  • datastore – Cloud Datastore Client
Returns:

An iterator of entities.

apache_beam.io.gcp.datastore.v1.helper.is_key_valid(key)[source]

Returns True if a Cloud Datastore key is complete.

A key is complete if its last element has either an id or a name.

apache_beam.io.gcp.datastore.v1.helper.write_mutations(datastore, project, mutations, throttler, rpc_stats_callback=None, throttle_delay=1)[source]

A helper function to write a batch of mutations to Cloud Datastore.

If a commit fails, it will be retried upto 5 times. All mutations in the batch will be committed again, even if the commit was partially successful. If the retry limit is exceeded, the last exception from Cloud Datastore will be raised.

Parameters:
  • datastore – googledatastore.connection.Datastore
  • project – str, project id
  • mutations – list of google.cloud.proto.datastore.v1.datastore_pb2.Mutation
  • rpc_stats_callback – a function to call with arguments successes and failures and throttled_secs; this is called to record successful and failed RPCs to Datastore and time spent waiting for throttling.
  • throttler – AdaptiveThrottler, to use to select requests to be throttled.
  • throttle_delay – float, time in seconds to sleep when throttled.
Returns a tuple of:
CommitResponse, the response from Datastore; int, the latency of the successful RPC in milliseconds.
apache_beam.io.gcp.datastore.v1.helper.make_latest_timestamp_query(namespace)[source]

Make a Query to fetch the latest timestamp statistics.

apache_beam.io.gcp.datastore.v1.helper.make_kind_stats_query(namespace, kind, latest_timestamp)[source]

Make a Query to fetch the latest kind statistics.

class apache_beam.io.gcp.datastore.v1.helper.QueryIterator(project, namespace, query, datastore)[source]

Bases: object

A iterator class for entities of a given query.

Entities are read in batches. Retries on failures.