apache_beam.io.gcp.datastore.v1new.query_splitter module
Implements a Cloud Datastore query splitter.
For internal use only. No backwards compatibility guarantees.
exception apache_beam.io.gcp.datastore.v1new.query_splitter.QuerySplitterError
Bases: exceptions.Exception
Top-level error type.
exception apache_beam.io.gcp.datastore.v1new.query_splitter.SplitNotPossibleError
Bases: apache_beam.io.gcp.datastore.v1new.query_splitter.QuerySplitterError
Raised when some parameter of the query does not allow splitting.
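Because SplitNotPossibleError derives from QuerySplitterError, a caller can catch either the specific or the top-level type. The sketch below illustrates one possible handling pattern, falling back to the unsplit query. The two classes are local stand-ins that only mirror the documented hierarchy, and split_or_fallback and split_fn are hypothetical names, not part of this module.

```python
# Stand-ins mirroring the documented hierarchy; they are NOT the real
# classes from apache_beam.io.gcp.datastore.v1new.query_splitter.
class QuerySplitterError(Exception):
    """Stand-in for the module's top-level error type."""


class SplitNotPossibleError(QuerySplitterError):
    """Stand-in for the error raised when a query cannot be split."""


def split_or_fallback(split_fn, query, num_splits):
    """Hypothetical helper: try to split, else return the query unsplit.

    split_fn is any callable taking (query, num_splits); it is an
    assumption of this sketch, not the module's get_splits signature.
    """
    try:
        return split_fn(query, num_splits)
    except QuerySplitterError:
        # Catching the base type also covers SplitNotPossibleError.
        return [query]


def always_fails(query, num_splits):
    """Toy split function that simulates an unsplittable query."""
    raise SplitNotPossibleError("query cannot be split")


result = split_or_fallback(always_fails, "my-query", 4)
```

Here result is a one-element list holding the original query, so downstream code can treat the split and unsplit cases uniformly.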
apache_beam.io.gcp.datastore.v1new.query_splitter.get_splits(client, query, num_splits)
Returns a list of sharded queries for the given Cloud Datastore query.
This creates up to the desired number of splits, but it may return fewer. That happens when the underlying Datastore provides fewer split points than requested, which occurs when the query has too few results.
This implementation of the QuerySplitter uses the __scatter__ property to gather random split points for a query.
Note: This implementation is derived from the Java query splitter in https://github.com/GoogleCloudPlatform/google-cloud-datastore/blob/master/java/datastore/src/main/java/com/google/datastore/v1/client/QuerySplitterImpl.java
Parameters:
- client – the Datastore client.
- query – the query to split.
- num_splits – the desired number of splits.
Returns: A list of split queries, at most num_splits long.
Raises: QuerySplitterError – if the split could not be performed owing to query or split parameters.
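The idea of turning randomly sampled keys into split points can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the module's actual code: choose_split_keys and key_ranges are hypothetical helpers, integers stand in for Datastore keys, and the even-spacing rule is one simple choice that may differ from the real implementation.

```python
def choose_split_keys(scatter_keys, num_splits):
    """Pick at most num_splits - 1 evenly spaced boundaries.

    scatter_keys stands in for keys sampled via the __scatter__
    property; plain comparable values are used for illustration.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    keys = sorted(set(scatter_keys))
    wanted = num_splits - 1
    if wanted == 0 or not keys:
        return []
    if len(keys) <= wanted:
        # Too few sampled keys: every key becomes a boundary, so the
        # caller ends up with fewer than num_splits ranges.
        return keys
    # Evenly spaced indices into the sorted sample.
    return [keys[(i * len(keys)) // num_splits] for i in range(1, num_splits)]


def key_ranges(split_keys):
    """Turn boundaries into (start, end) ranges; None means unbounded."""
    bounds = [None] + list(split_keys) + [None]
    return list(zip(bounds[:-1], bounds[1:]))


sampled = [7, 2, 9, 0, 4, 1, 8, 3, 6, 5]
full = key_ranges(choose_split_keys(sampled, 4))
short = key_ranges(choose_split_keys([3, 1], 5))
```

With ten sampled keys and four requested splits, full holds four ranges. With only two sampled keys and five requested splits, short holds three ranges, which mirrors the documented case where fewer splits than requested are returned.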