Class DatastoreV1.Read

java.lang.Object
org.apache.beam.sdk.transforms.PTransform<PBegin,PCollection<com.google.datastore.v1.Entity>>
org.apache.beam.sdk.io.gcp.datastore.DatastoreV1.Read
All Implemented Interfaces:
Serializable, HasDisplayData
Enclosing class:
DatastoreV1

public abstract static class DatastoreV1.Read extends PTransform<PBegin,PCollection<com.google.datastore.v1.Entity>>
A PTransform that reads the result rows of a Cloud Datastore query as Entity objects.
  • Field Details

    • NUM_QUERY_SPLITS_MAX

      public static final int NUM_QUERY_SPLITS_MAX
      An upper bound on the number of splits for a query.
  • Constructor Details

    • Read

      public Read()
  • Method Details

    • getProjectId

      public abstract @Nullable ValueProvider<String> getProjectId()
    • getDatabaseId

      public abstract @Nullable ValueProvider<String> getDatabaseId()
    • getQuery

      public abstract @Nullable com.google.datastore.v1.Query getQuery()
    • getLiteralGqlQuery

      public abstract @Nullable ValueProvider<String> getLiteralGqlQuery()
    • getNamespace

      public abstract @Nullable ValueProvider<String> getNamespace()
    • getNumQuerySplits

      public abstract int getNumQuerySplits()
    • getLocalhost

      public abstract @Nullable String getLocalhost()
    • getReadTime

      public abstract @Nullable Instant getReadTime()
    • toString

      public abstract String toString()
      Overrides:
      toString in class PTransform<PBegin,PCollection<com.google.datastore.v1.Entity>>
    • withDatabaseId

      public DatastoreV1.Read withDatabaseId(String databaseId)
      Returns a new DatastoreV1.Read that reads from the Cloud Datastore for the specified database.
    • withProjectId

      public DatastoreV1.Read withProjectId(String projectId)
      Returns a new DatastoreV1.Read that reads from the Cloud Datastore for the specified project.
    • withProjectId

      public DatastoreV1.Read withProjectId(ValueProvider<String> projectId)
    • withQuery

      public DatastoreV1.Read withQuery(com.google.datastore.v1.Query query)
      Returns a new DatastoreV1.Read that reads the results of the specified query.

      Note: Normally, DatastoreIO will read from Cloud Datastore in parallel across many workers. However, when the Query is configured with a limit using Query.Builder.setLimit(com.google.protobuf.Int32Value), then all results will be read by a single worker in order to ensure correct results.

    • withLiteralGqlQuery

      public DatastoreV1.Read withLiteralGqlQuery(String gqlQuery)
      Returns a new DatastoreV1.Read that reads the results of the specified GQL query. See the GQL Reference for details on GQL grammar.

      Note: This query is executed with literals allowed, so users should ensure that the query originates from a trusted source to avoid injection vulnerabilities (analogous to SQL injection).

      Cloud Datastore does not provide a clean way to translate a GQL query string to a Query, so we make a request to the service for the translation. This request may read actual data, although only a small amount. This method needs more validation through production use cases before it can be marked as stable.
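      One way to act on the trusted-source note above is to build the GQL string only from values checked against an allowlist. The sketch below is a minimal illustration, not part of DatastoreV1: the class name, the method name, and the kinds "Task" and "Project" are all hypothetical.

```java
import java.util.Set;

/** Illustrative helper; none of these names belong to Beam or Cloud Datastore. */
final class GqlKindAllowlistSketch {
  /** Hypothetical set of kinds the application is willing to query. */
  static final Set<String> ALLOWED_KINDS = Set.of("Task", "Project");

  /** Builds a GQL string only for an allowlisted kind and rejects anything else. */
  static String selectAllFrom(String kind) {
    if (!ALLOWED_KINDS.contains(kind)) {
      throw new IllegalArgumentException("Kind not allowed: " + kind);
    }
    // Safe to concatenate: kind is one of the fixed strings above.
    return "SELECT * FROM " + kind;
  }

  public static void main(String[] args) {
    System.out.println(selectAllFrom("Task"));
  }
}
```

      The resulting string could then be passed to withLiteralGqlQuery(String). Untrusted input never reaches the query text, because only the fixed allowlisted strings are concatenated.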

    • withLiteralGqlQuery

      public DatastoreV1.Read withLiteralGqlQuery(ValueProvider<String> gqlQuery)
    • withNamespace

      public DatastoreV1.Read withNamespace(String namespace)
      Returns a new DatastoreV1.Read that reads from the given namespace.
    • withNamespace

      public DatastoreV1.Read withNamespace(ValueProvider<String> namespace)
    • withNumQuerySplits

      public DatastoreV1.Read withNumQuerySplits(int numQuerySplits)
      Returns a new DatastoreV1.Read that reads by splitting the given query into numQuerySplits.

      The semantics of query splitting are defined below:

      • Any value less than or equal to 0 will be ignored, and the number of splits will be chosen dynamically at runtime based on the query data size.
      • Any value greater than NUM_QUERY_SPLITS_MAX will be capped at NUM_QUERY_SPLITS_MAX.
      • If the query has a user limit set, or contains inequality filters, then numQuerySplits will be ignored and no split will be performed.
      • In certain cases Cloud Datastore is unable to split the query into the requested number of splits. In such cases, whatever number of splits Cloud Datastore returns is used.
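      The rules above can be summarized in a small, self-contained sketch. It mirrors only the documented behavior and is not the actual DatastoreV1 implementation. The class and method names are hypothetical, the value of ASSUMED_SPLITS_MAX is a placeholder for NUM_QUERY_SPLITS_MAX, and representing "no split" as a single split is a modeling choice made here.

```java
import java.util.OptionalInt;

/** Illustrative only: models the documented rules, not DatastoreV1's code. */
final class QuerySplitSketch {
  /** Placeholder for DatastoreV1.Read.NUM_QUERY_SPLITS_MAX; the value is assumed. */
  static final int ASSUMED_SPLITS_MAX = 50_000;

  /**
   * Returns the number of splits that would be requested, or empty when the
   * number is chosen dynamically at runtime. A result of 1 means "no split".
   */
  static OptionalInt requestedSplits(
      int numQuerySplits, boolean hasLimit, boolean hasInequalityFilter) {
    if (hasLimit || hasInequalityFilter) {
      // numQuerySplits is ignored and the query is not split.
      return OptionalInt.of(1);
    }
    if (numQuerySplits <= 0) {
      // Ignored; the count is derived from the query data size at runtime.
      return OptionalInt.empty();
    }
    // Values above the maximum are capped. Cloud Datastore may still return
    // fewer splits than requested, which this sketch does not model.
    return OptionalInt.of(Math.min(numQuerySplits, ASSUMED_SPLITS_MAX));
  }

  public static void main(String[] args) {
    System.out.println(requestedSplits(0, false, false));
    System.out.println(requestedSplits(12, false, false));
    System.out.println(requestedSplits(Integer.MAX_VALUE, false, false));
    System.out.println(requestedSplits(12, true, false));
  }
}
```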
    • withLocalhost

      public DatastoreV1.Read withLocalhost(String localhost)
      Returns a new DatastoreV1.Read that reads from a Datastore Emulator running at the given localhost address.
    • withReadTime

      public DatastoreV1.Read withReadTime(Instant readTime)
      Returns a new DatastoreV1.Read that reads at the specified readTime.
    • getNumEntities

      public long getNumEntities(PipelineOptions options, String ourKind, @Nullable String namespace)
      Returns the number of entities available for reading.
    • expand

      public PCollection<com.google.datastore.v1.Entity> expand(PBegin input)
      Description copied from class: PTransform
      Override this method to specify how this PTransform should be expanded on the given InputT.

      NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.

      Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).

      Specified by:
      expand in class PTransform<PBegin,PCollection<com.google.datastore.v1.Entity>>
    • populateDisplayData

      public void populateDisplayData(DisplayData.Builder builder)
      Description copied from class: PTransform
      Register display data for the given transform or component.

      populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.

      By default, does not register any display data. Implementors may override this method to provide their own display data.

      Specified by:
      populateDisplayData in interface HasDisplayData
      Overrides:
      populateDisplayData in class PTransform<PBegin,PCollection<com.google.datastore.v1.Entity>>
      Parameters:
      builder - The builder to populate with display data.