Class FirestoreV1

java.lang.Object
org.apache.beam.sdk.io.gcp.firestore.FirestoreV1

@Immutable public final class FirestoreV1 extends Object
FirestoreV1 provides an API which provides lifecycle managed PTransforms for Cloud Firestore v1 API.

This class is part of the Firestore Connector DSL and should be accessed via FirestoreIO.v1().

All PTransforms provided by this API use GcpOptions on PipelineOptions for credentials access and projectId resolution. As such, the lifecycle of gRPC clients and project information is scoped to the bundle level, not the worker level.

Operations

Read

The currently supported read operations and their execution behavior are as follows:

RPC Execution Behavior Example Usage
FirestoreV1.PartitionQuery Parallel Streaming
 PCollectioninvalid input: '<'PartitionQueryRequest> partitionQueryRequests = ...;
 PCollectioninvalid input: '<'RunQueryRequest> runQueryRequests = partitionQueryRequests
     .apply(FirestoreIO.v1().read().partitionQuery().build());
 PCollectioninvalid input: '<'RunQueryResponse> runQueryResponses = runQueryRequests
     .apply(FirestoreIO.v1().read().runQuery().build());
 
FirestoreV1.RunQuery Sequential Streaming
 PCollectioninvalid input: '<'RunQueryRequest> runQueryRequests = ...;
 PCollectioninvalid input: '<'RunQueryResponse> runQueryResponses = runQueryRequests
     .apply(FirestoreIO.v1().read().runQuery().build());
 
FirestoreV1.BatchGetDocuments Sequential Streaming
 PCollectioninvalid input: '<'BatchGetDocumentsRequest> batchGetDocumentsRequests = ...;
 PCollectioninvalid input: '<'BatchGetDocumentsResponse> batchGetDocumentsResponses = batchGetDocumentsRequests
     .apply(FirestoreIO.v1().read().batchGetDocuments().build());
 
FirestoreV1.ListCollectionIds Sequential Paginated
 PCollectioninvalid input: '<'ListCollectionIdsRequest> listCollectionIdsRequests = ...;
 PCollectioninvalid input: '<'ListCollectionIdsResponse> listCollectionIdsResponses = listCollectionIdsRequests
     .apply(FirestoreIO.v1().read().listCollectionIds().build());
 
FirestoreV1.ListDocuments Sequential Paginated
 PCollectioninvalid input: '<'ListDocumentsRequest> listDocumentsRequests = ...;
 PCollectioninvalid input: '<'ListDocumentsResponse> listDocumentsResponses = listDocumentsRequests
     .apply(FirestoreIO.v1().read().listDocuments().build());
 

PartitionQuery should be preferred over other options if at all possible, becuase it has the ability to parallelize execution of multiple queries for specific sub-ranges of the full results. When choosing the value to set for PartitionQueryRequest.Builder.setPartitionCount(long), ensure you are picking a value this makes sense for your data set and your max number of workers. If you find that a partition query is taking a unexpectedly long time, try increasing the number of partitions. Depending on how large your dataset is increasing as much as 10x can significantly reduce total partition query wall time.

You should only ever use ListDocuments if the use of show_missing is needed to access a document. RunQuery and PartitionQuery will always be faster if the use of show_missing is not needed.

Write

To write a PCollection to Cloud Firestore use write(), picking the behavior of the writer.

Writes use Cloud Firestore's BatchWrite api which provides fine grained write semantics.

The default behavior is to fail a bundle if any single write fails with a non-retryable error.

 PCollectioninvalid input: '<'Write> writes = ...;
 PCollectioninvalid input: '<'FirestoreV1.WriteSuccessSummary> sink = writes
     .apply(FirestoreIO.v1().write().batchWrite().build());
 
Alternatively, if you'd rather output write failures to a Dead Letter Queue add withDeadLetterQueue when building your writer.
 PCollectioninvalid input: '<'Write> writes = ...;
 PCollectioninvalid input: '<'FirestoreV1.WriteFailure> writeFailures = writes
     .apply(FirestoreIO.v1().write().batchWrite().withDeadLetterQueue().build());
 

Permissions

Permission requirements depend on the PipelineRunner that is used to execute the pipeline. Please refer to the documentation of corresponding PipelineRunners for more details.

Please see Security for server client libraries > Roles for security and permission related information specific to Cloud Firestore.

Optionally, Cloud Firestore V1 Emulator, running locally, could be used for testing purposes by providing the host port information via FirestoreOptions.setEmulatorHost(String). In such a case, all the Cloud Firestore API calls are directed to the Emulator.

See Also:
  • Method Details

    • read

      public FirestoreV1.Read read()
      The class returned by this method provides the ability to create PTransforms for read operations available in the Firestore V1 API provided by FirestoreStub.

      This method is part of the Firestore Connector DSL and should be accessed via FirestoreIO.v1().

      Returns:
      Type safe builder factory for read operations.
      See Also:
    • write

      public FirestoreV1.Write write()
      The class returned by this method provides the ability to create PTransforms for write operations available in the Firestore V1 API provided by FirestoreStub.

      This method is part of the Firestore Connector DSL and should be accessed via FirestoreIO.v1().

      Returns:
      Type safe builder factory for write operations.
      See Also: