@Experimental(value=SOURCE_SINK) public class SpannerIO extends java.lang.Object
Transforms for reading from and writing to Google Cloud Spanner.
To read from Cloud Spanner, apply the SpannerIO.Read transformation. It returns a PCollection of Structs, where each element represents an individual row returned from the read operation. Both the Query and Read APIs are supported. See the Cloud Spanner documentation for more information about reading from Cloud Spanner.
To execute a query, specify SpannerIO.Read.withQuery(Statement) or SpannerIO.Read.withQuery(String) during the construction of the transform.
PCollection<Struct> rows = p.apply(
    SpannerIO.read()
        .withInstanceId(instanceId)
        .withDatabaseId(dbId)
        .withQuery("SELECT id, name, email FROM users"));
To use the Read API, specify a table name and a list of columns.
PCollection<Struct> rows = p.apply(
    SpannerIO.read()
        .withInstanceId(instanceId)
        .withDatabaseId(dbId)
        .withTable("users")
        .withColumns("id", "name", "email"));
To read efficiently using an index, specify the index name using SpannerIO.Read.withIndex(java.lang.String).
The transform is guaranteed to be executed on a consistent snapshot of data, using read-only transactions. Staleness of data can be controlled using the SpannerIO.Read.withTimestampBound(com.google.cloud.spanner.TimestampBound) or SpannerIO.Read.withTimestamp(Timestamp) methods. Read more about transactions in the Cloud Spanner documentation.
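To make the idea of a read at a fixed timestamp concrete, the following is a minimal sketch of a multi-version value. It is a conceptual model only: the class SnapshotReadSketch, its methods, and the integer timestamps are hypothetical and are not part of SpannerIO or the Cloud Spanner client. It shows that a read pinned to a timestamp keeps returning the same data even when newer commits arrive.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy multi-version cell; a conceptual model only, not how Spanner or SpannerIO is implemented. */
final class SnapshotReadSketch {
    // Committed versions of one value, keyed by commit timestamp (hypothetical units).
    private final TreeMap<Long, String> versions = new TreeMap<>();

    /** Records a value as committed at the given timestamp. */
    void write(long commitTimestamp, String value) {
        versions.put(commitTimestamp, value);
    }

    /** Returns the newest value committed at or before readTimestamp, or null if none exists. */
    String readAt(long readTimestamp) {
        Map.Entry<Long, String> entry = versions.floorEntry(readTimestamp);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        SnapshotReadSketch cell = new SnapshotReadSketch();
        cell.write(10, "alice@old.example");
        long snapshot = 15;                         // read timestamp fixed up front
        cell.write(20, "alice@new.example");        // commit that lands after the snapshot
        System.out.println(cell.readAt(snapshot));  // alice@old.example
        System.out.println(cell.readAt(25));        // alice@new.example
    }
}
```

In this model, choosing an older read timestamp corresponds to accepting more staleness, while a read at the latest timestamp corresponds to a strong read.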
It is possible to read several PCollections within a single transaction. Apply the createTransaction() transform, which lazily creates a transaction. The result of this transformation can be passed to a read operation using SpannerIO.Read.withTransaction(PCollectionView).
SpannerConfig spannerConfig = ...
PCollectionView<Transaction> tx =
    p.apply(
        SpannerIO.createTransaction()
            .withSpannerConfig(spannerConfig)
            .withTimestampBound(TimestampBound.strong()));
PCollection<Struct> users = p.apply(
    SpannerIO.read()
        .withSpannerConfig(spannerConfig)
        .withQuery("SELECT name, email FROM users")
        .withTransaction(tx));
PCollection<Struct> tweets = p.apply(
    SpannerIO.read()
        .withSpannerConfig(spannerConfig)
        .withQuery("SELECT user, tweet, date FROM tweets")
        .withTransaction(tx));
The Cloud Spanner SpannerIO.Write transform writes to Cloud Spanner by executing a collection of input row Mutations. The mutations are grouped into batches for efficiency.
To configure the write transform, create an instance using write() and then specify the destination Cloud Spanner instance (SpannerIO.Write.withInstanceId(String)) and destination database (SpannerIO.Write.withDatabaseId(String)). For example:
// Earlier in the pipeline, create a PCollection of Mutations to be written to Cloud Spanner.
PCollection<Mutation> mutations = ...;
// Write mutations.
mutations.apply(
    "Write", SpannerIO.write().withInstanceId("instance").withDatabaseId("database"));
The default batch size is 1MB; to override it, use SpannerIO.Write.withBatchSizeBytes(long). Setting the batch size to a small value or zero practically disables batching.
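The effect of the batch size limit can be sketched with a simplified size-based batcher. This is an illustration under stated assumptions, not SpannerIO's actual batching code: the class BatchSizeSketch and its batch method are hypothetical, and mutations are represented only by their sizes in bytes. A batch is closed when adding the next mutation would exceed the limit, and every batch holds at least one mutation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified size-based batcher; an illustration only, not SpannerIO's actual batching code. */
final class BatchSizeSketch {
    /**
     * Groups mutation sizes (in bytes) into batches. A batch is closed as soon as adding the
     * next mutation would push it past maxBatchBytes; every batch holds at least one mutation.
     */
    static List<List<Long>> batch(List<Long> mutationSizes, long maxBatchBytes) {
        List<List<Long>> batches = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : mutationSizes) {
            if (!current.isEmpty() && currentBytes + size > maxBatchBytes) {
                batches.add(current);          // close the full batch
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);              // flush the final partial batch
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(400_000L, 400_000L, 400_000L);
        System.out.println(batch(sizes, 1_000_000L).size()); // 2
        System.out.println(batch(sizes, 0L).size());         // 3
    }
}
```

With a limit of zero, no second mutation ever fits, so each mutation ends up in its own batch. In this model, that is what "practically disables batching" means.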
The transform does not provide the same transactional guarantees as Cloud Spanner. Use MutationGroup to ensure that a small set of mutations is bundled together. It is guaranteed that mutations in a group are submitted in the same transaction. To write groups, build a SpannerIO.Write transform and call its SpannerIO.Write.grouped() method. It returns a transformation that can be applied to a PCollection of MutationGroup.
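The guarantee that a group is submitted in one transaction can be pictured as an all-or-nothing apply. The sketch below is a conceptual model only: MutationGroupSketch, applyGroup, and the rule that negative values are rejected are all hypothetical stand-ins, and nothing here uses the real MutationGroup or SpannerIO classes. It shows that when one write in a group fails, none of the group's writes become visible.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy all-or-nothing apply; a conceptual model of one transaction per group, not SpannerIO code. */
final class MutationGroupSketch {
    /**
     * Applies every (key, value) write in the group to the table, or none of them.
     * Negative values stand in for any write the database would reject (hypothetical rule).
     */
    static boolean applyGroup(Map<String, Integer> table, Map<String, Integer> group) {
        for (int value : group.values()) {
            if (value < 0) {
                return false;       // abort: the table is left untouched
            }
        }
        table.putAll(group);        // commit all writes of the group together
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new HashMap<>();

        Map<String, Integer> bad = new HashMap<>();
        bad.put("users/1", 1);
        bad.put("accounts/1", -1);  // rejected write
        System.out.println(applyGroup(table, bad) + " " + table.size());  // false 0

        Map<String, Integer> good = new HashMap<>();
        good.put("users/1", 1);
        good.put("accounts/1", 2);
        System.out.println(applyGroup(table, good) + " " + table.size()); // true 2
    }
}
```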
Modifier and Type | Class and Description |
---|---|
static class | SpannerIO.CreateTransaction: A PTransform that creates a transaction. |
static class | SpannerIO.FailureMode: A failure handling strategy. |
static class | SpannerIO.Read: Implementation of read(). |
static class | SpannerIO.ReadAll: Implementation of readAll(). |
static class | SpannerIO.Write: A PTransform that writes Mutation objects to Google Cloud Spanner. |
static class | SpannerIO.WriteGrouped: Same as SpannerIO.Write but supports grouped mutations. |
Modifier and Type | Method and Description |
---|---|
static SpannerIO.CreateTransaction | createTransaction(): Returns a transform that creates a batch transaction. |
static SpannerIO.Read | read(): Creates an uninitialized instance of SpannerIO.Read. |
static SpannerIO.ReadAll | readAll() |
static SpannerIO.Write | write(): Creates an uninitialized instance of SpannerIO.Write. |
@Experimental(value=SOURCE_SINK) public static SpannerIO.Read read()
Creates an uninitialized instance of SpannerIO.Read. Before use, the SpannerIO.Read must be configured with a SpannerIO.Read.withInstanceId(java.lang.String) and SpannerIO.Read.withDatabaseId(java.lang.String) that identify the Cloud Spanner database.
@Experimental(value=SOURCE_SINK) public static SpannerIO.ReadAll readAll()
@Experimental public static SpannerIO.CreateTransaction createTransaction()
Returns a transform that creates a batch transaction. By default, a TimestampBound.strong() transaction is created; to override this, use SpannerIO.CreateTransaction.withTimestampBound(TimestampBound).
@Experimental public static SpannerIO.Write write()
Creates an uninitialized instance of SpannerIO.Write. Before use, the SpannerIO.Write must be configured with a SpannerIO.Write.withInstanceId(java.lang.String) and SpannerIO.Write.withDatabaseId(java.lang.String) that identify the Cloud Spanner database being written.