Class KuduIO
For more information, see the online documentation at Kudu.
Reading from Kudu
KuduIO
provides a source to read and returns a bounded collection of entities as
PCollection<T>
. An entity is built by parsing a Kudu RowResult
using the provided
SerializableFunction
.
The following example illustrates various options for configuring the IO:
pipeline.apply(
KuduIO.<String>read()
.withMasterAddresses("kudu1:8051,kudu2:8051,kudu3:8051")
.withTable("table")
.withParseFn(
(SerializableFunction<RowResult, String>) input -> input.getString(COL_NAME))
.withCoder(StringUtf8Coder.of()));
// above options illustrate a typical minimum set, returns PCollection<String>
withCoder(...)
may be omitted if it can be inferred from the @{CoderRegistry}.
However, when using a Lambda Expression or an anonymous inner class to define the function, type
erasure will prohibit this. In such cases you are required to explicitly set the coder as in the
above example.
Optionally, you can provide withPredicates(...)
to apply a query to filter rows from
the kudu table.
Optionally, you can provide withProjectedColumns(...)
to limit the columns returned
from the Kudu scan to improve performance. The columns required in the ParseFn
must be
declared in the projected columns.
Optionally, you can provide withBatchSize(...)
to set the number of bytes returned
from the Kudu scanner in each batch.
Optionally, you can provide withFaultTolerent(...)
to enforce the read scan to resume
a scan on another tablet server if the current server fails.
Writing to Kudu
The Kudu sink executes a set of operations on a single table. It takes as input a PCollection
and a KuduIO.FormatFunction
which is responsible for converting the
input into an idempotent transformation on a row.
To configure a Kudu sink, you must supply the Kudu master addresses, the table name and a
KuduIO.FormatFunction
to convert the input records, for example:
PCollection<MyType> data = ...;
FormatFunction<MyType> fn = ...;
data.apply("write",
KuduIO.write()
.withMasterAddresses("kudu1:8051,kudu2:8051,kudu3:8051")
.withTable("table")
.withFormatFn(fn));
KuduIO
does not support authentication in this release.
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic interface
An interface used by the KuduIO Write to convert an input record into an Operation to apply as a mutation in Kudu.static class
Implementation ofread()
.static class
APTransform
that writes to Kudu. -
Method Summary
Modifier and TypeMethodDescriptionstatic <T> KuduIO.Read
<T> read()
static <T> KuduIO.Write
<T> write()
-
Method Details
-
read
-
write
-