Class KuduIO
For more information, see the online documentation at Kudu.
Reading from Kudu
KuduIO provides a source to read and returns a bounded collection of entities as
PCollection<T>. An entity is built by parsing a Kudu RowResult using the provided
SerializableFunction.
The following example illustrates various options for configuring the IO:
pipeline.apply(
KuduIO.<String>read()
.withMasterAddresses("kudu1:8051,kudu2:8051,kudu3:8051")
.withTable("table")
.withParseFn(
(SerializableFunction<RowResult, String>) input -> input.getString(COL_NAME))
.withCoder(StringUtf8Coder.of()));
// above options illustrate a typical minimum set, returns PCollection<String>
withCoder(...) may be omitted if it can be inferred from the @{CoderRegistry}.
However, when using a Lambda Expression or an anonymous inner class to define the function, type
erasure will prohibit this. In such cases you are required to explicitly set the coder as in the
above example.
Optionally, you can provide withPredicates(...) to apply a query to filter rows from
the kudu table.
Optionally, you can provide withProjectedColumns(...) to limit the columns returned
from the Kudu scan to improve performance. The columns required in the ParseFn must be
declared in the projected columns.
Optionally, you can provide withBatchSize(...) to set the number of bytes returned
from the Kudu scanner in each batch.
Optionally, you can provide withFaultTolerent(...) to enforce the read scan to resume
a scan on another tablet server if the current server fails.
Writing to Kudu
The Kudu sink executes a set of operations on a single table. It takes as input a PCollection and a KuduIO.FormatFunction which is responsible for converting the
input into an idempotent transformation on a row.
To configure a Kudu sink, you must supply the Kudu master addresses, the table name and a
KuduIO.FormatFunction to convert the input records, for example:
PCollection<MyType> data = ...;
FormatFunction<MyType> fn = ...;
data.apply("write",
KuduIO.write()
.withMasterAddresses("kudu1:8051,kudu2:8051,kudu3:8051")
.withTable("table")
.withFormatFn(fn));
KuduIO does not support authentication in this release.
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic interfaceAn interface used by the KuduIO Write to convert an input record into an Operation to apply as a mutation in Kudu.static classImplementation ofread().static classAPTransformthat writes to Kudu. -
Method Summary
Modifier and TypeMethodDescriptionstatic <T> KuduIO.Read<T> read()static <T> KuduIO.Write<T> write()
-
Method Details
-
read
-
write
-