@Experimental(value=SOURCE_SINK) public class KuduIO extends java.lang.Object
For more information, see the online documentation at Kudu.
KuduIO
provides a source to read and returns a bounded collection of entities as
PCollection<T>
. An entity is built by parsing a Kudu RowResult
using the provided
SerializableFunction
.
The following example illustrates various options for configuring the IO:
pipeline.apply(
KuduIO.<String>read()
.withMasterAddresses("kudu1:8051,kudu2:8051,kudu3:8051")
.withTable("table")
.withParseFn(
(SerializableFunction<RowResult, String>) input -> input.getString(COL_NAME))
.withCoder(StringUtf8Coder.of()));
// above options illustrate a typical minimum set, returns PCollection<String>
withCoder(...)
may be omitted if it can be inferred from the @{CoderRegistry}.
However, when using a Lambda Expression or an anonymous inner class to define the function, type
erasure will prohibit this. In such cases you are required to explicitly set the coder as in the
above example.
Optionally, you can provide withPredicates(...)
to apply a query to filter rows from
the kudu table.
Optionally, you can provide withProjectedColumns(...)
to limit the columns returned
from the Kudu scan to improve performance. The columns required in the ParseFn
must be
declared in the projected columns.
Optionally, you can provide withBatchSize(...)
to set the number of bytes returned
from the Kudu scanner in each batch.
Optionally, you can provide withFaultTolerent(...)
to enforce the read scan to resume
a scan on another tablet server if the current server fails.
The Kudu sink executes a set of operations on a single table. It takes as input a PCollection
and a KuduIO.FormatFunction
which is responsible for converting the
input into an idempotent transformation on a row.
To configure a Kudu sink, you must supply the Kudu master addresses, the table name and a
KuduIO.FormatFunction
to convert the input records, for example:
PCollection<MyType> data = ...;
FormatFunction<MyType> fn = ...;
data.apply("write",
KuduIO.write()
.withMasterAddresses("kudu1:8051,kudu2:8051,kudu3:8051")
.withTable("table")
.withFormatFn(fn));
KuduIO
does not support authentication in this release.Modifier and Type | Class and Description |
---|---|
static interface |
KuduIO.FormatFunction<T>
An interface used by the KuduIO Write to convert an input record into an Operation to apply as
a mutation in Kudu.
|
static class |
KuduIO.Read<T>
Implementation of
read() . |
static class |
KuduIO.Write<T>
A
PTransform that writes to Kudu. |
Modifier and Type | Method and Description |
---|---|
static <T> KuduIO.Read<T> |
read() |
static <T> KuduIO.Write<T> |
write() |
public static <T> KuduIO.Read<T> read()
public static <T> KuduIO.Write<T> write()