@Experimental(value=SOURCE_SINK) public final class KinesisIO extends java.lang.Object
Example usages:
 p.apply(KinesisIO.read()
     .withStreamName("streamName")
     .withInitialPositionInStream(InitialPositionInStream.LATEST)
  .apply( ... ) // other transformations
 At a minimum you have to provide:
InitialPositionInStream.LATEST, InitialPositionInStream.TRIM_HORIZON, or
       alternatively, using an arbitrary point in time with KinesisIO.Read.withInitialTimestampInStream(Instant).
 Kinesis IO uses arrival time for watermarks by default. To use processing time instead, use
 KinesisIO.Read.withProcessingTimeWatermarkPolicy():
 
 p.apply(KinesisIO.read()
    .withStreamName("streamName")
    .withInitialPositionInStream(InitialPositionInStream.LATEST)
    .withProcessingTimeWatermarkPolicy())
 It is also possible to specify a custom watermark policy to control watermark computation
 using KinesisIO.Read.withCustomWatermarkPolicy(WatermarkPolicyFactory). This requires implementing
 WatermarkPolicy with a corresponding WatermarkPolicyFactory.
 
By default Kinesis IO will poll the Kinesis getRecords() API as fast as possible as
 long as records are returned. The RateLimitPolicyFactory.DefaultRateLimiter will start
 throttling once getRecords() returns an empty response or if API calls get throttled by
 AWS.
 
A RateLimitPolicy is always applied to each shard individually.
 
You may provide a custom rate limit policy using KinesisIO.Read.withCustomRateLimitPolicy(RateLimitPolicyFactory). This requires implementing RateLimitPolicy with a corresponding RateLimitPolicyFactory.
 
Example usages:
 PCollection<KV<String, byte[]>> data = ...;
 data.apply(KinesisIO.write()
     .withStreamName("streamName")
     .withPartitionKey(KV::getKey)
     .withSerializer(KV::getValue);
 Note: Usage of KV is just for illustration purposes here.
 
At a minimum you have to provide:
KinesisPartitioner to distribute records across shards of the stream
   ClientConfiguration, see below.
 KinesisPartitioner is one of the
 key considerations when writing to Kinesis. Typically, you should aime to evenly distribute data
 across all shards of the stream.
 Partition keys are used as input to a hash function that maps the partition key and associated data to a specific shard. If the cardinality of your partition keys is of the same order of magnitude as the number of shards in the stream, the hash function will likely not distribute your keys evenly among shards. This may result in heavily skewed shards with some shards not utilized at all.
If you require finer control over the distribution of records, override KinesisPartitioner.getExplicitHashKey(Object) according to your needs. However, this might
 impact record aggregation.
 
Records of the same effective hash key get aggregated. The effective hash key is:
To provide shard aware aggregation in 2., hash key ranges of shards are loaded and refreshed periodically. This allows to aggregate records into a number of aggregates that matches the number of shards in the stream to max out Kinesis API limits the best possible way.
Note:There's an important downside to consider when using shard aware aggregation: records get assigned to a shard (via an explicit hash key) on the client side, but respective client side state can't be guaranteed to always be up-to-date. If a shard gets split, all aggregates are mapped to the lower child shard until state is refreshed. Timing, however, will diverge between the different workers.
If using an KinesisPartitioner.ExplicitPartitioner or disabling shard refresh via KinesisIO.RecordAggregation, no shard details will be loaded (and used).
 
Record aggregation can be entirely disabled using KinesisIO.Write.withRecordAggregationDisabled().
 
AWS clients for all AWS IOs can be configured using AwsOptions, e.g. --awsRegion=us-west-1. AwsOptions contain reasonable defaults based on default providers
 for Region and AwsCredentialsProvider.
 
If you require more advanced configuration, you may change the ClientBuilderFactory
 using AwsOptions.setClientBuilderFactory(Class).
 
Configuration for a specific IO can be overwritten using withClientConfiguration(),
 which also allows to configure the retry behavior for the respective IO.
 
Retries for failed requests can be configured using ClientConfiguration.Builder#retry(Consumer) and are handled by the AWS SDK unless there's a
 partial success (batch requests). The SDK uses a backoff strategy with equal jitter for computing
 the delay before the next retry.
 
Note: Once retries are exhausted the error is surfaced to the runner which may then opt to retry the current partition in entirety or abort if the max number of retries of the runner is reached.
| Modifier and Type | Class and Description | 
|---|---|
| static class  | KinesisIO.ReadImplementation of  read(). | 
| static class  | KinesisIO.RecordAggregationConfiguration of Kinesis record aggregation. | 
| static class  | KinesisIO.Write<T>Implementation of  write(). | 
| Constructor and Description | 
|---|
| KinesisIO() | 
| Modifier and Type | Method and Description | 
|---|---|
| static KinesisIO.Read | read()Returns a new  KinesisIO.Readtransform for reading from Kinesis. | 
| static <T> KinesisIO.Write<T> | write()Returns a new  KinesisIO.Writetransform for writing to Kinesis. | 
public static KinesisIO.Read read()
KinesisIO.Read transform for reading from Kinesis.public static <T> KinesisIO.Write<T> write()
KinesisIO.Write transform for writing to Kinesis.