Easy to use Java 8 DSL for the Beam Java SDK. Provides a high-level abstraction of Beam transformations, which is both easy to read and write. Can be used as a complement to existing Beam pipelines (convertible back and forth). You can have a glimpse of the API at WordCount example.
“Salted” join implementation
Implementation of a join, that can handle large scale join of highly skewed data sets. This implementation breaks the large keys into multiple splits, using key distribution approximated by count min sketch data structure.
In order to pick the right translation for the operator without user interference, we can leverage knowledge from previous pipeline runs. We want to provide a convenient and portable way to gather this knowledge.
Implementation of an easy to use Fluent API on top of Euphoria DSL.
An convenient API for multiple outputs.
Introduce API for converting streams to tables (KStream <-> KTable approach) and various types of (windowed and unwindowed) joins on them.