Class HadoopFormatIO.Read<K,V>

java.lang.Object
org.apache.beam.sdk.transforms.PTransform<PBegin,PCollection<KV<K,V>>>
org.apache.beam.sdk.io.hadoop.format.HadoopFormatIO.Read<K,V>
Type Parameters:
K - Type of keys to be read.
V - Type of values to be read.
All Implemented Interfaces:
Serializable, HasDisplayData
Enclosing class:
HadoopFormatIO

public abstract static class HadoopFormatIO.Read<K,V> extends PTransform<PBegin,PCollection<KV<K,V>>>
A PTransform that reads from any data source that implements Hadoop InputFormat, for example Cassandra, Elasticsearch, HBase, Redis, or Postgres. See the class-level Javadoc on HadoopFormatIO for more information.
  • Constructor Details

    • Read

      public Read()
  • Method Details

    • getConfiguration

      public abstract @Nullable SerializableConfiguration getConfiguration()
    • getKeyTranslationFunction

      public abstract @Nullable SimpleFunction<?,K> getKeyTranslationFunction()
    • getValueTranslationFunction

      public abstract @Nullable SimpleFunction<?,V> getValueTranslationFunction()
    • getKeyTypeDescriptor

      public abstract @Nullable TypeDescriptor<K> getKeyTypeDescriptor()
    • getKeyCoder

      public abstract @Nullable Coder<K> getKeyCoder()
    • getValueTypeDescriptor

      public abstract @Nullable TypeDescriptor<V> getValueTypeDescriptor()
    • getValueCoder

      public abstract @Nullable Coder<V> getValueCoder()
    • getSkipKeyClone

      public abstract @Nullable Boolean getSkipKeyClone()
    • getSkipValueClone

      public abstract @Nullable Boolean getSkipValueClone()
    • getinputFormatClass

      public abstract @Nullable TypeDescriptor<?> getinputFormatClass()
    • getinputFormatKeyClass

      public abstract @Nullable TypeDescriptor<?> getinputFormatKeyClass()
    • getinputFormatValueClass

      public abstract @Nullable TypeDescriptor<?> getinputFormatValueClass()
    • toBuilder

      public abstract org.apache.beam.sdk.io.hadoop.format.HadoopFormatIO.Read.Builder<K,V> toBuilder()
    • withConfiguration

      public HadoopFormatIO.Read<K,V> withConfiguration(org.apache.hadoop.conf.Configuration configuration)
      Reads from the source using the options provided by the given configuration.
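      As a rough illustration of what the configuration is expected to carry: the class-level HadoopFormatIO documentation names three properties (the InputFormat class, the key class, and the value class). The sketch below models the configuration as a plain string map rather than a real org.apache.hadoop.conf.Configuration; ReadConfigSketch, missingKeys, exampleConf, and com.example.MyInputFormat are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Hadoop Configuration: a plain string map.
class ReadConfigSketch {
    // Property names as documented at the HadoopFormatIO class level.
    static final String INPUT_FORMAT = "mapreduce.job.inputformat.class";
    static final String KEY_CLASS = "key.class";
    static final String VALUE_CLASS = "value.class";

    /** Returns the required properties that are absent from the given map. */
    static List<String> missingKeys(Map<String, String> conf) {
        List<String> missing = new ArrayList<>();
        for (String key : new String[] {INPUT_FORMAT, KEY_CLASS, VALUE_CLASS}) {
            if (!conf.containsKey(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    /** Builds an example configuration; the InputFormat class name is a placeholder. */
    static Map<String, String> exampleConf() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(INPUT_FORMAT, "com.example.MyInputFormat"); // hypothetical InputFormat
        conf.put(KEY_CLASS, "org.apache.hadoop.io.Text");
        conf.put(VALUE_CLASS, "org.apache.hadoop.io.Text");
        return conf;
    }
}
```

      With the real API, the same three properties would typically be set on a Hadoop Configuration (for instance through its setClass method) before the object is passed to withConfiguration.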
    • withKeyTranslation

      public HadoopFormatIO.Read<K,V> withKeyTranslation(SimpleFunction<?,K> function)
      Transforms the keys read from the source using the given key translation function.
    • withKeyTranslation

      public HadoopFormatIO.Read<K,V> withKeyTranslation(SimpleFunction<?,K> function, Coder<K> coder)
      Transforms the keys read from the source using the given key translation function, and uses the given coder for the translated keys.
    • withValueTranslation

      public HadoopFormatIO.Read<K,V> withValueTranslation(SimpleFunction<?,V> function)
      Transforms the values read from the source using the given value translation function.
    • withValueTranslation

      public HadoopFormatIO.Read<K,V> withValueTranslation(SimpleFunction<?,V> function, Coder<V> coder)
      Transforms the values read from the source using the given value translation function, and uses the given coder for the translated values.
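      The idea behind the translation functions can be sketched without Beam: each record produced by the InputFormat has its key and value mapped to the target types K and V. The sketch below uses java.util.function.Function in place of SimpleFunction and Map.Entry in place of KV; TranslationSketch and translate are hypothetical names, and the real transform may apply only one of the two functions.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.function.Function;

class TranslationSketch {
    /**
     * Maps one source record (K0, V0) to a translated record (K, V).
     * Function stands in for Beam's SimpleFunction; Map.Entry stands in for KV.
     */
    static <K0, V0, K, V> Map.Entry<K, V> translate(
            Map.Entry<K0, V0> record,
            Function<K0, K> keyFn,
            Function<V0, V> valueFn) {
        K key = keyFn.apply(record.getKey());
        V value = valueFn.apply(record.getValue());
        return new SimpleEntry<>(key, value);
    }
}
```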
    • withSkipKeyClone

      public HadoopFormatIO.Read<K,V> withSkipKeyClone(boolean value)
      Sets whether cloning of keys is skipped (default is 'false', meaning keys are cloned).
    • withSkipValueClone

      public HadoopFormatIO.Read<K,V> withSkipValueClone(boolean value)
      Sets whether cloning of values is skipped (default is 'false', meaning values are cloned).
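      To see why cloning can matter, consider a reader that reuses one mutable object for every record, as many Hadoop record readers do. The sketch below is a model only and uses no Beam or Hadoop classes; CloneSketch and readAll are hypothetical names, and StringBuilder stands in for a mutable key or value type.

```java
import java.util.ArrayList;
import java.util.List;

class CloneSketch {
    /**
     * Hypothetical reader loop that overwrites a single mutable instance per record.
     * With skipClone == false each emitted element is a defensive copy;
     * with skipClone == true every element is the same shared instance.
     */
    static List<StringBuilder> readAll(String[] records, boolean skipClone) {
        StringBuilder reused = new StringBuilder();
        List<StringBuilder> out = new ArrayList<>();
        for (String record : records) {
            reused.setLength(0);   // the reader overwrites the same object
            reused.append(record);
            out.add(skipClone ? reused : new StringBuilder(reused));
        }
        return out;
    }
}
```

      In this model, skipping the clone is only safe when the source hands out a fresh or immutable object for each record; otherwise earlier elements silently take on the contents of the last record read.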
    • expand

      public PCollection<KV<K,V>> expand(PBegin input)
      Description copied from class: PTransform
      Override this method to specify how this PTransform should be expanded on the given InputT.

      NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.

      Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).

      Specified by:
      expand in class PTransform<PBegin,PCollection<KV<K,V>>>
    • validateTransform

      public void validateTransform()
      Validates construction of this transform.
    • getDefaultCoder

      public <T> Coder<T> getDefaultCoder(TypeDescriptor<?> typeDesc, CoderRegistry coderRegistry)
      Returns the default coder for the given type descriptor. The coder registry is queried first. If no coder is found there and the type descriptor is of a Writable type, a WritableCoder is returned; otherwise an exception is thrown with the message "Cannot find coder".
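      The lookup order described above can be sketched as a three-step fallback. This is a model under stated assumptions, not the library's code: DefaultCoderSketch, its nested Writable marker interface, IntBox, and the string coder names are all hypothetical stand-ins, and the exception type is a guess since the description only gives the message.

```java
import java.util.HashMap;
import java.util.Map;

class DefaultCoderSketch {
    /** Hypothetical marker standing in for org.apache.hadoop.io.Writable. */
    interface Writable {}

    /** Hypothetical Writable type with no registry entry. */
    static final class IntBox implements Writable {}

    /** Hypothetical registry mapping a type to the name of its coder. */
    static final Map<Class<?>, String> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put(String.class, "StringUtf8Coder");
    }

    static String defaultCoderFor(Class<?> type) {
        // Step 1: ask the registry.
        String registered = REGISTRY.get(type);
        if (registered != null) {
            return registered;
        }
        // Step 2: fall back to the Writable coder for Writable types.
        if (Writable.class.isAssignableFrom(type)) {
            return "WritableCoder";
        }
        // Step 3: give up (exception type assumed for this sketch).
        throw new IllegalStateException("Cannot find coder for " + type.getName());
    }
}
```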