T - The type to read from the compressed file.

@Experimental(value=SOURCE_SINK)
public class CompressedSource<T> extends FileBasedSource<T>
A CompressedSource wraps a delegate FileBasedSource that is able to read the decompressed file format.
 For example, use the following to read from a gzip-compressed file-based source:
 FileBasedSource<T> mySource = ...;
 PCollection<T> collection = p.apply(Read.from(CompressedSource
     .from(mySource)
     .withCompression(Compression.GZIP)));
 Supported compression algorithms are Compression.GZIP, Compression.BZIP2,
 Compression.ZIP and Compression.DEFLATE. User-defined compression types are
 supported by implementing a CompressedSource.DecompressingChannelFactory.
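
The channel-wrapping idea behind a user-defined decompression type can be sketched with the JDK alone. In this sketch, `ChannelFactory` is a hypothetical stand-in for CompressedSource.DecompressingChannelFactory, and its method name `createDecompressingChannel` is an assumption about the shape of the real interface; check the signature in your SDK version before adapting it. The helper names `gzip` and `roundTrip` are also illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class DecompressingChannelSketch {
    /** Hypothetical stand-in for the factory interface; the method name is an assumption. */
    public interface ChannelFactory extends Serializable {
        ReadableByteChannel createDecompressingChannel(ReadableByteChannel channel) throws IOException;
    }

    /** Wraps the raw channel so that reads return gzip-decoded bytes. */
    public static class GzipChannelFactory implements ChannelFactory {
        @Override
        public ReadableByteChannel createDecompressingChannel(ReadableByteChannel channel)
                throws IOException {
            return Channels.newChannel(new GZIPInputStream(Channels.newInputStream(channel)));
        }
    }

    /** Gzips a string in memory so the example needs no files. */
    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    /** Compresses the text, reads it back through the factory, and returns the decoded text. */
    public static String roundTrip(String text) {
        try {
            ReadableByteChannel raw = Channels.newChannel(new ByteArrayInputStream(gzip(text)));
            ReadableByteChannel plain = new GzipChannelFactory().createDecompressingChannel(raw);
            ByteArrayOutputStream decoded = new ByteArrayOutputStream();
            try (InputStream in = Channels.newInputStream(plain)) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    decoded.write(buffer, 0, n);
                }
            }
            return new String(decoded.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello, compressed world"));
    }
}
```

A factory of this shape would be the kind of object passed to withDecompression; a real one would substitute its own codec for the JDK gzip stream.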
 
By default, the compression algorithm is selected from those supported in Compression
 based on the file name provided to the source, namely ".bz2" indicates Compression.BZIP2, ".gz" indicates Compression.GZIP, ".zip" indicates
 Compression.ZIP and ".deflate" indicates Compression.DEFLATE. If the file
 name does not match any of the supported algorithms, it is assumed to be uncompressed data.
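
The file-name rule described above can be illustrated with a small standalone sketch. This is not Beam's implementation: the `Mode` enum and the `detect` method are hypothetical names, and the sketch compares suffixes exactly as written, since the text does not say whether the real matching ignores case.

```java
public class ExtensionDetectionSketch {
    /** Hypothetical stand-in for the compression types named in the text. */
    public enum Mode { BZIP2, GZIP, ZIP, DEFLATE, UNCOMPRESSED }

    /** Picks a mode from the file name suffix; unknown suffixes are treated as uncompressed. */
    public static Mode detect(String fileName) {
        if (fileName.endsWith(".bz2")) {
            return Mode.BZIP2;
        }
        if (fileName.endsWith(".gz")) {
            return Mode.GZIP;
        }
        if (fileName.endsWith(".zip")) {
            return Mode.ZIP;
        }
        if (fileName.endsWith(".deflate")) {
            return Mode.DEFLATE;
        }
        return Mode.UNCOMPRESSED;
    }

    public static void main(String[] args) {
        // Print the detected mode for a few sample names.
        String[] names = {"part-0001.gz", "archive.bz2", "bundle.zip", "raw.deflate", "notes.txt"};
        for (String name : names) {
            System.out.println(name + " : " + detect(name));
        }
    }
}
```

To bypass name-based selection, configure the source explicitly with withCompression or withDecompression.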
| Modifier and Type | Class and Description | 
|---|---|
| static class | CompressedSource.CompressedReader<T>: Reader for a CompressedSource. |
| static class | CompressedSource.CompressionMode: Deprecated. Use Compression instead. |
| static interface | CompressedSource.DecompressingChannelFactory: Factory interface for creating channels that decompress the content of an underlying channel. |

Inherited nested classes: FileBasedSource.FileBasedReader<T>, FileBasedSource.Mode, OffsetBasedSource.OffsetBasedReader<T>, BoundedSource.BoundedReader<T>, Source.Reader<T>

| Modifier and Type | Method and Description |
|---|---|
| protected FileBasedSource<T> | createForSubrangeOfFile(MatchResult.Metadata metadata, long start, long end): Creates a CompressedSource for a subrange of a file. |
| protected FileBasedSource.FileBasedReader<T> | createSingleFileReader(PipelineOptions options): Creates a FileBasedReader to read a single file. |
| static <T> CompressedSource<T> | from(FileBasedSource<T> sourceDelegate): Creates a CompressedSource from an underlying FileBasedSource. |
| CompressedSource.DecompressingChannelFactory | getChannelFactory() |
| Coder<T> | getOutputCoder(): Returns the delegate source's output coder. |
| protected boolean | isSplittable(): Determines whether a single file represented by this source is splittable. |
| void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
| void | validate(): Validates that the delegate source is a valid source and that the channel factory is not null. |
| CompressedSource<T> | withCompression(Compression compression): Like withDecompression(org.apache.beam.sdk.io.CompressedSource.DecompressingChannelFactory) but takes a canonical Compression. |
| CompressedSource<T> | withDecompression(CompressedSource.DecompressingChannelFactory channelFactory): Returns a CompressedSource that is like this one but will decompress its underlying file with the given CompressedSource.DecompressingChannelFactory. |

Methods inherited from class FileBasedSource: createReader, createSourceForSubrange, getEmptyMatchTreatment, getEstimatedSizeBytes, getFileOrPatternSpec, getFileOrPatternSpecProvider, getMaxEndOffset, getMode, getSingleFileMetadata, split, toString
Methods inherited from class OffsetBasedSource: getBytesPerOffset, getEndOffset, getMinBundleSize, getStartOffset
Methods inherited from class Source: getDefaultOutputCoder

public static <T> CompressedSource<T> from(FileBasedSource<T> sourceDelegate)
Creates a CompressedSource from an underlying FileBasedSource. The type of compression used will be based on the file name extension unless explicitly configured via withDecompression(org.apache.beam.sdk.io.CompressedSource.DecompressingChannelFactory).

public CompressedSource<T> withDecompression(CompressedSource.DecompressingChannelFactory channelFactory)
Returns a CompressedSource that is like this one but will decompress its underlying file with the given CompressedSource.DecompressingChannelFactory.

public CompressedSource<T> withCompression(Compression compression)
Like withDecompression(org.apache.beam.sdk.io.CompressedSource.DecompressingChannelFactory) but takes a canonical Compression.

public void validate()
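Validates that the delegate source is a valid source and that the channel factory is not null.
Overrides: validate in class FileBasedSource<T>

protected FileBasedSource<T> createForSubrangeOfFile(MatchResult.Metadata metadata, long start, long end)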
Creates a CompressedSource for a subrange of a file. Called by superclass to create a source for a single file.
Specified by: createForSubrangeOfFile in class FileBasedSource<T>
Parameters:
metadata - file backing the new FileBasedSource.
start - starting byte offset of the new FileBasedSource.
end - ending byte offset of the new FileBasedSource. May be Long.MAX_VALUE, in which case it will be inferred using FileBasedSource.getMaxEndOffset(org.apache.beam.sdk.options.PipelineOptions).

protected final boolean isSplittable()
Determines whether a single file represented by this source is splittable.
Overrides: isSplittable in class FileBasedSource<T>

protected final FileBasedSource.FileBasedReader<T> createSingleFileReader(PipelineOptions options)
Creates a FileBasedReader to read a single file.

Uses the delegate source to create a single-file reader. If the file name does not indicate a compressed file, the default decompression channel factory does not wrap the delegate's reader, which allows the source to be split.
Specified by: createSingleFileReader in class FileBasedSource<T>

public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: Source
Register display data for the given transform or component.

populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.

By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Overrides: populateDisplayData in class FileBasedSource<T>
Parameters:
builder - The builder to populate with display data.
See Also: HasDisplayData

public final Coder<T> getOutputCoder()
Returns the delegate source's output coder.
Overrides: getOutputCoder in class Source<T>

public final CompressedSource.DecompressingChannelFactory getChannelFactory()