@DefaultAnnotation(value=org.checkerframework.checker.nullness.qual.NonNull.class)

Interface | Description |
---|---|
CompressedSource.DecompressingChannelFactory | Factory interface for creating channels that decompress the content of an underlying channel. |
FileBasedSink.OutputFileHints | Provides hints about how to generate output files, such as a suggested filename suffix. |
FileBasedSink.WritableByteChannelFactory | Implementations create instances of WritableByteChannel used by FileBasedSink and related classes to allow decorating, or otherwise transforming, the raw data that would normally be written directly to the WritableByteChannel passed into FileBasedSink.WritableByteChannelFactory.create(WritableByteChannel). |
FileIO.Sink<ElementT> | Specifies how to write elements to individual files in FileIO.write() and FileIO.writeDynamic(). |
FileIO.Write.FileNaming | A policy for generating names for shard files. |
FileSystemRegistrar | A registrar that creates FileSystem instances from PipelineOptions. |
ShardingFunction<UserT,DestinationT> | Function for assigning ShardedKeys to input elements for sharded WriteFiles. |
TextRowCountEstimator.SamplingStrategy | A sampling strategy determines when to stop reading further files. |
UnboundedSource.CheckpointMark | A marker representing the progress and state of an UnboundedSource.UnboundedReader. |
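
The "decorating" role described for FileBasedSink.WritableByteChannelFactory can be illustrated with plain JDK classes. The sketch below is not Beam code: `ChannelDecorator` and `GzipChannelSketch` are hypothetical names, and the only assumption is that a decorator receives the raw channel and returns a channel that transforms the bytes (here, gzip compression) before they reach it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipChannelSketch {
  // Hypothetical stand-in for a channel-decorating factory; not a Beam type.
  interface ChannelDecorator {
    WritableByteChannel create(WritableByteChannel raw) throws IOException;
  }

  // Wraps the raw channel so that everything written to the result is gzip-compressed.
  static final ChannelDecorator GZIP =
      raw -> Channels.newChannel(new GZIPOutputStream(Channels.newOutputStream(raw)));

  // Writes text through the decorated channel and returns the bytes that reached the raw channel.
  static byte[] writeCompressed(String text) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (WritableByteChannel channel = GZIP.create(Channels.newChannel(sink))) {
      ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
      while (buffer.hasRemaining()) {
        channel.write(buffer);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The channel is closed here, so the gzip trailer has been written.
    return sink.toByteArray();
  }

  // Decompresses gzip bytes back to text, to check the round trip.
  static String decompress(byte[] gzipped) {
    try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] compressed = writeCompressed("hello\nworld\n");
    System.out.println(decompress(compressed).equals("hello\nworld\n"));
  }
}
```

The caller keeps writing to a WritableByteChannel and never sees the compression step, which is the point of placing the transformation behind a factory.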

Class | Description |
---|---|
BlockBasedSource<T> | A BlockBasedSource is a FileBasedSource where a file consists of blocks of records. |
BlockBasedSource.Block<T> | A Block represents a block of records that can be read. |
BlockBasedSource.BlockBasedReader<T> | A Reader that reads records from a BlockBasedSource. |
BoundedReadFromUnboundedSource<T> | PTransform that reads a bounded amount of data from an UnboundedSource, specified as one or both of a maximum number of elements or a maximum period of time to read. |
BoundedSource<T> | A Source that reads a finite amount of input and, because of that, supports some additional operations. |
BoundedSource.BoundedReader<T> | A Reader that reads a bounded amount of input and supports some additional operations, such as progress estimation and dynamic work rebalancing. |
ClassLoaderFileSystem | A read-only FileSystem implementation looking up resources using a ClassLoader. |
ClassLoaderFileSystem.ClassLoaderFileSystemRegistrar | AutoService registrar for the ClassLoaderFileSystem. |
ClassLoaderFileSystem.ClassLoaderResourceId | |
CompressedSource<T> | A Source that reads from compressed files. |
CompressedSource.CompressedReader<T> | Reader for a CompressedSource. |
CountingSource | Most users should use GenerateSequence instead. |
CountingSource.CounterMark | The checkpoint for an unbounded CountingSource is simply the last value produced. |
CountingSource.CounterMarkCoder | A custom coder for CounterMark. |
DefaultFilenamePolicy | A default FileBasedSink.FilenamePolicy for windowed and unwindowed files. |
DefaultFilenamePolicy.Params | Encapsulates constructor parameters to DefaultFilenamePolicy. |
DefaultFilenamePolicy.ParamsCoder | A Coder for DefaultFilenamePolicy.Params. |
DynamicFileDestinations | Some helper classes that derive from FileBasedSink.DynamicDestinations. |
FileBasedSink<UserT,DestinationT,OutputT> | Abstract class for file-based output. |
FileBasedSink.DynamicDestinations<UserT,DestinationT,OutputT> | A class that allows value-dependent writes in FileBasedSink. |
FileBasedSink.FilenamePolicy | A naming policy for output files. |
FileBasedSink.FileResult<DestinationT> | Result of a single bundle write. |
FileBasedSink.FileResultCoder<DestinationT> | A coder for FileBasedSink.FileResult objects. |
FileBasedSink.WriteOperation<DestinationT,OutputT> | Abstract operation that manages the process of writing to FileBasedSink. |
FileBasedSink.Writer<DestinationT,OutputT> | Abstract writer that writes a bundle to a FileBasedSink. |
FileBasedSource<T> | A common base class for all file-based Sources. |
FileBasedSource.FileBasedReader<T> | A reader that implements code common to readers of FileBasedSources. |
FileIO | General-purpose transforms for working with files: listing files (matching), reading and writing. |
FileIO.Match | Implementation of FileIO.match(). |
FileIO.MatchAll | Implementation of FileIO.matchAll(). |
FileIO.MatchConfiguration | Describes configuration for matching filepatterns, such as EmptyMatchTreatment and continuous watching for matching files. |
FileIO.ReadableFile | A utility class for accessing a potentially compressed file. |
FileIO.ReadMatches | Implementation of FileIO.readMatches(). |
FileIO.Write<DestinationT,UserT> | Implementation of FileIO.write() and FileIO.writeDynamic(). |
FileSystem<ResourceIdT extends ResourceId> | File system interface in Beam. |
FileSystems | Client-facing FileSystem utility. |
FileSystemUtils | |
GenerateSequence | A PTransform that produces longs starting from the given value, and either up to the given limit or until Long.MAX_VALUE / until the given time elapses. |
GenerateSequence.External | Exposes GenerateSequence as an external transform for cross-language usage. |
GenerateSequence.External.ExternalConfiguration | Parameters class to expose the transform to an external SDK. |
LocalFileSystemRegistrar | AutoService registrar for the LocalFileSystem. |
LocalResources | Helper functions for producing a ResourceId that references a local file or directory. |
OffsetBasedSource<T> | A BoundedSource that uses offsets to define starting and ending positions. |
OffsetBasedSource.OffsetBasedReader<T> | A Source.Reader that implements code common to readers of all OffsetBasedSources. |
Read | A PTransform for reading from a Source. |
Read.Bounded<T> | PTransform that reads from a BoundedSource. |
Read.Builder | Helper class for building Read transforms. |
Read.Unbounded<T> | PTransform that reads from an UnboundedSource. |
ReadableFileCoder | A Coder for FileIO.ReadableFile. |
ReadAllViaFileBasedSource<T> | Reads each file in the input PCollection of FileIO.ReadableFile using given parameters for splitting files into offset ranges and for creating a FileBasedSource for a file. |
ReadAllViaFileBasedSource.ReadFileRangesFnExceptionHandler | A class to handle errors which occur during file reads. |
ReadAllViaFileBasedSourceTransform<InT,T> | |
ReadAllViaFileBasedSourceTransform.AbstractReadFileRangesFn<InT,T> | |
ReadAllViaFileBasedSourceTransform.SplitIntoRangesFn | |
ReadAllViaFileBasedSourceWithFilename<T> | Reads each file of the input PCollection and outputs each element as the value of a KV, where the key is the filename from which that value came. |
ShardNameTemplate | Standard shard naming templates. |
Source<T> | Base class for defining input formats and creating a Source for reading the input. |
Source.Reader<T> | The interface that readers of custom input sources must implement. |
TextIO | PTransforms for reading and writing text files. |
TextIO.Read | Implementation of TextIO.read(). |
TextIO.ReadAll | Deprecated. See TextIO.readAll() for details. |
TextIO.ReadFiles | Implementation of TextIO.readFiles(). |
TextIO.Sink | Implementation of TextIO.sink(). |
TextIO.TypedWrite<UserT,DestinationT> | Implementation of TextIO.write(). |
TextIO.Write | This class is used as the default return value of TextIO.write(). |
TextRowCountEstimator | Returns a row count estimate for files associated with a file pattern. |
TextRowCountEstimator.Builder | Builder for TextRowCountEstimator. |
TextRowCountEstimator.LimitNumberOfFiles | This strategy stops sampling once enough files have been sampled. |
TextRowCountEstimator.LimitNumberOfTotalBytes | This strategy stops sampling when the total number of sampled bytes exceeds a threshold. |
TextRowCountEstimator.SampleAllFiles | This strategy samples all the files. |
TextSource | Implementation detail of TextIO.Read. |
TFRecordIO | PTransforms for reading and writing TensorFlow TFRecord files. |
TFRecordIO.Read | Implementation of TFRecordIO.read(). |
TFRecordIO.ReadFiles | Implementation of TFRecordIO.readFiles(). |
TFRecordIO.Sink | |
TFRecordIO.Write | Implementation of TFRecordIO.write(). |
UnboundedSource<OutputT,CheckpointMarkT extends UnboundedSource.CheckpointMark> | A Source that reads an unbounded amount of input and, because of that, supports some additional operations such as checkpointing, watermarks, and record ids. |
UnboundedSource.CheckpointMark.NoopCheckpointMark | A checkpoint mark that does nothing when finalized. |
UnboundedSource.UnboundedReader<OutputT> | A Reader that reads an unbounded amount of input. |
WriteFiles<UserT,DestinationT,OutputT> | A PTransform that writes to a FileBasedSink. |
WriteFilesResult<DestinationT> | The result of a WriteFiles transform. |
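
Several entries above (OffsetBasedSource, ReadAllViaFileBasedSource) rely on describing a file as ranges of offsets that can be read independently. As a rough illustration of that idea only, the sketch below cuts a half-open range into consecutive chunks of a desired size. `OffsetRangeSketch`, `Range`, and `split` are hypothetical names, and Beam's actual splitting logic may differ (for example in how it treats record boundaries or a small trailing range).

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetRangeSketch {
  // A half-open range [start, end) of offsets. Hypothetical type, not Beam's.
  static final class Range {
    final long start;
    final long end;

    Range(long start, long end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public String toString() {
      return "[" + start + ", " + end + ")";
    }
  }

  // Splits [start, end) into consecutive ranges of at most desiredBytes each.
  static List<Range> split(long start, long end, long desiredBytes) {
    if (desiredBytes <= 0) {
      throw new IllegalArgumentException("desiredBytes must be positive");
    }
    List<Range> ranges = new ArrayList<>();
    for (long s = start; s < end; s += desiredBytes) {
      // The last range is clipped so it never extends past the end offset.
      ranges.add(new Range(s, Math.min(s + desiredBytes, end)));
    }
    return ranges;
  }

  public static void main(String[] args) {
    System.out.println(split(0, 10, 4));
  }
}
```

Because the ranges are contiguous and non-overlapping, each one can be handed to a separate reader without any offset being read twice or skipped.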

Enum | Description |
---|---|
CompressedSource.CompressionMode | Deprecated. Use Compression instead. |
Compression | Various compression types for reading/writing files. |
FileBasedSink.CompressionType | Deprecated. Use Compression. |
FileBasedSource.Mode | A given FileBasedSource represents a file resource of one of these types. |
FileIO.ReadMatches.DirectoryTreatment | Enum to control how directories are handled. |
TextIO.CompressionType | Deprecated. Use Compression. |
TFRecordIO.CompressionType | Deprecated. Use Compression. |

Exception | Description |
---|---|
TextRowCountEstimator.NoEstimationException | An exception that will be thrown if the estimator cannot get an estimation of the number of lines. |
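
To make the relationship between the sampling strategies and the "cannot get an estimation" case concrete, here is a simplified sketch in plain Java. Everything in it is an assumption for illustration: `RowCountSketch`, `StopPolicy`, and the extrapolation formula (total bytes multiplied by sampled lines per sampled byte) are hypothetical, file contents are replaced by precomputed sizes and line counts, and an IllegalStateException stands in for TextRowCountEstimator.NoEstimationException.

```java
final class RowCountSketch {
  // Hypothetical stand-in for a "stop sampling" policy; not Beam's interface.
  interface StopPolicy {
    boolean shouldStop(int filesSampled, long bytesSampled);
  }

  // Stops once the given number of files has been sampled.
  static StopPolicy limitFiles(int maxFiles) {
    return (files, bytes) -> files >= maxFiles;
  }

  // Stops once at least the given number of bytes has been sampled.
  static StopPolicy limitBytes(long maxBytes) {
    return (files, bytes) -> bytes >= maxBytes;
  }

  // fileSizes[i] and fileLines[i] describe file i. A real estimator would
  // obtain the line counts by actually reading the sampled files.
  static double estimateRows(long[] fileSizes, long[] fileLines, StopPolicy policy) {
    long totalBytes = 0;
    for (long size : fileSizes) {
      totalBytes += size;
    }
    long sampledBytes = 0;
    long sampledLines = 0;
    int sampledFiles = 0;
    for (int i = 0; i < fileSizes.length; i++) {
      if (policy.shouldStop(sampledFiles, sampledBytes)) {
        break;
      }
      sampledBytes += fileSizes[i];
      sampledLines += fileLines[i];
      sampledFiles++;
    }
    if (sampledLines == 0 || sampledBytes == 0) {
      // Nothing usable was sampled, so no extrapolation is possible.
      throw new IllegalStateException("no lines sampled; cannot estimate");
    }
    // Extrapolate the sampled lines-per-byte ratio to all matched bytes.
    return (double) totalBytes * sampledLines / sampledBytes;
  }

  public static void main(String[] args) {
    long[] sizes = {100, 100, 200};
    long[] lines = {10, 10, 20};
    System.out.println(estimateRows(sizes, lines, limitFiles(1)));
  }
}
```

A stricter policy reads less data but bases the estimate on fewer files, so the choice of strategy is a trade-off between sampling cost and accuracy.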

Defines transforms for reading and writing common storage formats, including org.apache.beam.sdk.io.AvroIO and TextIO.

The classes in this package provide Read transforms that create PCollections from existing storage:

    PCollection<TableRow> inputData = pipeline.apply(
        BigQueryIO.readTableRows().from("apache-beam-testing.samples.weather_stations"));

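and Write transforms that persist PCollections to external storage (TextIO.write() expects String elements, so the integers are converted first):

    PCollection<Integer> numbers = ...;
    numbers
        .apply(MapElements.into(TypeDescriptors.strings()).via((Integer n) -> n.toString()))
        .apply(TextIO.write().to("gs://my_bucket/path/to/numbers"));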
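
The path passed to `to(...)` in the write example is a filename prefix, not a single file: the output is written as several shard files. The sketch below shows how shard names could be derived from such a prefix, assuming the commonly documented `-SSSSS-of-NNNNN` shard template (see ShardNameTemplate). `ShardNameSketch`, `shardNames`, and the `.txt` suffix are hypothetical and only illustrate the naming pattern, not Beam's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ShardNameSketch {
  // Builds names of the form <prefix>-SSSSS-of-NNNNN<suffix>, where SSSSS is the
  // zero-padded shard index and NNNNN is the zero-padded total shard count.
  static List<String> shardNames(String prefix, String suffix, int numShards) {
    List<String> names = new ArrayList<>();
    for (int shard = 0; shard < numShards; shard++) {
      // Locale.ROOT keeps the digits ASCII regardless of the default locale.
      names.add(String.format(Locale.ROOT, "%s-%05d-of-%05d%s", prefix, shard, numShards, suffix));
    }
    return names;
  }

  public static void main(String[] args) {
    for (String name : shardNames("gs://my_bucket/path/to/numbers", ".txt", 3)) {
      System.out.println(name);
    }
  }
}
```

With three shards, the first name produced by this sketch is `gs://my_bucket/path/to/numbers-00000-of-00003.txt`.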