public abstract static class TextIO.TypedWrite<UserT,DestinationT> extends PTransform<PCollection<UserT>,WriteFilesResult<DestinationT>>
TextIO.write().annotations, displayData, name, resourceHints| Constructor and Description |
|---|
TypedWrite() |
| Modifier and Type | Method and Description |
|---|---|
WriteFilesResult<DestinationT> |
expand(PCollection<UserT> input)
Override this method to specify how this
PTransform should be expanded on the given
InputT. |
void |
populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.
|
TextIO.TypedWrite<UserT,DestinationT> |
skipIfEmpty()
Don't write any output files if the PCollection is empty.
|
<NewDestinationT> |
to(FileBasedSink.DynamicDestinations<UserT,NewDestinationT,java.lang.String> dynamicDestinations)
Deprecated.
Use
FileIO.write() or FileIO.writeDynamic() with TextIO.sink()
instead. |
TextIO.TypedWrite<UserT,DestinationT> |
to(FileBasedSink.FilenamePolicy filenamePolicy)
Writes to files named according to the given
FileBasedSink.FilenamePolicy. |
TextIO.TypedWrite<UserT,DestinationT> |
to(ResourceId filenamePrefix)
Like
to(String). |
TextIO.TypedWrite<UserT,DefaultFilenamePolicy.Params> |
to(SerializableFunction<UserT,DefaultFilenamePolicy.Params> destinationFunction,
DefaultFilenamePolicy.Params emptyDestination)
Deprecated.
Use
FileIO.write() or FileIO.writeDynamic() with TextIO.sink()
instead. |
TextIO.TypedWrite<UserT,DestinationT> |
to(java.lang.String filenamePrefix)
Writes to text files with the given prefix.
|
TextIO.TypedWrite<UserT,DestinationT> |
to(ValueProvider<java.lang.String> outputPrefix)
Like
to(String). |
TextIO.TypedWrite<UserT,DestinationT> |
toResource(ValueProvider<ResourceId> filenamePrefix)
Like
to(ResourceId). |
TextIO.TypedWrite<UserT,DestinationT> |
withAutoSharding() |
TextIO.TypedWrite<UserT,DestinationT> |
withBadRecordErrorHandler(ErrorHandler<BadRecord,?> errorHandler)
See
FileIO.Write.withBadRecordErrorHandler(ErrorHandler) for details on usage. |
TextIO.TypedWrite<UserT,DestinationT> |
withBatchMaxBufferingDuration(@Nullable Duration batchMaxBufferingDuration)
Returns a new
TextIO.TypedWrite that will batch the input records using specified max
buffering duration. |
TextIO.TypedWrite<UserT,DestinationT> |
withBatchSize(@Nullable java.lang.Integer batchSize)
Returns a new
TextIO.TypedWrite that will batch the input records using specified batch
size. |
TextIO.TypedWrite<UserT,DestinationT> |
withBatchSizeBytes(@Nullable java.lang.Integer batchSizeBytes)
Returns a new
TextIO.TypedWrite that will batch the input records using specified batch size
in bytes. |
TextIO.TypedWrite<UserT,DestinationT> |
withCompression(Compression compression)
Returns a transform for writing to text files like this one but that compresses output using
the given
Compression. |
TextIO.TypedWrite<UserT,DestinationT> |
withDelimiter(char[] delimiter)
Specifies the delimiter after each string written.
|
TextIO.TypedWrite<UserT,DestinationT> |
withFooter(@Nullable java.lang.String footer)
Adds a footer string to each file.
|
TextIO.TypedWrite<UserT,DestinationT> |
withFormatFunction(@Nullable SerializableFunction<UserT,java.lang.String> formatFunction)
Deprecated.
Use
FileIO.write() or FileIO.writeDynamic() with TextIO.sink()
instead. |
TextIO.TypedWrite<UserT,DestinationT> |
withHeader(@Nullable java.lang.String header)
Adds a header string to each file.
|
TextIO.TypedWrite<UserT,DestinationT> |
withMaxNumWritersPerBundle(@Nullable java.lang.Integer maxNumWritersPerBundle)
Set the maximum number of writers created in a bundle before spilling to shuffle.
|
TextIO.TypedWrite<UserT,DestinationT> |
withNoSpilling()
|
TextIO.TypedWrite<UserT,DestinationT> |
withNumShards(int numShards)
Configures the number of output shards produced overall (when using unwindowed writes) or
per-window (when using windowed writes).
|
TextIO.TypedWrite<UserT,DestinationT> |
withNumShards(@Nullable ValueProvider<java.lang.Integer> numShards)
Like
withNumShards(int). |
TextIO.TypedWrite<UserT,DestinationT> |
withoutSharding()
Forces a single file as output and empty shard name template.
|
TextIO.TypedWrite<UserT,DestinationT> |
withShardNameTemplate(java.lang.String shardTemplate)
Uses the given
ShardNameTemplate for naming output files. |
TextIO.TypedWrite<UserT,DestinationT> |
withSuffix(java.lang.String filenameSuffix)
Configures the filename suffix for written files.
|
TextIO.TypedWrite<UserT,DestinationT> |
withTempDirectory(ResourceId tempDirectory)
Set the base directory used to generate temporary files.
|
TextIO.TypedWrite<UserT,DestinationT> |
withTempDirectory(ValueProvider<ResourceId> tempDirectory)
Set the base directory used to generate temporary files.
|
TextIO.TypedWrite<UserT,DestinationT> |
withWindowedWrites()
Preserves windowing of input elements and writes them to files based on the element's window.
|
TextIO.TypedWrite<UserT,DestinationT> |
withWritableByteChannelFactory(FileBasedSink.WritableByteChannelFactory writableByteChannelFactory)
Returns a transform for writing to text files like this one but that has the given
FileBasedSink.WritableByteChannelFactory to be used by the FileBasedSink during output. |
addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, setDisplayData, setResourceHints, toString, validate, validatepublic TextIO.TypedWrite<UserT,DestinationT> to(java.lang.String filenamePrefix)
prefix can reference any FileSystem on the classpath. This prefix is used by the DefaultFilenamePolicy to
generate filenames.
By default, a DefaultFilenamePolicy will be used built using the specified prefix
to define the base output directory and file prefix, a shard identifier (see withNumShards(int)), and a common suffix (if supplied using withSuffix(String)).
This default policy can be overridden using #to(FilenamePolicy), in which case
withShardNameTemplate(String) and withSuffix(String) should not be set.
Custom filename policies do not automatically see this prefix - you should explicitly pass
the prefix into your FileBasedSink.FilenamePolicy object if you need this.
If withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>) has not been called, this filename prefix will be used to
infer a directory for temporary files.
public TextIO.TypedWrite<UserT,DestinationT> to(ResourceId filenamePrefix)
to(String).public TextIO.TypedWrite<UserT,DestinationT> to(ValueProvider<java.lang.String> outputPrefix)
to(String).public TextIO.TypedWrite<UserT,DestinationT> to(FileBasedSink.FilenamePolicy filenamePolicy)
FileBasedSink.FilenamePolicy. A
directory for temporary files must be specified using withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>).@Deprecated public <NewDestinationT> TextIO.TypedWrite<UserT,NewDestinationT> to(FileBasedSink.DynamicDestinations<UserT,NewDestinationT,java.lang.String> dynamicDestinations)
FileIO.write() or FileIO.writeDynamic() with TextIO.sink()
instead.FileBasedSink.DynamicDestinations object to vend FileBasedSink.FilenamePolicy objects. These
objects can examine the input record when creating a FileBasedSink.FilenamePolicy. A directory for
temporary files must be specified using withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>).@Deprecated public TextIO.TypedWrite<UserT,DefaultFilenamePolicy.Params> to(SerializableFunction<UserT,DefaultFilenamePolicy.Params> destinationFunction, DefaultFilenamePolicy.Params emptyDestination)
FileIO.write() or FileIO.writeDynamic() with TextIO.sink()
instead.DefaultFilenamePolicy.Params object that specifies where the
records should be written (base filename, file suffix, and shard template). The
emptyDestination parameter specified where empty files should be written for when the written
PCollection is empty.public TextIO.TypedWrite<UserT,DestinationT> toResource(ValueProvider<ResourceId> filenamePrefix)
to(ResourceId).@Deprecated public TextIO.TypedWrite<UserT,DestinationT> withFormatFunction(@Nullable SerializableFunction<UserT,java.lang.String> formatFunction)
FileIO.write() or FileIO.writeDynamic() with TextIO.sink()
instead.UserT to the output type. If #to(DynamicDestinations) is used, FileBasedSink.DynamicDestinations.formatRecord(Object) must be
used instead.public TextIO.TypedWrite<UserT,DestinationT> withBatchSize(@Nullable java.lang.Integer batchSize)
TextIO.TypedWrite that will batch the input records using specified batch
size. The default value is WriteFiles.FILE_TRIGGERING_RECORD_COUNT.
This option is used only for writing unbounded data with auto-sharding.
public TextIO.TypedWrite<UserT,DestinationT> withBatchSizeBytes(@Nullable java.lang.Integer batchSizeBytes)
TextIO.TypedWrite that will batch the input records using specified batch size
in bytes. The default value is WriteFiles.FILE_TRIGGERING_BYTE_COUNT.
This option is used only for writing unbounded data with auto-sharding.
public TextIO.TypedWrite<UserT,DestinationT> withBatchMaxBufferingDuration(@Nullable Duration batchMaxBufferingDuration)
TextIO.TypedWrite that will batch the input records using specified max
buffering duration. The default value is WriteFiles.FILE_TRIGGERING_RECORD_BUFFERING_DURATION.
This option is used only for writing unbounded data with auto-sharding.
public TextIO.TypedWrite<UserT,DestinationT> withTempDirectory(ValueProvider<ResourceId> tempDirectory)
public TextIO.TypedWrite<UserT,DestinationT> withTempDirectory(ResourceId tempDirectory)
public TextIO.TypedWrite<UserT,DestinationT> withShardNameTemplate(java.lang.String shardTemplate)
ShardNameTemplate for naming output files. This option may only be
used when using one of the default filename-prefix to() overrides - i.e. not when using
either #to(FilenamePolicy) or #to(DynamicDestinations).
See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are
used.
public TextIO.TypedWrite<UserT,DestinationT> withSuffix(java.lang.String filenameSuffix)
#to(FilenamePolicy) or #to(DynamicDestinations).
See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are
used.
public TextIO.TypedWrite<UserT,DestinationT> withNumShards(int numShards)
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
numShards - the number of shards to use, or 0 to let the system decide.public TextIO.TypedWrite<UserT,DestinationT> withNumShards(@Nullable ValueProvider<java.lang.Integer> numShards)
withNumShards(int). Specifying null means runner-determined sharding.public TextIO.TypedWrite<UserT,DestinationT> withoutSharding()
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
This is equivalent to .withNumShards(1).withShardNameTemplate("")
public TextIO.TypedWrite<UserT,DestinationT> withDelimiter(char[] delimiter)
Defaults to '\n'.
public TextIO.TypedWrite<UserT,DestinationT> withHeader(@Nullable java.lang.String header)
A null value will clear any previously configured header.
public TextIO.TypedWrite<UserT,DestinationT> withFooter(@Nullable java.lang.String footer)
A null value will clear any previously configured footer.
public TextIO.TypedWrite<UserT,DestinationT> withWritableByteChannelFactory(FileBasedSink.WritableByteChannelFactory writableByteChannelFactory)
FileBasedSink.WritableByteChannelFactory to be used by the FileBasedSink during output. The
default is value is Compression.UNCOMPRESSED.
A null value will reset the value to the default value mentioned above.
public TextIO.TypedWrite<UserT,DestinationT> withCompression(Compression compression)
Compression. The default value is Compression.UNCOMPRESSED.public TextIO.TypedWrite<UserT,DestinationT> withWindowedWrites()
If using to(FileBasedSink.FilenamePolicy). Filenames will be generated using
FileBasedSink.FilenamePolicy.windowedFilename(int, int, org.apache.beam.sdk.transforms.windowing.BoundedWindow, org.apache.beam.sdk.transforms.windowing.PaneInfo, org.apache.beam.sdk.io.FileBasedSink.OutputFileHints). See also WriteFiles.withWindowedWrites().
public TextIO.TypedWrite<UserT,DestinationT> withAutoSharding()
public TextIO.TypedWrite<UserT,DestinationT> withNoSpilling()
public TextIO.TypedWrite<UserT,DestinationT> withMaxNumWritersPerBundle(@Nullable java.lang.Integer maxNumWritersPerBundle)
public TextIO.TypedWrite<UserT,DestinationT> withBadRecordErrorHandler(ErrorHandler<BadRecord,?> errorHandler)
FileIO.Write.withBadRecordErrorHandler(ErrorHandler) for details on usage.public TextIO.TypedWrite<UserT,DestinationT> skipIfEmpty()
public WriteFilesResult<DestinationT> expand(PCollection<UserT> input)
PTransformPTransform should be expanded on the given
InputT.
NOTE: This method should not be called directly. Instead apply the PTransform should
be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
expand in class PTransform<PCollection<UserT>,WriteFilesResult<DestinationT>>public void populateDisplayData(DisplayData.Builder builder)
PTransformpopulateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect
display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace,
but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace
of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
populateDisplayData in interface HasDisplayDatapopulateDisplayData in class PTransform<PCollection<UserT>,WriteFilesResult<DestinationT>>builder - The builder to populate with display data.HasDisplayData