public abstract static class TextIO.TypedWrite<UserT,DestinationT> extends PTransform<PCollection<UserT>,WriteFilesResult<DestinationT>>
TextIO.write()
.name
Constructor and Description |
---|
TypedWrite() |
Modifier and Type | Method and Description |
---|---|
WriteFilesResult<DestinationT> |
expand(PCollection<UserT> input)
Override this method to specify how this
PTransform should be expanded
on the given InputT . |
void |
populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.
|
<NewDestinationT> |
to(FileBasedSink.DynamicDestinations<UserT,DestinationT,java.lang.String> dynamicDestinations)
Use a
FileBasedSink.DynamicDestinations object to vend FileBasedSink.FilenamePolicy objects. |
TextIO.TypedWrite<UserT,DestinationT> |
to(FileBasedSink.FilenamePolicy filenamePolicy)
Writes to files named according to the given
FileBasedSink.FilenamePolicy . |
TextIO.TypedWrite<UserT,DestinationT> |
to(ResourceId filenamePrefix)
Like
to(String) . |
TextIO.TypedWrite<UserT,DefaultFilenamePolicy.Params> |
to(SerializableFunction<UserT,DefaultFilenamePolicy.Params> destinationFunction,
DefaultFilenamePolicy.Params emptyDestination)
Write to dynamic destinations using the default filename policy.
|
TextIO.TypedWrite<UserT,DestinationT> |
to(java.lang.String filenamePrefix)
Writes to text files with the given prefix.
|
TextIO.TypedWrite<UserT,DestinationT> |
to(ValueProvider<java.lang.String> outputPrefix)
Like
to(String) . |
TextIO.TypedWrite<UserT,DestinationT> |
toResource(ValueProvider<ResourceId> filenamePrefix)
Like
to(ResourceId) . |
TextIO.TypedWrite<UserT,DestinationT> |
withCompression(Compression compression)
Returns a transform for writing to text files like this one but that compresses output using
the given
Compression . |
TextIO.TypedWrite<UserT,DestinationT> |
withFooter(java.lang.String footer)
Adds a footer string to each file.
|
TextIO.TypedWrite<UserT,DestinationT> |
withFormatFunction(SerializableFunction<UserT,java.lang.String> formatFunction)
Specifies a format function to convert
UserT to the output type. |
TextIO.TypedWrite<UserT,DestinationT> |
withHeader(java.lang.String header)
Adds a header string to each file.
|
TextIO.TypedWrite<UserT,DestinationT> |
withNumShards(int numShards)
Configures the number of output shards produced overall (when using unwindowed writes) or
per-window (when using windowed writes).
|
TextIO.TypedWrite<UserT,DestinationT> |
withoutSharding()
Forces a single file as output and empty shard name template.
|
TextIO.TypedWrite<UserT,DestinationT> |
withShardNameTemplate(java.lang.String shardTemplate)
Uses the given
ShardNameTemplate for naming output files. |
TextIO.TypedWrite<UserT,DestinationT> |
withSuffix(java.lang.String filenameSuffix)
Configures the filename suffix for written files.
|
TextIO.TypedWrite<UserT,DestinationT> |
withTempDirectory(ResourceId tempDirectory)
Set the base directory used to generate temporary files.
|
TextIO.TypedWrite<UserT,DestinationT> |
withTempDirectory(ValueProvider<ResourceId> tempDirectory)
Set the base directory used to generate temporary files.
|
TextIO.TypedWrite<UserT,DestinationT> |
withWindowedWrites()
Preserves windowing of input elements and writes them to files based on the element's window.
|
TextIO.TypedWrite<UserT,DestinationT> |
withWritableByteChannelFactory(FileBasedSink.WritableByteChannelFactory writableByteChannelFactory)
Returns a transform for writing to text files like this one but that has the given
FileBasedSink.WritableByteChannelFactory to be used by the FileBasedSink during output. |
getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate
public TextIO.TypedWrite<UserT,DestinationT> to(java.lang.String filenamePrefix)
prefix
can reference any FileSystem
on the classpath. This prefix is used by the DefaultFilenamePolicy
to
generate filenames.
By default, a DefaultFilenamePolicy
will be used built using the specified prefix
to define the base output directory and file prefix, a shard identifier (see withNumShards(int)
), and a common suffix (if supplied using withSuffix(String)
).
This default policy can be overridden using #to(FilenamePolicy)
, in which case
withShardNameTemplate(String)
and withSuffix(String)
should not be set.
Custom filename policies do not automatically see this prefix - you should explicitly pass
the prefix into your FileBasedSink.FilenamePolicy
object if you need this.
If withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>)
has not been called, this filename prefix will be used to
infer a directory for temporary files.
@Experimental(value=FILESYSTEM) public TextIO.TypedWrite<UserT,DestinationT> to(ResourceId filenamePrefix)
to(String)
.public TextIO.TypedWrite<UserT,DestinationT> to(ValueProvider<java.lang.String> outputPrefix)
to(String)
.public TextIO.TypedWrite<UserT,DestinationT> to(FileBasedSink.FilenamePolicy filenamePolicy)
FileBasedSink.FilenamePolicy
. A
directory for temporary files must be specified using withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>)
.public <NewDestinationT> TextIO.TypedWrite<UserT,NewDestinationT> to(FileBasedSink.DynamicDestinations<UserT,DestinationT,java.lang.String> dynamicDestinations)
FileBasedSink.DynamicDestinations
object to vend FileBasedSink.FilenamePolicy
objects. These
objects can examine the input record when creating a FileBasedSink.FilenamePolicy
. A directory for
temporary files must be specified using withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>)
.public TextIO.TypedWrite<UserT,DefaultFilenamePolicy.Params> to(SerializableFunction<UserT,DefaultFilenamePolicy.Params> destinationFunction, DefaultFilenamePolicy.Params emptyDestination)
DefaultFilenamePolicy.Params
object that specifies where the
records should be written (base filename, file suffix, and shard template). The
emptyDestination parameter specified where empty files should be written for when the written
PCollection
is empty.@Experimental(value=FILESYSTEM) public TextIO.TypedWrite<UserT,DestinationT> toResource(ValueProvider<ResourceId> filenamePrefix)
to(ResourceId)
.public TextIO.TypedWrite<UserT,DestinationT> withFormatFunction(SerializableFunction<UserT,java.lang.String> formatFunction)
UserT
to the output type. If #to(DynamicDestinations)
is used, FileBasedSink.DynamicDestinations.formatRecord(Object)
must be
used instead.@Experimental(value=FILESYSTEM) public TextIO.TypedWrite<UserT,DestinationT> withTempDirectory(ValueProvider<ResourceId> tempDirectory)
@Experimental(value=FILESYSTEM) public TextIO.TypedWrite<UserT,DestinationT> withTempDirectory(ResourceId tempDirectory)
public TextIO.TypedWrite<UserT,DestinationT> withShardNameTemplate(java.lang.String shardTemplate)
ShardNameTemplate
for naming output files. This option may only be
used when using one of the default filename-prefix to() overrides - i.e. not when using
either #to(FilenamePolicy)
or #to(DynamicDestinations)
.
See DefaultFilenamePolicy
for how the prefix, shard name template, and suffix are
used.
public TextIO.TypedWrite<UserT,DestinationT> withSuffix(java.lang.String filenameSuffix)
#to(FilenamePolicy)
or #to(DynamicDestinations)
.
See DefaultFilenamePolicy
for how the prefix, shard name template, and suffix are
used.
public TextIO.TypedWrite<UserT,DestinationT> withNumShards(int numShards)
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
numShards
- the number of shards to use, or 0 to let the system decide.public TextIO.TypedWrite<UserT,DestinationT> withoutSharding()
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
This is equivalent to .withNumShards(1).withShardNameTemplate("")
public TextIO.TypedWrite<UserT,DestinationT> withHeader(@Nullable java.lang.String header)
A null
value will clear any previously configured header.
public TextIO.TypedWrite<UserT,DestinationT> withFooter(@Nullable java.lang.String footer)
A null
value will clear any previously configured footer.
public TextIO.TypedWrite<UserT,DestinationT> withWritableByteChannelFactory(FileBasedSink.WritableByteChannelFactory writableByteChannelFactory)
FileBasedSink.WritableByteChannelFactory
to be used by the FileBasedSink
during output. The
default is value is Compression.UNCOMPRESSED
.
A null
value will reset the value to the default value mentioned above.
public TextIO.TypedWrite<UserT,DestinationT> withCompression(Compression compression)
Compression
. The default value is Compression.UNCOMPRESSED
.public TextIO.TypedWrite<UserT,DestinationT> withWindowedWrites()
If using to(FileBasedSink.FilenamePolicy)
. Filenames will be generated using
FileBasedSink.FilenamePolicy.windowedFilename(int, int, org.apache.beam.sdk.transforms.windowing.BoundedWindow, org.apache.beam.sdk.transforms.windowing.PaneInfo, org.apache.beam.sdk.io.FileBasedSink.OutputFileHints)
. See also WriteFiles.withWindowedWrites()
.
public WriteFilesResult<DestinationT> expand(PCollection<UserT> input)
PTransform
PTransform
should be expanded
on the given InputT
.
NOTE: This method should not be called directly. Instead apply the
PTransform
should be applied to the InputT
using the apply
method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
expand
in class PTransform<PCollection<UserT>,WriteFilesResult<DestinationT>>
public void populateDisplayData(DisplayData.Builder builder)
PTransform
populateDisplayData(DisplayData.Builder)
is invoked by Pipeline runners to collect
display data via DisplayData.from(HasDisplayData)
. Implementations may call
super.populateDisplayData(builder)
in order to register display data in the current
namespace, but should otherwise use subcomponent.populateDisplayData(builder)
to use
the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
populateDisplayData
in interface HasDisplayData
populateDisplayData
in class PTransform<PCollection<UserT>,WriteFilesResult<DestinationT>>
builder
- The builder to populate with display data.HasDisplayData