public abstract static class AvroIO.Write<T> extends PTransform<PCollection<T>,PDone>
AvroIO.write(java.lang.Class<T>)
.name
Constructor and Description |
---|
Write() |
Modifier and Type | Method and Description |
---|---|
PDone | expand(PCollection<T> input): Override this method to specify how this PTransform should be expanded on the given InputT. |
protected Coder<java.lang.Void> | getDefaultOutputCoder(): Returns the default Coder to use for the output of this single-output PTransform. |
void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
AvroIO.Write<T> | to(ResourceId outputPrefix): Writes to file(s) with the given output prefix. |
AvroIO.Write<T> | to(java.lang.String outputPrefix): Writes to file(s) with the given output prefix. |
AvroIO.Write<T> | to(ValueProvider<java.lang.String> outputPrefix): Like to(String). |
AvroIO.Write<T> | toResource(ValueProvider<ResourceId> outputPrefix): Like to(ResourceId). |
AvroIO.Write<T> | withCodec(CodecFactory codec): Writes to Avro file(s) compressed using specified codec. |
AvroIO.Write<T> | withFilenamePolicy(FileBasedSink.FilenamePolicy filenamePolicy): Configures the FileBasedSink.FilenamePolicy that will be used to name written files. |
AvroIO.Write<T> | withMetadata(java.util.Map<java.lang.String,java.lang.Object> metadata): Writes to Avro file(s) with the specified metadata. |
AvroIO.Write<T> | withNumShards(int numShards): Configures the number of output shards produced overall (when using unwindowed writes) or per-window (when using windowed writes). |
AvroIO.Write<T> | withoutSharding(): Forces a single file as output and empty shard name template. |
AvroIO.Write<T> | withShardNameTemplate(java.lang.String shardTemplate): Uses the given ShardNameTemplate for naming output files. |
AvroIO.Write<T> | withSuffix(java.lang.String filenameSuffix): Configures the filename suffix for written files. |
AvroIO.Write<T> | withWindowedWrites(): Preserves windowing of input elements and writes them to files based on the element's window. |
Methods inherited from class PTransform: getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate
public AvroIO.Write<T> to(java.lang.String outputPrefix)
Writes to file(s) with the given output prefix. See FileSystems for information on supported file systems.
The name of the output files will be determined by the FileBasedSink.FilenamePolicy used.
By default, a DefaultFilenamePolicy will build output filenames using the specified prefix, a shard name template (see withShardNameTemplate(String)), and a common suffix (if supplied using withSuffix(String)). This default can be overridden using withFilenamePolicy(FilenamePolicy).
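The way a prefix, a shard name template, and a suffix combine into a filename can be sketched in plain Java. This is an illustrative stand-in, not Beam's DefaultFilenamePolicy: the class FilenameSketch, its methods, and the sample prefix are hypothetical, and the template "-SSSSS-of-NNNNN" is an assumed example in which each run of S is replaced by the zero-padded shard index and each run of N by the zero-padded shard count.

```java
import java.util.Locale;

/** Hypothetical illustration only; not part of the Beam API. */
final class FilenameSketch {

  /**
   * Expands a shard name template: each run of 'S' becomes the shard index and
   * each run of 'N' becomes the shard count, zero-padded to the run's length.
   * All other characters are copied unchanged.
   */
  static String expandTemplate(String template, int shard, int numShards) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    while (i < template.length()) {
      char c = template.charAt(i);
      if (c == 'S' || c == 'N') {
        int j = i;
        while (j < template.length() && template.charAt(j) == c) {
          j++;
        }
        int value = (c == 'S') ? shard : numShards;
        out.append(String.format(Locale.ROOT, "%0" + (j - i) + "d", value));
        i = j;
      } else {
        out.append(c);
        i++;
      }
    }
    return out.toString();
  }

  /** Concatenates prefix, expanded template, and suffix. */
  static String filename(
      String prefix, String template, String suffix, int shard, int numShards) {
    return prefix + expandTemplate(template, shard, numShards) + suffix;
  }

  public static void main(String[] args) {
    // Assumed example values; a real pipeline supplies these through
    // to(...), withShardNameTemplate(...), and withSuffix(...).
    for (int shard = 0; shard < 3; shard++) {
      System.out.println(filename("/tmp/out/part", "-SSSSS-of-NNNNN", ".avro", shard, 3));
    }
  }
}
```

In this sketch, an empty template with a single shard yields just the prefix followed by the suffix, which is the shape of output that withoutSharding() describes.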
@Experimental(value=FILESYSTEM) public AvroIO.Write<T> to(ResourceId outputPrefix)
Writes to file(s) with the given output prefix. See FileSystems for information on supported file systems.
The name of the output files will be determined by the FileBasedSink.FilenamePolicy used.
By default, a DefaultFilenamePolicy will build output filenames using the specified prefix, a shard name template (see withShardNameTemplate(String)), and a common suffix (if supplied using withSuffix(String)). This default can be overridden using withFilenamePolicy(FilenamePolicy).
public AvroIO.Write<T> to(ValueProvider<java.lang.String> outputPrefix)
Like to(String).
@Experimental(value=FILESYSTEM) public AvroIO.Write<T> toResource(ValueProvider<ResourceId> outputPrefix)
Like to(ResourceId).
public AvroIO.Write<T> withFilenamePolicy(FileBasedSink.FilenamePolicy filenamePolicy)
Configures the FileBasedSink.FilenamePolicy that will be used to name written files.
public AvroIO.Write<T> withShardNameTemplate(java.lang.String shardTemplate)
Uses the given ShardNameTemplate for naming output files. This option may only be used when withFilenamePolicy(FilenamePolicy) has not been configured.
See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are used.
public AvroIO.Write<T> withSuffix(java.lang.String filenameSuffix)
Configures the filename suffix for written files. This option may only be used when withFilenamePolicy(FilenamePolicy) has not been configured.
See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are used.
public AvroIO.Write<T> withNumShards(int numShards)
Configures the number of output shards produced overall (when using unwindowed writes) or per-window (when using windowed writes).
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
Parameters: numShards - the number of shards to use, or 0 to let the system decide.
public AvroIO.Write<T> withoutSharding()
Forces a single file as output and empty shard name template.
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
This is equivalent to .withNumShards(1).withShardNameTemplate("").
public AvroIO.Write<T> withWindowedWrites()
Preserves windowing of input elements and writes them to files based on the element's window.
Requires use of withFilenamePolicy(FileBasedSink.FilenamePolicy). Filenames will be generated using FileBasedSink.FilenamePolicy.windowedFilename(org.apache.beam.sdk.io.fs.ResourceId, org.apache.beam.sdk.io.FileBasedSink.FilenamePolicy.WindowedContext, java.lang.String). See also WriteFiles.withWindowedWrites().
public AvroIO.Write<T> withCodec(CodecFactory codec)
Writes to Avro file(s) compressed using the specified codec.
public AvroIO.Write<T> withMetadata(java.util.Map<java.lang.String,java.lang.Object> metadata)
Writes to Avro file(s) with the specified metadata.
Supported value types are String, Long, and byte[].
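Because the metadata map is typed as Map<String,Object>, an unsupported value type is not caught at compile time. The following is a minimal sketch of a pre-flight check a caller might run before passing the map to withMetadata. The class MetadataCheckSketch and its method are hypothetical helpers written for this illustration; they are not part of Beam, and how Beam itself reports an unsupported type is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of the Beam API. */
final class MetadataCheckSketch {

  /**
   * Throws IllegalArgumentException if any value is not a String, Long, or byte[],
   * the three value types the documentation lists as supported.
   */
  static void checkSupportedTypes(Map<String, Object> metadata) {
    for (Map.Entry<String, Object> entry : metadata.entrySet()) {
      Object value = entry.getValue();
      boolean supported =
          value instanceof String || value instanceof Long || value instanceof byte[];
      if (!supported) {
        String typeName = (value == null) ? "null" : value.getClass().getName();
        throw new IllegalArgumentException(
            "Unsupported metadata value type for key '" + entry.getKey() + "': " + typeName);
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Object> metadata = new LinkedHashMap<>();
    metadata.put("source", "nightly-export"); // String
    metadata.put("recordCount", 1024L);        // Long (note the L suffix)
    metadata.put("checksum", new byte[] {0x1f, 0x2e}); // byte[]
    checkSupportedTypes(metadata);
    System.out.println("metadata values have supported types");
  }
}
```

A detail worth noting in this sketch: an int literal such as 42 autoboxes to Integer, not Long, so it fails the check unless written as 42L.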
public PDone expand(PCollection<T> input)
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded on the given InputT.
NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
Specified by: expand in class PTransform<PCollection<T>,PDone>
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Overrides: populateDisplayData in class PTransform<PCollection<T>,PDone>
Parameters: builder - The builder to populate with display data.
See also: HasDisplayData
protected Coder<java.lang.Void> getDefaultOutputCoder()
Description copied from class: PTransform
Returns the default Coder to use for the output of this single-output PTransform.
By default, always throws
Overrides: getDefaultOutputCoder in class PTransform<PCollection<T>,PDone>