public abstract static class TFRecordIO.Write extends PTransform<PCollection<byte[]>,PDone>
Implementation of TFRecordIO.write().
Constructor and Description |
---|
Write() |
Modifier and Type | Method and Description |
---|---|
PDone | expand(PCollection<byte[]> input): Override this method to specify how this PTransform should be expanded on the given InputT. |
protected Coder<java.lang.Void> | getDefaultOutputCoder(): Returns the default Coder to use for the output of this single-output PTransform. |
void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
TFRecordIO.Write | to(ResourceId outputResource): Writes TFRecord file(s) with a prefix given by the specified resource. |
TFRecordIO.Write | to(java.lang.String outputPrefix): Writes TFRecord file(s) with the given output prefix. |
TFRecordIO.Write | toResource(ValueProvider<ResourceId> outputResource): Like to(ResourceId). |
TFRecordIO.Write | withCompressionType(TFRecordIO.CompressionType compressionType): Writes to output files using the specified compression type. |
TFRecordIO.Write | withNumShards(int numShards): Writes to the provided number of shards. |
TFRecordIO.Write | withoutSharding(): Forces a single file as output. |
TFRecordIO.Write | withShardNameTemplate(java.lang.String shardTemplate): Uses the given shard name template. |
TFRecordIO.Write | withSuffix(java.lang.String suffix): Writes to the file(s) with the given filename suffix. |
Methods inherited from class PTransform: getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate
public TFRecordIO.Write to(java.lang.String outputPrefix)
Writes TFRecord file(s) with the given output prefix. The prefix is used to generate a ResourceId using any supported FileSystem.
In addition to their prefix, created files will have a shard identifier (see withNumShards(int)), and end in a common suffix, if given by withSuffix(String).
For more information on filenames, see DefaultFilenamePolicy.
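The way a prefix, a shard identifier, and a suffix combine into a file name can be illustrated with a small standalone sketch. This is not Beam's implementation: the class ShardNameSketch and the method shardedName are hypothetical, and the sketch assumes that a run of S in the template stands for the zero-padded shard index, a run of N stands for the zero-padded shard count, and the default template is "-SSSSS-of-NNNNN". None of these details are stated on this page; consult DefaultFilenamePolicy and ShardNameTemplate for the authoritative rules.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Beam SDK.
final class ShardNameSketch {
    // Assumed default shard template (not stated on this page).
    static final String ASSUMED_DEFAULT_TEMPLATE = "-SSSSS-of-NNNNN";

    // Matches a run of 'S' (shard index) or a run of 'N' (shard count).
    private static final Pattern PLACEHOLDER = Pattern.compile("S+|N+");

    // Builds prefix + expanded template + suffix.
    static String shardedName(String prefix, String template, String suffix,
                              int shardIndex, int numShards) {
        StringBuilder name = new StringBuilder(prefix);
        Matcher m = PLACEHOLDER.matcher(template);
        int last = 0;
        while (m.find()) {
            // Copy literal template text that precedes the placeholder.
            name.append(template, last, m.start());
            int value = m.group().charAt(0) == 'S' ? shardIndex : numShards;
            // Zero-pad the value to the width of the placeholder run.
            name.append(String.format("%0" + m.group().length() + "d", value));
            last = m.end();
        }
        // Copy any literal text after the last placeholder, then the suffix.
        name.append(template, last, template.length());
        return name.append(suffix).toString();
    }

    public static void main(String[] args) {
        // prints "/tmp/out/records-00000-of-00003.tfrecord"
        System.out.println(shardedName(
            "/tmp/out/records", ASSUMED_DEFAULT_TEMPLATE, ".tfrecord", 0, 3));
    }
}
```

Under the same assumptions, an empty template with a single shard, the combination that withoutSharding() is documented as a shortcut for, reduces the name to the prefix followed by the suffix.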
@Experimental(value=FILESYSTEM) public TFRecordIO.Write to(ResourceId outputResource)
Writes TFRecord file(s) with a prefix given by the specified resource.
In addition to their prefix, created files will have a shard identifier (see withNumShards(int)), and end in a common suffix, if given by withSuffix(String).
For more information on filenames, see DefaultFilenamePolicy.
@Experimental(value=FILESYSTEM) public TFRecordIO.Write toResource(ValueProvider<ResourceId> outputResource)
Like to(ResourceId).
public TFRecordIO.Write withSuffix(java.lang.String suffix)
Writes to the file(s) with the given filename suffix.
See Also: ShardNameTemplate
public TFRecordIO.Write withNumShards(int numShards)
Writes to the provided number of shards.
Constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
Parameters: numShards - the number of shards to use, or 0 to let the system decide.
See Also: ShardNameTemplate
public TFRecordIO.Write withShardNameTemplate(java.lang.String shardTemplate)
Uses the given shard name template.
See Also: ShardNameTemplate
public TFRecordIO.Write withoutSharding()
Forces a single file as output.
Constraining the number of shards is likely to reduce the performance of a pipeline. Using this setting is not recommended unless you truly require a single output file.
This is a shortcut for .withNumShards(1).withShardNameTemplate("").
public TFRecordIO.Write withCompressionType(TFRecordIO.CompressionType compressionType)
Writes to output files using the specified compression type.
If no compression type is specified, the default is TFRecordIO.CompressionType.NONE.
See TFRecordIO.Read.withCompressionType(org.apache.beam.sdk.io.TFRecordIO.CompressionType) for more details.
public PDone expand(PCollection<byte[]> input)
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded on the given InputT.
NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
Specified by: expand in class PTransform<PCollection<byte[]>,PDone>
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Overrides: populateDisplayData in class PTransform<PCollection<byte[]>,PDone>
Parameters: builder - The builder to populate with display data.
See Also: HasDisplayData
protected Coder<java.lang.Void> getDefaultOutputCoder()
Description copied from class: PTransform
Returns the default Coder to use for the output of this single-output PTransform.
By default, always throws.
Overrides: getDefaultOutputCoder in class PTransform<PCollection<byte[]>,PDone>