public final class DefaultFilenamePolicy extends FileBasedSink.FilenamePolicy
A default FileBasedSink.FilenamePolicy for unwindowed files. This policy is constructed using three parameters that together define the output name of a sharded file, in conjunction with the number of shards and the index of the particular file, using constructName(java.lang.String, java.lang.String, java.lang.String, int, int).
Most users of unwindowed files will use this DefaultFilenamePolicy. For more advanced uses, such as generating different files for each window and other sharding controls, see the WriteOneFilePerWindow example pipeline.
Nested classes inherited from class FileBasedSink.FilenamePolicy: FileBasedSink.FilenamePolicy.Context, FileBasedSink.FilenamePolicy.WindowedContext
Modifier and Type | Field and Description |
---|---|
static java.lang.String | DEFAULT_SHARD_TEMPLATE The default sharding name template used in constructUsingStandardParameters(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>, java.lang.String, java.lang.String). |
Modifier and Type | Method and Description |
---|---|
static java.lang.String | constructName(java.lang.String prefix, java.lang.String shardTemplate, java.lang.String suffix, int shardNum, int numShards) Constructs a fully qualified name from components. |
static DefaultFilenamePolicy | constructUsingStandardParameters(ValueProvider<ResourceId> outputPrefix, java.lang.String shardTemplate, java.lang.String filenameSuffix) A helper function to construct a DefaultFilenamePolicy using the standard filename parameters, namely a provided ResourceId for the output prefix, and possibly-null shard name template and suffix. |
void | populateDisplayData(DisplayData.Builder builder) Populates the display data. |
ResourceId | unwindowedFilename(ResourceId outputDirectory, FileBasedSink.FilenamePolicy.Context context, java.lang.String extension) When a sink has not requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a (possibly empty) extension applied by additional FileBasedSink configuration (e.g., FileBasedSink.CompressionType). |
ResourceId | windowedFilename(ResourceId outputDirectory, FileBasedSink.FilenamePolicy.WindowedContext c, java.lang.String extension) When a sink has requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a (possibly empty) extension from FileBasedSink configuration (e.g., FileBasedSink.CompressionType). |
public static final java.lang.String DEFAULT_SHARD_TEMPLATE
The default sharding name template used in constructUsingStandardParameters(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>, java.lang.String, java.lang.String).
public static DefaultFilenamePolicy constructUsingStandardParameters(ValueProvider<ResourceId> outputPrefix, @Nullable java.lang.String shardTemplate, @Nullable java.lang.String filenameSuffix)
A helper function to construct a DefaultFilenamePolicy using the standard filename parameters, namely a provided ResourceId for the output prefix, and possibly-null shard name template and suffix.
Any filename component of the provided resource will be used as the filename prefix.
If provided, the shard name template will be used; otherwise DEFAULT_SHARD_TEMPLATE
will be used.
If provided, the suffix will be used; otherwise the files will have an empty suffix.
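The three defaulting rules above can be illustrated with a small plain-Java sketch. It does not call Beam: StandardParametersSketch, resolve, and ASSUMED_DEFAULT_SHARD_TEMPLATE are hypothetical names, java.nio.file.Path stands in for ResourceId (and does not reproduce ResourceId's handling of directories), and the default template value is an assumption, since this page does not state the value of DEFAULT_SHARD_TEMPLATE.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical stand-in for the documented defaulting rules; not Beam code. */
final class StandardParametersSketch {
  // ASSUMPTION: placeholder value for illustration only. This page does not state
  // the actual value of DefaultFilenamePolicy.DEFAULT_SHARD_TEMPLATE.
  static final String ASSUMED_DEFAULT_SHARD_TEMPLATE = "-SSSSS-of-NNNNN";

  /** Returns {prefix, shardTemplate, suffix} after applying the defaults. */
  static String[] resolve(String outputPrefix, String shardTemplate, String filenameSuffix) {
    // The filename component of the output prefix becomes the filename prefix.
    Path fileName = Paths.get(outputPrefix).getFileName();
    String prefix = fileName == null ? "" : fileName.toString();
    // A null template falls back to the (assumed) default template.
    String template = shardTemplate == null ? ASSUMED_DEFAULT_SHARD_TEMPLATE : shardTemplate;
    // A null suffix means the files get an empty suffix.
    String suffix = filenameSuffix == null ? "" : filenameSuffix;
    return new String[] {prefix, template, suffix};
  }

  public static void main(String[] args) {
    System.out.println(String.join(" | ", resolve("/tmp/out/result", null, ".txt")));
  }
}
```

With the real API, the same choices are made by passing null for shardTemplate or filenameSuffix to constructUsingStandardParameters.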
public static java.lang.String constructName(java.lang.String prefix, java.lang.String shardTemplate, java.lang.String suffix, int shardNum, int numShards)
The name is built from a prefix, shard template (with shard numbers applied), and a suffix. All components are required, but may be empty strings.
Within a shard template, repeating sequences of the letters "S" or "N" are replaced with the shard number, or number of shards respectively. The numbers are formatted with leading zeros to match the length of the repeated sequence of letters.
For example, if prefix = "output", shardTemplate = "-SSS-of-NNN", and suffix = ".txt", with shardNum = 1 and numShards = 100, the following is produced: "output-001-of-100.txt".
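The substitution rule described above can be sketched in plain Java. This is an illustrative re-implementation of the documented behavior, not Beam's source: the class name ShardNameSketch and the regular expression are assumptions, and it makes no claim about how the real method handles numbers wider than the placeholder.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative re-implementation of the documented naming rule; not Beam's source. */
final class ShardNameSketch {
  // Each run of "S" or each run of "N" is treated as one placeholder (assumed regex).
  private static final Pattern PLACEHOLDER = Pattern.compile("S+|N+");

  static String constructName(
      String prefix, String shardTemplate, String suffix, int shardNum, int numShards) {
    StringBuilder name = new StringBuilder(prefix);
    Matcher m = PLACEHOLDER.matcher(shardTemplate);
    int last = 0;
    while (m.find()) {
      // Copy the literal text between placeholders unchanged.
      name.append(shardTemplate, last, m.start());
      // "S" runs take the shard number, "N" runs take the shard count.
      int value = m.group().charAt(0) == 'S' ? shardNum : numShards;
      // Zero-pad to the length of the run of letters.
      name.append(String.format(Locale.ROOT, "%0" + m.group().length() + "d", value));
      last = m.end();
    }
    name.append(shardTemplate, last, shardTemplate.length());
    return name.append(suffix).toString();
  }

  public static void main(String[] args) {
    // Prints "output-001-of-100.txt", matching the example above.
    System.out.println(constructName("output", "-SSS-of-NNN", ".txt", 1, 100));
  }
}
```

Empty strings are valid for every component in this sketch, so an empty template simply yields the prefix followed by the suffix.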
@Nullable public ResourceId unwindowedFilename(ResourceId outputDirectory, FileBasedSink.FilenamePolicy.Context context, java.lang.String extension)
Description copied from class: FileBasedSink.FilenamePolicy
When a sink has not requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a (possibly empty) extension applied by additional FileBasedSink configuration (e.g., FileBasedSink.CompressionType).
The FileBasedSink.FilenamePolicy.Context object only provides sharding information, which is used by the policy to generate unique and consistent filenames.
Specified by: unwindowedFilename in class FileBasedSink.FilenamePolicy
public ResourceId windowedFilename(ResourceId outputDirectory, FileBasedSink.FilenamePolicy.WindowedContext c, java.lang.String extension)
Description copied from class: FileBasedSink.FilenamePolicy
When a sink has requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a (possibly empty) extension from FileBasedSink configuration (e.g., FileBasedSink.CompressionType).
The FileBasedSink.FilenamePolicy.WindowedContext object gives access to the window and pane, as well as sharding information. The policy must return unique and consistent filenames for different windows and panes.
Specified by: windowedFilename in class FileBasedSink.FilenamePolicy
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: FileBasedSink.FilenamePolicy
Populates the display data.
Overrides: populateDisplayData in class FileBasedSink.FilenamePolicy