Class DefaultFilenamePolicy
- All Implemented Interfaces:
Serializable
A default FileBasedSink.FilenamePolicy for windowed and unwindowed files. This policy is constructed using three parameters (a base filename, a shard template, and a suffix) that together define the output name of a sharded file, in conjunction with the number of shards, the index of the particular file, and the current window and pane information, using constructName(ResourceId, String, String, int, int, String, String).
Most users will use this DefaultFilenamePolicy. For more advanced uses, such as generating different files for each window or applying other sharding controls, see the WriteOneFilePerWindow example pipeline.
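Nested Class Summary
Nested Classes
| Modifier and Type | Class | Description |
|---|---|---|
| static class | DefaultFilenamePolicy.Params | Encapsulates constructor parameters to DefaultFilenamePolicy. |
| static class | DefaultFilenamePolicy.ParamsCoder | A Coder for DefaultFilenamePolicy.Params. |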
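Field Summary
Fields
| Field | Description |
|---|---|
| DEFAULT_UNWINDOWED_SHARD_TEMPLATE | The default sharding name template. |
| DEFAULT_WINDOWED_SHARD_TEMPLATE | The default windowed sharding name template used when writing windowed files. |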
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| static ResourceId | constructName(ResourceId baseFilename, String shardTemplate, String suffix, int shardNum, int numShards, @Nullable String paneStr, @Nullable String windowStr) | Constructs a fully qualified name from components. |
| static DefaultFilenamePolicy | fromParams(DefaultFilenamePolicy.Params) | Construct a DefaultFilenamePolicy from a DefaultFilenamePolicy.Params object. |
| static DefaultFilenamePolicy | fromStandardParameters(ValueProvider<ResourceId> baseFilename, @Nullable String shardTemplate, @Nullable String filenameSuffix, boolean windowedWrites) | Construct a DefaultFilenamePolicy. |
| void | populateDisplayData(DisplayData.Builder builder) | Populates the display data. |
| @Nullable ResourceId | unwindowedFilename(int shardNumber, int numShards, FileBasedSink.OutputFileHints outputFileHints) | When a sink has not requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a FileBasedSink.OutputFileHints containing information about the file, including a suggested extension (e.g. coming from Compression). |
| ResourceId | windowedFilename(int shardNumber, int numShards, BoundedWindow window, PaneInfo paneInfo, FileBasedSink.OutputFileHints outputFileHints) | When a sink has requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a FileBasedSink.OutputFileHints containing information about the file, including a suggested extension (e.g. coming from Compression). |
-
Field Details
-
DEFAULT_UNWINDOWED_SHARD_TEMPLATE
The default sharding name template.
DEFAULT_WINDOWED_SHARD_TEMPLATE
The default windowed sharding name template, used when writing windowed files. It applies when the user has not specified a shard template and windowed files must be written. If the user does specify a shard template, that template is used for both windowed and non-windowed file names.
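The selection rule described above can be sketched in plain Java. This is an illustrative stand-in, not Beam's code: the class name and the effectiveTemplate helper are hypothetical, the two constant values are assumptions that should be checked against the library's constant field values, and the fallback to the unwindowed default for non-windowed writes is inferred rather than stated in the text.

```java
/** Hypothetical sketch of how a shard template might be chosen; not Beam's implementation. */
class ShardTemplateChoiceSketch {
  // ASSUMPTION: these values are not given in this page; verify them before relying on them.
  static final String DEFAULT_UNWINDOWED_SHARD_TEMPLATE = "-SSSSS-of-NNNNN";
  static final String DEFAULT_WINDOWED_SHARD_TEMPLATE = "W-P" + DEFAULT_UNWINDOWED_SHARD_TEMPLATE;

  /**
   * Returns the user's template if one was given; otherwise picks a default
   * depending on whether windowed files are being written.
   */
  static String effectiveTemplate(String userTemplate, boolean windowedWrites) {
    if (userTemplate != null) {
      // A user-specified template is used for both windowed and non-windowed names.
      return userTemplate;
    }
    return windowedWrites
        ? DEFAULT_WINDOWED_SHARD_TEMPLATE
        : DEFAULT_UNWINDOWED_SHARD_TEMPLATE;
  }

  public static void main(String[] args) {
    System.out.println(effectiveTemplate(null, true));
    System.out.println(effectiveTemplate("-SS-of-NN", true));
  }
}
```

Note that a user-supplied template without W or P placeholders would be applied to windowed writes unchanged under this rule, so windowed pipelines that pass their own template will probably want to include those placeholders.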
Method Details
-
fromStandardParameters
public static DefaultFilenamePolicy fromStandardParameters(ValueProvider<ResourceId> baseFilename, @Nullable String shardTemplate, @Nullable String filenameSuffix, boolean windowedWrites)
Construct a DefaultFilenamePolicy.
This is a shortcut for:
    DefaultFilenamePolicy.fromParams(new Params()
        .withBaseFilename(baseFilename)
        .withShardTemplate(shardTemplate)
        .withSuffix(filenameSuffix)
        .withWindowedWrites())
Where the respective with methods are invoked only if the value is non-null or true.
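The "invoked only if the value is non-null or true" behaviour can be illustrated with a small standalone sketch. ParamsSketch below is a hypothetical stand-in for DefaultFilenamePolicy.Params, written so the example runs without Beam on the classpath; it uses String where the real method takes a ValueProvider<ResourceId>, and its field defaults are assumptions.

```java
/** Hypothetical stand-in for DefaultFilenamePolicy.Params; for illustration only. */
final class ParamsSketch {
  final String baseFilename;
  final String shardTemplate; // stays null unless withShardTemplate is called
  final String suffix;        // stays null unless withSuffix is called
  final boolean windowedWrites;

  ParamsSketch() {
    this(null, null, null, false);
  }

  private ParamsSketch(
      String baseFilename, String shardTemplate, String suffix, boolean windowedWrites) {
    this.baseFilename = baseFilename;
    this.shardTemplate = shardTemplate;
    this.suffix = suffix;
    this.windowedWrites = windowedWrites;
  }

  // Each with-method returns a modified copy, leaving the receiver unchanged.
  ParamsSketch withBaseFilename(String v) {
    return new ParamsSketch(v, shardTemplate, suffix, windowedWrites);
  }

  ParamsSketch withShardTemplate(String v) {
    return new ParamsSketch(baseFilename, v, suffix, windowedWrites);
  }

  ParamsSketch withSuffix(String v) {
    return new ParamsSketch(baseFilename, shardTemplate, v, windowedWrites);
  }

  ParamsSketch withWindowedWrites() {
    return new ParamsSketch(baseFilename, shardTemplate, suffix, true);
  }

  /** Mirrors the shortcut: optional with-methods run only for non-null or true arguments. */
  static ParamsSketch fromStandardParameters(
      String baseFilename, String shardTemplate, String filenameSuffix, boolean windowedWrites) {
    ParamsSketch p = new ParamsSketch().withBaseFilename(baseFilename);
    if (shardTemplate != null) {
      p = p.withShardTemplate(shardTemplate);
    }
    if (filenameSuffix != null) {
      p = p.withSuffix(filenameSuffix);
    }
    if (windowedWrites) {
      p = p.withWindowedWrites();
    }
    return p;
  }
}
```

Skipping a with-method for a null argument leaves the corresponding default in place, which is what allows a default shard template to be chosen later.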
fromParams
Construct a DefaultFilenamePolicy from a DefaultFilenamePolicy.Params object.
constructName
public static ResourceId constructName(ResourceId baseFilename, String shardTemplate, String suffix, int shardNum, int numShards, @Nullable String paneStr, @Nullable String windowStr)
Constructs a fully qualified name from components.
The name is built from a base filename, a shard template (with shard numbers applied), and a suffix. All components are required, but may be empty strings.
Within a shard template, repeating sequences of the letters "S" or "N" are replaced with the shard number or the number of shards, respectively. "P" is replaced by the string form of the current pane, and "W" by the string form of the current window.
The numbers are formatted with leading zeros to match the length of the repeated sequence of letters.
For example, if baseFilename = "path/to/output", shardTemplate = "-SSS-of-NNN", and suffix = ".txt", with shardNum = 1 and numShards = 100, the following is produced: "path/to/output-001-of-100.txt".
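The substitution rules above can be sketched as a standalone Java method. This is a simplified re-implementation for illustration, not Beam's code: it works on plain strings instead of ResourceId, the class name ShardTemplateSketch is hypothetical, and treating a null pane or window string as empty is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified, standalone sketch of shard-template expansion; not Beam's implementation. */
class ShardTemplateSketch {
  // A run of S or a run of N, or a single W or P placeholder.
  private static final Pattern PLACEHOLDER = Pattern.compile("S+|N+|W|P");

  static String constructName(
      String baseFilename, String shardTemplate, String suffix,
      int shardNum, int numShards, String paneStr, String windowStr) {
    StringBuilder out = new StringBuilder(baseFilename);
    Matcher m = PLACEHOLDER.matcher(shardTemplate);
    int last = 0;
    while (m.find()) {
      // Copy the literal text between placeholders unchanged.
      out.append(shardTemplate, last, m.start());
      String run = m.group();
      switch (run.charAt(0)) {
        case 'S':
          out.append(pad(shardNum, run.length()));
          break;
        case 'N':
          out.append(pad(numShards, run.length()));
          break;
        case 'W':
          // ASSUMPTION: a missing window string expands to nothing.
          out.append(windowStr == null ? "" : windowStr);
          break;
        default: // 'P'
          // ASSUMPTION: a missing pane string expands to nothing.
          out.append(paneStr == null ? "" : paneStr);
          break;
      }
      last = m.end();
    }
    out.append(shardTemplate, last, shardTemplate.length());
    return out.append(suffix).toString();
  }

  /** Zero-pads value to at least the given width. */
  private static String pad(int value, int width) {
    return String.format("%0" + width + "d", value);
  }

  public static void main(String[] args) {
    // Prints "path/to/output-001-of-100.txt"
    System.out.println(
        constructName("path/to/output", "-SSS-of-NNN", ".txt", 1, 100, null, null));
  }
}
```

Because every uppercase S, N, W, or P in the template is treated as a placeholder, literal text in a template should avoid those letters; the lowercase "of" in the example is unaffected.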
-
unwindowedFilename
public @Nullable ResourceId unwindowedFilename(int shardNumber, int numShards, FileBasedSink.OutputFileHints outputFileHints)
Description copied from class: FileBasedSink.FilenamePolicy
When a sink has not requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a FileBasedSink.OutputFileHints containing information about the file, including a suggested extension (e.g. coming from Compression).
The shardNumber and numShards parameters should be used by the policy to generate unique and consistent filenames.
- Specified by: unwindowedFilename in class FileBasedSink.FilenamePolicy
-
windowedFilename
public ResourceId windowedFilename(int shardNumber, int numShards, BoundedWindow window, PaneInfo paneInfo, FileBasedSink.OutputFileHints outputFileHints)
Description copied from class: FileBasedSink.FilenamePolicy
When a sink has requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a FileBasedSink.OutputFileHints containing information about the file, including a suggested extension (e.g. coming from Compression).
The policy must return unique and consistent filenames for different windows and panes.
- Specified by: windowedFilename in class FileBasedSink.FilenamePolicy
-
populateDisplayData
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: FileBasedSink.FilenamePolicy
Populates the display data.
- Overrides: populateDisplayData in class FileBasedSink.FilenamePolicy
-