Class DefaultFilenamePolicy
- All Implemented Interfaces:
Serializable
FileBasedSink.FilenamePolicy for windowed and unwindowed files. This policy is constructed
using three parameters that together define the output name of a sharded file, in conjunction
with the number of shards, index of the particular file, current window and pane information,
using constructName(org.apache.beam.sdk.io.fs.ResourceId, java.lang.String, java.lang.String, int, int, java.lang.String, java.lang.String).
Most users will use this DefaultFilenamePolicy. For more advanced uses in generating
different files for each window and other sharding controls, see the
WriteOneFilePerWindow example pipeline.
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | DefaultFilenamePolicy.Params | Encapsulates constructor parameters to DefaultFilenamePolicy.
static class | DefaultFilenamePolicy.ParamsCoder | A Coder for DefaultFilenamePolicy.Params.
-
Field Summary
Fields -
Method Summary
Modifier and Type | Method | Description
static ResourceId | constructName(ResourceId baseFilename, String shardTemplate, String suffix, int shardNum, int numShards, @Nullable String paneStr, @Nullable String windowStr) | Constructs a fully qualified name from components.
static DefaultFilenamePolicy | fromParams(DefaultFilenamePolicy.Params params) | Construct a DefaultFilenamePolicy from a DefaultFilenamePolicy.Params object.
static DefaultFilenamePolicy | fromStandardParameters(ValueProvider<ResourceId> baseFilename, @Nullable String shardTemplate, @Nullable String filenameSuffix, boolean windowedWrites) | Construct a DefaultFilenamePolicy.
void | populateDisplayData(DisplayData.Builder builder) | Populates the display data.
@Nullable ResourceId | unwindowedFilename(int shardNumber, int numShards, FileBasedSink.OutputFileHints outputFileHints) | When a sink has not requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a FileBasedSink.OutputFileHints containing information about the file, including a suggested extension (e.g. coming from Compression).
ResourceId | windowedFilename(int shardNumber, int numShards, BoundedWindow window, PaneInfo paneInfo, FileBasedSink.OutputFileHints outputFileHints) | When a sink has requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a FileBasedSink.OutputFileHints containing information about the file, including a suggested extension (e.g. coming from Compression).
-
Field Details
-
DEFAULT_UNWINDOWED_SHARD_TEMPLATE
The default sharding name template.
-
DEFAULT_WINDOWED_SHARD_TEMPLATE
The default windowed sharding name template, used when writing windowed files. It applies only when the user has not specified a shard template and windowed files must be written. If the user does specify a shard template, that template is used for both windowed and non-windowed file names.
-
-
Method Details
-
fromStandardParameters
public static DefaultFilenamePolicy fromStandardParameters(ValueProvider<ResourceId> baseFilename, @Nullable String shardTemplate, @Nullable String filenameSuffix, boolean windowedWrites)
Construct a DefaultFilenamePolicy.
This is a shortcut for:
    DefaultFilenamePolicy.fromParams(new Params()
        .withBaseFilename(baseFilename)
        .withShardTemplate(shardTemplate)
        .withSuffix(filenameSuffix)
        .withWindowedWrites())
where the respective with methods are invoked only if the value is non-null or true.
-
fromParams
Construct a DefaultFilenamePolicy from a DefaultFilenamePolicy.Params object.
-
constructName
public static ResourceId constructName(ResourceId baseFilename, String shardTemplate, String suffix, int shardNum, int numShards, @Nullable String paneStr, @Nullable String windowStr)
Constructs a fully qualified name from components.
The name is built from a base filename, shard template (with shard numbers applied), and a suffix. All components are required, but may be empty strings.
Within a shard template, repeating sequences of the letter "S" are replaced with the shard number, and repeating sequences of the letter "N" with the number of shards. "P" is replaced by the string form of the current pane. "W" is replaced by the string form of the current window.
The numbers are formatted with leading zeros to match the length of the repeated sequence of letters.
For example, if baseFilename = "path/to/output", shardTemplate = "-SSS-of-NNN", and suffix = ".txt", with shardNum = 1 and numShards = 100, the following is produced: "path/to/output-001-of-100.txt".
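The substitution rules above can be illustrated with a small standalone sketch. This is not Beam's implementation: the class ShardTemplateSketch and its expand method are hypothetical names, and plain strings stand in for ResourceId so the example runs without the Beam SDK. It assumes placeholders are matched only in the template, never in the base filename or suffix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the shard-template rules; not part of Beam.
final class ShardTemplateSketch {
  // A run of S or N, or a single W or P, is treated as a placeholder.
  private static final Pattern PLACEHOLDER = Pattern.compile("S+|N+|W|P");

  static String expand(String base, String template, String suffix,
      int shardNum, int numShards, String paneStr, String windowStr) {
    StringBuilder out = new StringBuilder(base);
    Matcher m = PLACEHOLDER.matcher(template);
    int last = 0;
    while (m.find()) {
      // Copy the literal text between the previous placeholder and this one.
      out.append(template, last, m.start());
      String token = m.group();
      switch (token.charAt(0)) {
        case 'S': // shard number, zero-padded to the length of the run
          out.append(String.format("%0" + token.length() + "d", shardNum));
          break;
        case 'N': // number of shards, zero-padded to the length of the run
          out.append(String.format("%0" + token.length() + "d", numShards));
          break;
        case 'W': // string form of the window
          out.append(windowStr);
          break;
        default:  // 'P': string form of the pane
          out.append(paneStr);
          break;
      }
      last = m.end();
    }
    // Copy any literal text after the last placeholder, then the suffix.
    out.append(template, last, template.length());
    return out.append(suffix).toString();
  }

  public static void main(String[] args) {
    // Reproduces the example given in the text above.
    System.out.println(
        expand("path/to/output", "-SSS-of-NNN", ".txt", 1, 100, "", ""));
    // A windowed template; "w1" and "pane-0" are made-up stand-in strings.
    System.out.println(
        expand("out/", "W-P-SS-of-NN", ".csv", 3, 10, "pane-0", "w1"));
  }
}
```

The first call yields "path/to/output-001-of-100.txt", matching the documented example. Because zero padding follows the length of each letter run, a longer run such as "SSSSS" produces a wider number.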
-
unwindowedFilename
public @Nullable ResourceId unwindowedFilename(int shardNumber, int numShards, FileBasedSink.OutputFileHints outputFileHints)
Description copied from class: FileBasedSink.FilenamePolicy
When a sink has not requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a FileBasedSink.OutputFileHints containing information about the file, including a suggested extension (e.g. coming from Compression).
The shardNumber and numShards parameters should be used by the policy to generate unique and consistent filenames.
- Specified by:
unwindowedFilename in class FileBasedSink.FilenamePolicy
-
windowedFilename
public ResourceId windowedFilename(int shardNumber, int numShards, BoundedWindow window, PaneInfo paneInfo, FileBasedSink.OutputFileHints outputFileHints)
Description copied from class: FileBasedSink.FilenamePolicy
When a sink has requested windowed or triggered output, this method will be invoked to return the file resource to be created given the base output directory and a FileBasedSink.OutputFileHints containing information about the file, including a suggested extension (e.g. coming from Compression).
The policy must return unique and consistent filenames for different windows and panes.
- Specified by:
windowedFilename in class FileBasedSink.FilenamePolicy
-
populateDisplayData
Description copied from class: FileBasedSink.FilenamePolicy
Populates the display data.
- Overrides:
populateDisplayData in class FileBasedSink.FilenamePolicy
-