Class FileIO.MatchConfiguration
- All Implemented Interfaces:
Serializable, HasDisplayData
- Enclosing class:
FileIO
Describes configuration for matching filepatterns, such as EmptyMatchTreatment and continuous watching for matching files.
Constructor Summary
Constructors
MatchConfiguration()
Method Summary
Modifier and Type | Method | Description
FileIO.MatchConfiguration | continuously(Duration interval, Watch.Growth.TerminationCondition<String, ?> condition) | Continuously watches for new files at the given interval until the given termination condition is reached, where the input to the condition is the filepattern.
FileIO.MatchConfiguration | continuously(Duration interval, Watch.Growth.TerminationCondition<String, ?> condition, boolean matchUpdatedFiles) | Continuously watches for new files at the given interval until the given termination condition is reached, where the input to the condition is the filepattern.
static FileIO.MatchConfiguration | create(EmptyMatchTreatment emptyMatchTreatment) | Creates a FileIO.MatchConfiguration with the given EmptyMatchTreatment.
abstract EmptyMatchTreatment | getEmptyMatchTreatment() |
abstract boolean | getMatchUpdatedFiles() |
void | populateDisplayData(DisplayData.Builder builder) | Register display data for the given transform or component.
FileIO.MatchConfiguration | withEmptyMatchTreatment(EmptyMatchTreatment treatment) | Sets the EmptyMatchTreatment.
-
Constructor Details
-
MatchConfiguration
public MatchConfiguration()
-
-
Method Details
-
create
public static FileIO.MatchConfiguration create(EmptyMatchTreatment emptyMatchTreatment)
Creates a FileIO.MatchConfiguration with the given EmptyMatchTreatment.
getEmptyMatchTreatment
public abstract EmptyMatchTreatment getEmptyMatchTreatment()
-
getMatchUpdatedFiles
public abstract boolean getMatchUpdatedFiles() -
getWatchInterval
-
withEmptyMatchTreatment
public FileIO.MatchConfiguration withEmptyMatchTreatment(EmptyMatchTreatment treatment)
Sets the EmptyMatchTreatment.
continuously
public FileIO.MatchConfiguration continuously(Duration interval, Watch.Growth.TerminationCondition<String, ?> condition, boolean matchUpdatedFiles)
Continuously watches for new files at the given interval until the given termination condition is reached, where the input to the condition is the filepattern.
If matchUpdatedFiles is set, also watches for files whose timestamp changes, at the watching frequency given by the interval. The pipeline will throw a RuntimeError if timestamp extraction for a matched file fails, which suggests that timestamp metadata is not available with the IO connector.
Continuous matching scales poorly: it is stateful and requires storing file ids in memory. Because that state is memory-only, a restarted pipeline will reprocess files it has already processed. If possible, consider an alternative technique, such as Pub/Sub Notifications when using GCS.
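The state and restart caveats above can be illustrated with a small, self-contained sketch. It is not Beam's implementation and uses no Beam classes: ContinuousMatchSketch, poll, and trackedFiles are hypothetical names, and each polling round is assumed to receive a plain map from file id to last-modified time (null when the timestamp is unavailable).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of stateful continuous matching; NOT Beam's implementation.
final class ContinuousMatchSketch {
    // In-memory state only: file id -> last-modified time recorded when the file was emitted.
    private final Map<String, Long> seen = new HashMap<>();
    private final boolean matchUpdatedFiles;

    ContinuousMatchSketch(boolean matchUpdatedFiles) {
        this.matchUpdatedFiles = matchUpdatedFiles;
    }

    // One polling round. Returns the ids that are new, or (if enabled) updated, since earlier rounds.
    List<String> poll(Map<String, Long> listing) {
        List<String> emitted = new ArrayList<>();
        for (Map.Entry<String, Long> entry : listing.entrySet()) {
            String id = entry.getKey();
            Long modified = entry.getValue();
            if (matchUpdatedFiles && modified == null) {
                // Assumption: models the documented failure when no timestamp can be extracted.
                throw new RuntimeException("No last-modified timestamp for " + id);
            }
            boolean isNew = !seen.containsKey(id);
            boolean isUpdated = matchUpdatedFiles && !isNew && !modified.equals(seen.get(id));
            if (isNew || isUpdated) {
                seen.put(id, modified);
                emitted.add(id);
            }
        }
        return emitted;
    }

    // The state grows with the number of distinct files ever matched.
    int trackedFiles() {
        return seen.size();
    }
}
```

In this sketch, polling the same listing twice emits each file only once, and changing a file's timestamp causes it to be emitted again only when matchUpdatedFiles is true. Constructing a fresh ContinuousMatchSketch, as a restart would, emits every file again because the map of seen ids is gone.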
-
continuously
public FileIO.MatchConfiguration continuously(Duration interval, Watch.Growth.TerminationCondition<String, ?> condition)
Continuously watches for new files at the given interval until the given termination condition is reached, where the input to the condition is the filepattern. To also watch for updated files, set matchUpdatedFiles to true.
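How an interval and a termination condition interact can be sketched with a simplified model. The following is an illustration under stated assumptions, not the Watch.Growth API: IdleTerminationSketch and its methods are hypothetical, it uses java.time rather than the Duration type in the signature above, and it ignores the filepattern input that the real condition receives. The condition shown stops polling once no new file has appeared for a given idle period.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical, simplified termination condition; NOT Beam's Watch.Growth API.
final class IdleTerminationSketch {
    private final Duration maxIdle;
    private Instant lastNewOutput;

    IdleTerminationSketch(Duration maxIdle, Instant start) {
        this.maxIdle = maxIdle;
        this.lastNewOutput = start;
    }

    // Called whenever a polling round finds at least one new file.
    void onSeenNewOutput(Instant now) {
        lastNewOutput = now;
    }

    // True once no new file has been seen for at least maxIdle.
    boolean canStopPolling(Instant now) {
        return Duration.between(lastNewOutput, now).compareTo(maxIdle) >= 0;
    }

    // Simulates polling at a fixed interval on a virtual clock.
    // newFileAtPoll[i] says whether poll i finds a new file; later polls find none.
    static int pollsUntilStop(Duration interval, Duration maxIdle, boolean[] newFileAtPoll) {
        if (interval.isZero() || interval.isNegative()) {
            throw new IllegalArgumentException("interval must be positive");
        }
        Instant now = Instant.EPOCH;
        IdleTerminationSketch condition = new IdleTerminationSketch(maxIdle, now);
        int polls = 0;
        while (true) {
            boolean foundNew = polls < newFileAtPoll.length && newFileAtPoll[polls];
            polls++;
            if (foundNew) {
                condition.onSeenNewOutput(now);
            }
            if (condition.canStopPolling(now)) {
                return polls;
            }
            now = now.plus(interval); // wait one interval before the next poll
        }
    }
}
```

With a 10-second interval, a 30-second idle limit, and new files found only on the first two polls, the simulated watcher performs five polls before the condition allows it to stop. A shorter interval detects files sooner at the cost of more listing calls.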
populateDisplayData
public void populateDisplayData(DisplayData.Builder builder)
Description copied from interface: HasDisplayData
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
- Specified by:
populateDisplayData in interface HasDisplayData
- Parameters:
builder - The builder to populate with display data.