Class FileIO.MatchConfiguration
- All Implemented Interfaces:
Serializable, HasDisplayData
- Enclosing class:
FileIO
Describes configuration for matching filepatterns, such as EmptyMatchTreatment and continuous watching for matching files.
Constructor Summary
Constructors
- MatchConfiguration()
Method Summary
Modifier and Type, Method, and Description
- FileIO.MatchConfiguration continuously(Duration interval, Watch.Growth.TerminationCondition<String, ?> condition)
  Continuously watches for new files at the given interval until the given termination condition is reached, where the input to the condition is the filepattern.
- FileIO.MatchConfiguration continuously(Duration interval, Watch.Growth.TerminationCondition<String, ?> condition, boolean matchUpdatedFiles)
  Continuously watches for new files at the given interval until the given termination condition is reached, where the input to the condition is the filepattern.
- static FileIO.MatchConfiguration create(EmptyMatchTreatment emptyMatchTreatment)
  Creates a FileIO.MatchConfiguration with the given EmptyMatchTreatment.
- abstract EmptyMatchTreatment getEmptyMatchTreatment()
- abstract boolean getMatchUpdatedFiles()
- void populateDisplayData(DisplayData.Builder builder)
  Register display data for the given transform or component.
- withEmptyMatchTreatment(EmptyMatchTreatment treatment)
  Sets the EmptyMatchTreatment.
-
Constructor Details
-
MatchConfiguration
public MatchConfiguration()
-
-
Method Details
-
create
public static FileIO.MatchConfiguration create(EmptyMatchTreatment emptyMatchTreatment)
Creates a FileIO.MatchConfiguration with the given EmptyMatchTreatment.
-
getEmptyMatchTreatment
public abstract EmptyMatchTreatment getEmptyMatchTreatment()
-
getMatchUpdatedFiles
public abstract boolean getMatchUpdatedFiles()
-
getWatchInterval
-
withEmptyMatchTreatment
Sets the EmptyMatchTreatment.
-
continuously
public FileIO.MatchConfiguration continuously(Duration interval, Watch.Growth.TerminationCondition<String, ?> condition, boolean matchUpdatedFiles)
Continuously watches for new files at the given interval until the given termination condition is reached, where the input to the condition is the filepattern.
If matchUpdatedFiles is set, also watches for files whose timestamp has changed, at the frequency given by interval. The pipeline will throw a RuntimeError if timestamp extraction for a matched file fails, which suggests that the IO connector does not provide timestamp metadata.
Continuous matching scales poorly: it is stateful and requires storing file IDs in memory. In addition, because this state is memory-only, already processed files will be reprocessed if the pipeline is restarted. If possible, consider an alternative technique, such as Pub/Sub notifications when using GCS.
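The consequences of memory-only state can be illustrated with a small standalone model. The sketch below is not Beam's implementation: the class ContinuousMatchSketch, its poll method, and the file names are hypothetical, and a file listing is simply assumed to be a map from file ID to last-modified timestamp. It only shows why a restarted watcher re-emits files it had already seen, and how a matchUpdatedFiles-style flag changes what one polling round emits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Simplified, hypothetical model of memory-only continuous matching. Not Beam code. */
final class ContinuousMatchSketch {
  // File ID -> last timestamp emitted. Held only in memory, so it is lost on restart.
  private final Map<String, Long> seen = new HashMap<>();
  private final boolean matchUpdatedFiles;

  ContinuousMatchSketch(boolean matchUpdatedFiles) {
    this.matchUpdatedFiles = matchUpdatedFiles;
  }

  /** One polling round: returns the file IDs to emit for the given listing (ID -> timestamp). */
  List<String> poll(Map<String, Long> listing) {
    List<String> emitted = new ArrayList<>();
    for (Map.Entry<String, Long> file : listing.entrySet()) {
      Long previous = seen.get(file.getKey());
      boolean isNew = previous == null;
      // A known file is emitted again only if updates are watched and its timestamp changed.
      boolean isUpdated = !isNew && matchUpdatedFiles && !previous.equals(file.getValue());
      if (isNew || isUpdated) {
        emitted.add(file.getKey());
        seen.put(file.getKey(), file.getValue());
      }
    }
    return emitted;
  }

  public static void main(String[] args) {
    // TreeMap keeps the listing in a deterministic (sorted) order.
    Map<String, Long> listing = new TreeMap<>();
    listing.put("a.csv", 100L);
    listing.put("b.csv", 100L);

    ContinuousMatchSketch watcher = new ContinuousMatchSketch(true);
    System.out.println(watcher.poll(listing)); // [a.csv, b.csv]  (both files are new)
    System.out.println(watcher.poll(listing)); // []              (nothing changed)

    listing.put("b.csv", 200L);                // b.csv gets a newer timestamp
    System.out.println(watcher.poll(listing)); // [b.csv]         (updated file)

    // A "restart" begins with empty state, so every file is emitted again.
    ContinuousMatchSketch restarted = new ContinuousMatchSketch(true);
    System.out.println(restarted.poll(listing)); // [a.csv, b.csv]
  }
}
```

In this model the seen map grows with the number of matched files, which mirrors the scaling concern described above. An external notification mechanism avoids keeping that per-file state in the watcher.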
-
continuously
public FileIO.MatchConfiguration continuously(Duration interval, Watch.Growth.TerminationCondition<String, ?> condition)
Continuously watches for new files at the given interval until the given termination condition is reached, where the input to the condition is the filepattern. To also watch for updated files, set matchUpdatedFiles to true.
-
populateDisplayData
public void populateDisplayData(DisplayData.Builder builder)
Description copied from interface: HasDisplayData
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
- Specified by:
populateDisplayData in interface HasDisplayData
- Parameters:
builder - The builder to populate with display data.