Class AvroIO.Read<T>
- All Implemented Interfaces: Serializable, HasDisplayData
- Enclosing class: AvroIO
Implementation of AvroIO.read(java.lang.Class<T>) and AvroIO.readGenericRecords(org.apache.avro.Schema).
-
Field Summary
Fields inherited from class org.apache.beam.sdk.transforms.PTransform
annotations, displayData, name, resourceHints
-
Constructor Summary
Constructors -
Method Summary
- expand: Override this method to specify how this PTransform should be expanded on the given InputT.
- from(String): Like from(ValueProvider).
- from(ValueProvider<String> filepattern): Reads from the given filename or filepattern.
- void populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component.
- watchForNewFiles(Duration pollInterval, Watch.Growth.TerminationCondition<String, ?> terminationCondition): Same as watchForNewFiles(Duration, TerminationCondition, boolean) with matchUpdatedFiles=false.
- watchForNewFiles(Duration pollInterval, Watch.Growth.TerminationCondition<String, ?> terminationCondition, boolean matchUpdatedFiles): Continuously watches for new files matching the filepattern, polling it at the given interval, until the given termination condition is reached.
- withBeamSchemas(boolean withBeamSchemas): If set to true, a Beam schema will be inferred from the Avro schema.
- withCoder: Sets a coder for the result of the read function.
- withDatumReaderFactory(AvroSource.DatumReaderFactory<T> readerFactory): Sets a custom AvroSource.DatumReaderFactory for reading.
- withEmptyMatchTreatment(EmptyMatchTreatment treatment): Configures whether or not a filepattern matching no files is allowed.
- withHintMatchesManyFiles(): Hints that the filepattern specified in from(String) matches a very large number of files.
- withMatchConfiguration(FileIO.MatchConfiguration matchConfiguration): Sets the FileIO.MatchConfiguration.
Methods inherited from class org.apache.beam.sdk.transforms.PTransform
addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, setDisplayData, setResourceHints, toString, validate, validate
-
Constructor Details
-
Read
public Read()
-
-
Method Details
-
from
Reads from the given filename or filepattern.
If it is known that the filepattern will match a very large number of files (at least tens of thousands), use withHintMatchesManyFiles() for better performance and scalability.
-
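As a rough illustration of what a filepattern passed to from selects, the sketch below uses the JDK's own glob matcher as a stand-in. Beam resolves filepatterns through its own filesystem layer, and its matching rules may differ in detail, so treat this as an approximation only. The class name FilepatternSketch, the helper countMatches, and the sample file names are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class FilepatternSketch {
    // Counts how many candidate names a glob-style pattern selects.
    // Uses java.nio glob rules, which only approximate a Beam filepattern.
    public static long countMatches(String glob, List<String> names) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return names.stream()
                .filter(name -> matcher.matches(Paths.get(name)))
                .count();
    }

    public static void main(String[] args) {
        // Hypothetical file names, for illustration only.
        List<String> names = Arrays.asList("part-00000.avro", "part-00001.avro", "notes.txt");
        System.out.println(countMatches("part-*.avro", names));
    }
}
```

A pattern whose match count reaches the tens of thousands is the case that the withHintMatchesManyFiles() advice is aimed at.
-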
from
Like from(ValueProvider).
-
withMatchConfiguration
Sets the FileIO.MatchConfiguration.
-
withEmptyMatchTreatment
Configures whether a filepattern that matches no files is allowed.
-
watchForNewFiles
public AvroIO.Read<T> watchForNewFiles(Duration pollInterval, Watch.Growth.TerminationCondition<String, ?> terminationCondition, boolean matchUpdatedFiles)
Continuously watches for new files matching the filepattern, polling it at the given interval, until the given termination condition is reached. The returned PCollection is unbounded. If matchUpdatedFiles is set, also watches for files whose timestamp changes.
This works only in runners supporting splittable DoFn.
-
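The polling behaviour described for watchForNewFiles(Duration, TerminationCondition, boolean) can be modelled in plain Java. This is not Beam's implementation, only a sketch of the semantics as stated: each poll emits files not seen before, and with matchUpdatedFiles set it also re-emits files whose timestamp changed. The class WatchSketch, its methods, and the "stop after N quiet rounds" rule (a stand-in for a termination condition) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WatchSketch {
    // Last-seen modification timestamp per file name.
    private final Map<String, Long> seen = new HashMap<>();
    private final boolean matchUpdatedFiles;

    public WatchSketch(boolean matchUpdatedFiles) {
        this.matchUpdatedFiles = matchUpdatedFiles;
    }

    // One polling round: takes the current listing (name -> last-modified millis)
    // and returns the names to emit in this round, in listing order.
    public List<String> poll(Map<String, Long> listing) {
        List<String> emitted = new ArrayList<>();
        for (Map.Entry<String, Long> entry : listing.entrySet()) {
            Long previous = seen.get(entry.getKey());
            boolean isNew = previous == null;
            boolean isUpdated = !isNew && !previous.equals(entry.getValue());
            if (isNew || (matchUpdatedFiles && isUpdated)) {
                emitted.add(entry.getKey());
            }
            seen.put(entry.getKey(), entry.getValue());
        }
        return emitted;
    }

    // Polls successive listings and stops once quietRoundsToStop rounds in a row
    // emit nothing (a simplified, hypothetical termination condition).
    public static List<String> watch(
            List<Map<String, Long>> listings, boolean matchUpdatedFiles, int quietRoundsToStop) {
        WatchSketch watcher = new WatchSketch(matchUpdatedFiles);
        List<String> all = new ArrayList<>();
        int quiet = 0;
        for (Map<String, Long> listing : listings) {
            List<String> emitted = watcher.poll(listing);
            all.addAll(emitted);
            quiet = emitted.isEmpty() ? quiet + 1 : 0;
            if (quiet >= quietRoundsToStop) {
                break;
            }
        }
        return all;
    }

    public static void main(String[] args) {
        Map<String, Long> first = new LinkedHashMap<>();
        first.put("a.avro", 100L);
        Map<String, Long> second = new LinkedHashMap<>(first);
        second.put("a.avro", 200L); // same file, newer timestamp
        second.put("b.avro", 150L); // new file

        // prints [a.avro, b.avro]
        System.out.println(watch(Arrays.asList(first, second), false, 1));
        // prints [a.avro, a.avro, b.avro]
        System.out.println(watch(Arrays.asList(first, second), true, 1));
    }
}
```

In the real transform the listing comes from matching the filepattern at each poll interval and the output is an unbounded PCollection; the sketch only shows which files each round would contribute.
-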
watchForNewFiles
public AvroIO.Read<T> watchForNewFiles(Duration pollInterval, Watch.Growth.TerminationCondition<String, ?> terminationCondition)
Same as watchForNewFiles(Duration, TerminationCondition, boolean) with matchUpdatedFiles=false.
-
withHintMatchesManyFiles
Hints that the filepattern specified in from(String) matches a very large number of files.
This hint may cause a runner to execute the transform differently, in a way that improves performance for this case, but it may worsen performance if the filepattern matches only a small number of files (e.g., in a runner that supports dynamic work rebalancing, rebalancing will happen less efficiently within individual files).
-
withBeamSchemas
If set to true, a Beam schema will be inferred from the Avro schema. This allows the output to be used by SQL and by the schema-transform library.
-
withCoder
Sets a coder for the result of the read function.
-
withDatumReaderFactory
Sets a custom AvroSource.DatumReaderFactory for reading. Pass an AvroDatumFactory to also use the factory for the default output AvroCoder.
-
expand
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded on the given InputT.
NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
- Specified by: expand in class PTransform<PBegin, PCollection<T>>
-
populateDisplayData
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
- Specified by: populateDisplayData in interface HasDisplayData
- Overrides: populateDisplayData in class PTransform<PBegin, PCollection<T>>
- Parameters: builder - The builder to populate with display data.