Class TikaIO.ParseFiles
- All Implemented Interfaces:
Serializable, HasDisplayData
- Enclosing class:
TikaIO
Implementation of TikaIO.parseFiles().
Field Summary
Fields inherited from class org.apache.beam.sdk.transforms.PTransform
annotations, displayData, name, resourceHints
Constructor Summary
Constructors
ParseFiles()
Method Summary
PCollection<ParseResult> expand(PCollection<FileIO.ReadableFile> input): Override this method to specify how this PTransform should be expanded on the given InputT.
void populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component.
TikaIO.ParseFiles withContentTypeHint(String contentTypeHint): Sets a content type hint to make the file parser detection more efficient.
TikaIO.ParseFiles withInputMetadata(org.apache.tika.metadata.Metadata metadata): Sets the input metadata for Parser.parse(java.io.InputStream, org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata, org.apache.tika.parser.ParseContext).
TikaIO.ParseFiles withTikaConfigPath(String tikaConfigPath): Uses the given Tika Configuration XML file.
TikaIO.ParseFiles withTikaConfigPath(ValueProvider<String> tikaConfigPath): Like withTikaConfigPath(String), but takes the path as a ValueProvider<String>.
Methods inherited from class org.apache.beam.sdk.transforms.PTransform
addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, setDisplayData, setResourceHints, toString, validate, validate
Constructor Details
ParseFiles
public ParseFiles()
Method Details
withTikaConfigPath
public TikaIO.ParseFiles withTikaConfigPath(String tikaConfigPath)
Uses the given Tika Configuration XML file.
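withTikaConfigPath
public TikaIO.ParseFiles withTikaConfigPath(ValueProvider<String> tikaConfigPath)
Like withTikaConfigPath(String), but takes the path as a ValueProvider<String>.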
withContentTypeHint
public TikaIO.ParseFiles withContentTypeHint(String contentTypeHint)
Sets a content type hint to make the file parser detection more efficient. Overrides the content type hint in withInputMetadata(org.apache.tika.metadata.Metadata), if any.
withInputMetadata
public TikaIO.ParseFiles withInputMetadata(org.apache.tika.metadata.Metadata metadata)
Sets the input metadata for Parser.parse(java.io.InputStream, org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata, org.apache.tika.parser.ParseContext).
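The precedence between withContentTypeHint and withInputMetadata can be illustrated with a small plain-Java model. This is a sketch only, not Beam or Tika code: the class ParseOptionsModel, its methods, and the "Content-Type" key are hypothetical stand-ins, and the model assumes the hint simply replaces any content type already present in the input metadata, regardless of the order in which the two setters are called.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the two settings; not a Beam or Tika class. */
final class ParseOptionsModel {
    // Assumed metadata key, chosen only for this illustration.
    static final String CONTENT_TYPE = "Content-Type";

    private final Map<String, String> inputMetadata;
    private final String contentTypeHint; // null when no hint was set

    ParseOptionsModel() {
        this(new HashMap<>(), null);
    }

    private ParseOptionsModel(Map<String, String> inputMetadata, String contentTypeHint) {
        this.inputMetadata = inputMetadata;
        this.contentTypeHint = contentTypeHint;
    }

    /** Returns a copy of these options carrying the given metadata. */
    ParseOptionsModel withInputMetadata(Map<String, String> metadata) {
        return new ParseOptionsModel(new HashMap<>(metadata), contentTypeHint);
    }

    /** Returns a copy of these options carrying the given hint. */
    ParseOptionsModel withContentTypeHint(String hint) {
        return new ParseOptionsModel(inputMetadata, hint);
    }

    /** Metadata as a parser would see it: a hint, if set, replaces the metadata's content type. */
    Map<String, String> effectiveMetadata() {
        Map<String, String> result = new HashMap<>(inputMetadata);
        if (contentTypeHint != null) {
            result.put(CONTENT_TYPE, contentTypeHint);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put(CONTENT_TYPE, "text/plain");
        ParseOptionsModel options = new ParseOptionsModel()
            .withInputMetadata(metadata)
            .withContentTypeHint("application/pdf");
        System.out.println(options.effectiveMetadata().get(CONTENT_TYPE));
    }
}
```

In this model, metadata entries other than the content type pass through unchanged, and omitting the hint leaves the metadata's own content type in place.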
expand
public PCollection<ParseResult> expand(PCollection<FileIO.ReadableFile> input)
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded on the given InputT.
NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
- Specified by:
expand in class PTransform<PCollection<FileIO.ReadableFile>,PCollection<ParseResult>>
populateDisplayData
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
- Specified by:
populateDisplayData in interface HasDisplayData
- Overrides:
populateDisplayData in class PTransform<PCollection<FileIO.ReadableFile>,PCollection<ParseResult>>
- Parameters:
builder - The builder to populate with display data.