public abstract static class TikaIO.ParseFiles extends PTransform<PCollection<FileIO.ReadableFile>,PCollection<ParseResult>>
Implementation of TikaIO.parseFiles().
Fields inherited from class PTransform: annotations, displayData, name, resourceHints
Constructor and Description |
---|
ParseFiles() |
Modifier and Type | Method and Description |
---|---|
PCollection<ParseResult> | expand(PCollection<FileIO.ReadableFile> input): Override this method to specify how this PTransform should be expanded on the given InputT. |
void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
TikaIO.ParseFiles | withContentTypeHint(java.lang.String contentTypeHint): Sets a content type hint to make the file parser detection more efficient. |
TikaIO.ParseFiles | withInputMetadata(org.apache.tika.metadata.Metadata metadata): Sets the input metadata for Parser.parse(java.io.InputStream, org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata, org.apache.tika.parser.ParseContext). |
TikaIO.ParseFiles | withTikaConfigPath(java.lang.String tikaConfigPath): Uses the given Tika Configuration XML file. |
TikaIO.ParseFiles | withTikaConfigPath(ValueProvider<java.lang.String> tikaConfigPath): Like withTikaConfigPath(java.lang.String). |
Methods inherited from class PTransform: addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, setDisplayData, setResourceHints, toString, validate, validate
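TikaIO documents that a content type hint set with withContentTypeHint takes precedence over a content type supplied through withInputMetadata. The sketch below models only that precedence rule, using plain Java maps in place of Tika's Metadata class. The class name ContentTypeHintSketch and the helper effectiveMetadata are hypothetical and are not part of Beam or Tika; "Content-Type" is assumed to be the metadata key that carries the content type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified stand-in for the documented precedence rule; not Beam or Tika code. */
public class ContentTypeHintSketch {
    // Assumed metadata key for the content type.
    static final String CONTENT_TYPE = "Content-Type";

    /**
     * Returns a copy of the input metadata in which the hint, when present,
     * replaces any content type already set. The input map is not modified.
     */
    static Map<String, String> effectiveMetadata(
            Map<String, String> inputMetadata, String contentTypeHint) {
        Map<String, String> result = new LinkedHashMap<>();
        if (inputMetadata != null) {
            result.putAll(inputMetadata);
        }
        if (contentTypeHint != null) {
            result.put(CONTENT_TYPE, contentTypeHint);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put(CONTENT_TYPE, "text/plain");
        input.put("Author", "example");

        // The hint wins over the content type carried by the input metadata.
        System.out.println(effectiveMetadata(input, "application/pdf"));
        // Without a hint, the input metadata is used unchanged.
        System.out.println(effectiveMetadata(input, null));
    }
}
```

In a real pipeline the same effect would come from chaining withInputMetadata and withContentTypeHint on the transform returned by TikaIO.parseFiles(); the map-based helper here only makes the ordering of the two settings explicit.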
public TikaIO.ParseFiles withTikaConfigPath(java.lang.String tikaConfigPath)
Uses the given Tika Configuration XML file.
public TikaIO.ParseFiles withTikaConfigPath(ValueProvider<java.lang.String> tikaConfigPath)
Like withTikaConfigPath(java.lang.String).
public TikaIO.ParseFiles withContentTypeHint(java.lang.String contentTypeHint)
Sets a content type hint to make the file parser detection more efficient. Overrides the content type hint in withInputMetadata(org.apache.tika.metadata.Metadata), if any.
public TikaIO.ParseFiles withInputMetadata(org.apache.tika.metadata.Metadata metadata)
Sets the input metadata for Parser.parse(java.io.InputStream, org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata, org.apache.tika.parser.ParseContext).
public PCollection<ParseResult> expand(PCollection<FileIO.ReadableFile> input)
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded on the given InputT.
NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
Specified by: expand in class PTransform<PCollection<FileIO.ReadableFile>,PCollection<ParseResult>>
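The relationship between expand and apply can be sketched with a toy model. Everything below is a hypothetical stand-in written for illustration: Transform, Values, MapEach, and TrimThenLength are not Beam classes, and the model ignores runners, coders, and deferred execution. It only shows that callers go through apply, and that a composite transform's expand returns the output of the transforms it composes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Toy model of the expand/apply contract; none of these types come from Beam. */
public class ExpandSketch {
    /** Hypothetical transform: subclasses describe how they expand on an input. */
    abstract static class Transform<InT, OutT> {
        abstract OutT expand(InT input);
    }

    /** Hypothetical collection type that transforms are applied to. */
    static final class Values<T> {
        final List<T> items;

        Values(List<T> items) {
            this.items = items;
        }

        // Callers use apply; in this model apply is the only caller of expand.
        <OutT> OutT apply(Transform<Values<T>, OutT> transform) {
            return transform.expand(this);
        }
    }

    /** Leaf transform in this toy model: maps every element with a function. */
    static final class MapEach<A, B> extends Transform<Values<A>, Values<B>> {
        final Function<A, B> fn;

        MapEach(Function<A, B> fn) {
            this.fn = fn;
        }

        @Override
        Values<B> expand(Values<A> input) {
            List<B> out = new ArrayList<>();
            for (A a : input.items) {
                out.add(fn.apply(a));
            }
            return new Values<>(out);
        }
    }

    /** Composite transform: expand returns the output of the last composed transform. */
    static final class TrimThenLength extends Transform<Values<String>, Values<Integer>> {
        @Override
        Values<Integer> expand(Values<String> input) {
            return input
                .apply(new MapEach<String, String>(String::trim))
                .apply(new MapEach<String, Integer>(String::length));
        }
    }

    public static void main(String[] args) {
        Values<String> words = new Values<>(Arrays.asList(" a ", "bcd "));
        Values<Integer> lengths = words.apply(new TrimThenLength());
        System.out.println(lengths.items); // prints [1, 3]
    }
}
```

With the real API, the corresponding call would be an apply of TikaIO.parseFiles() on a PCollection<FileIO.ReadableFile>, rather than an invocation of expand.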
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Overrides: populateDisplayData in class PTransform<PCollection<FileIO.ReadableFile>,PCollection<ParseResult>>
Parameters: builder - The builder to populate with display data.
See Also: HasDisplayData