Class FileReadSchemaTransformConfiguration
java.lang.Object
org.apache.beam.sdk.io.fileschematransform.FileReadSchemaTransformConfiguration
@DefaultSchema(AutoValueSchema.class)
public abstract class FileReadSchemaTransformConfiguration
extends Object
-
Nested Class Summary
Nested Classes -
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionbuilder()
abstract String
The filepattern used to match and read files.abstract String
The format of the file(s) to read.abstract Long
The time, in milliseconds, to wait before polling for new files.abstract String
The schema used by sources to deserialize data and create Beam Rows.abstract Long
If no new files are found after this many seconds, this transform will cease to watch for new files.
-
Field Details
-
VALID_PROVIDERS
-
-
Constructor Details
-
FileReadSchemaTransformConfiguration
public FileReadSchemaTransformConfiguration()
-
-
Method Details
-
builder
-
getFormat
@SchemaFieldDescription("The format of the file(s) to read. Possible values are \"lines\", \"avro\", \"parquet\", \"json\".") public abstract String getFormat()The format of the file(s) to read.Possible values are: `"lines"`, `"avro"`, `"parquet"`, `"json"`
-
getFilepattern
@SchemaFieldDescription("The filepattern used to match and read files. May instead use an input PCollection<Row> of filepatterns. To do so, each Row must have a \"filepattern\" String field containing the filepattern.") @Nullable public abstract String getFilepattern()The filepattern used to match and read files.May instead use an input PCollection
of filepatterns. To do so, each Row must have a "filepattern" String field containing the filepattern.
-
getSafeFilepattern
-
getSchema
@SchemaFieldDescription("The schema used by sources to deserialize data and create Beam Rows. May provide either a String representation of the schema or a single path to a file that contains the schema.") @Nullable public abstract String getSchema()The schema used by sources to deserialize data and create Beam Rows.May provide either a String representation of the schema or a single path to a file that contains the schema.
-
getSafeSchema
-
getPollIntervalMillis
@SchemaFieldDescription("The time, in milliseconds, to wait before polling for new files. This will set the pipeline to be a streaming pipeline that continuously watches for new files.Note: This only polls for new files. New updates to an existing file will not be watched for.") @Nullable public abstract Long getPollIntervalMillis()The time, in milliseconds, to wait before polling for new files.This will set the pipeline to be a streaming pipeline that continuously watches for new files.
Note: This only polls for new files. New updates to an existing file will not be watched for.
-
getTerminateAfterSecondsSinceNewOutput
@SchemaFieldDescription("If no new files are found after this many seconds, this transform will cease to watch for new files. The default is to never terminate. To set this parameter, a poll interval must also be provided.") @Nullable public abstract Long getTerminateAfterSecondsSinceNewOutput()If no new files are found after this many seconds, this transform will cease to watch for new files.The default is to never terminate. To set this parameter, a poll interval must also be provided.
-