Class FileSystems
FileSystem utility.

Field Summary
DEFAULT_SCHEME

Constructor Summary
FileSystems()

Method Summary
static void copy(List<ResourceId> srcResourceIds, List<ResourceId> destResourceIds, MoveOptions... moveOptions)
    Copies a List of file-like resources from one location to another.
static WritableByteChannel create(ResourceId resourceId, String mimeType)
    Returns a write channel for the given ResourceId.
static WritableByteChannel create(ResourceId resourceId, CreateOptions createOptions)
    Returns a write channel for the given ResourceId with CreateOptions.
static void delete(Collection<ResourceId> resourceIds, MoveOptions... moveOptions)
    Deletes a collection of resources.
static boolean hasGlobWildcard(String spec)
    Checks whether the given spec contains a glob wildcard character.
static MatchResult match(String spec)
    Like match(List), but for a single resource specification.
static MatchResult match(String spec, EmptyMatchTreatment emptyMatchTreatment)
    Like match(String), but with a configurable EmptyMatchTreatment.
static List<MatchResult> match(List<String> specs)
    This is the entry point to convert user-provided specs to ResourceIds.
static List<MatchResult> match(List<String> specs, EmptyMatchTreatment emptyMatchTreatment)
    Like match(List), but with a configurable EmptyMatchTreatment.
static ResourceId matchNewDirectory(String singleResourceSpec, String... baseNames)
    Returns a new ResourceId that represents the named directory resource.
static ResourceId matchNewResource(String singleResourceSpec, boolean isDirectory)
    Returns a new ResourceId that represents the named resource of a type corresponding to the resource type.
static List<MatchResult> matchResources(List<ResourceId> resourceIds)
    Returns MatchResults for the given resourceIds.
static MatchResult.Metadata matchSingleFileSpec(String spec)
    Returns the MatchResult.Metadata for a single file resource.
static ReadableByteChannel open(ResourceId resourceId)
    Returns a read channel for the given ResourceId.
static void registerFileSystemsOnce(PipelineOptions options)
    Register file systems once if never done before.
static void rename(List<ResourceId> srcResourceIds, List<ResourceId> destResourceIds, MoveOptions... moveOptions)
    Renames a List of file-like resources from one location to another.
static void reportSinkLineage(ResourceId resourceId)
    Report sink Lineage metrics for resource id.
static void reportSinkLineage(ResourceId resourceId, FileSystem.LineageLevel level)
    Report sink Lineage metrics for resource id at given level.
static void reportSourceLineage(ResourceId resourceId)
    Report source Lineage metrics for resource id.
static void reportSourceLineage(ResourceId resourceId, FileSystem.LineageLevel level)
    Report source Lineage metrics for resource id at given level.
static void setDefaultPipelineOptions(PipelineOptions options)
    Sets the default configuration in workers.

Field Details
DEFAULT_SCHEME

Constructor Details
FileSystems
public FileSystems()

Method Details

hasGlobWildcard
Checks whether the given spec contains a glob wildcard character.
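
The check described above can be sketched with the JDK alone. This is an illustration, not Beam's implementation: this page does not list the wildcard characters, so the set used here (`*`, `?`, `{`, `}`, `[`) is an assumption, and the class name GlobWildcardSketch is hypothetical.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only; not Beam's implementation. */
final class GlobWildcardSketch {
  // Assumed wildcard characters: '*', '?', '{', '}' and '['.
  private static final Pattern GLOB_CHARS = Pattern.compile("[*?{}\\[]");

  /** Returns true if the spec contains any of the assumed wildcard characters. */
  static boolean hasGlobWildcard(String spec) {
    return GLOB_CHARS.matcher(spec).find();
  }

  public static void main(String[] args) {
    System.out.println(hasGlobWildcard("/tmp/logs/*.txt"));   // prints true
    System.out.println(hasGlobWildcard("/tmp/logs/app.txt")); // prints false
  }
}
```

A caller could use such a check to decide whether a spec needs glob expansion or can be treated as a literal resource name.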

match
This is the entry point to convert user-provided specs to ResourceIds. Callers should use match(java.util.List<java.lang.String>) to resolve ambiguities in user specs before calling other methods.

The implementation handles the following ambiguities of a user-provided spec:
- spec could be a glob or a URI. match(java.util.List<java.lang.String>) should be able to tell which it is and choose an efficient implementation.
- The user-provided spec might refer to files or directories. Users who wish to indicate a directory commonly omit the trailing path delimiter, as in "/tmp/dir" on Linux. The FileSystem should be able to recognize a directory with the trailing path delimiter omitted, but should always return a correct ResourceId (e.g., "/tmp/dir/") inside the returned MatchResult.

All FileSystem implementations should support glob in the final hierarchical path component of ResourceId. This allows SDK libraries to construct file system agnostic specs. FileSystems can support additional patterns for user-provided specs.

In case the spec schemes don't match any known FileSystem implementations, FileSystems will attempt to use LocalFileSystem to resolve a path.

Specs that do not match any resources are treated according to EmptyMatchTreatment.DISALLOW.
Returns:
List<MatchResult> in the same order as the input specs.
Throws:
IllegalArgumentException - if specs are invalid (empty or have different schemes).
IOException - if all specs failed to match due to issues like: network connection, authorization. The exception for an individual spec is deferred until callers retrieve metadata with MatchResult.metadata().
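
Three behaviours from this description lend themselves to a small sketch: results come back in input order, a directory given without its trailing delimiter is returned in normalized form, and a per-spec failure is held back until the caller asks for metadata. The code below is a JDK-only illustration on the local file system. It is not Beam's implementation; DeferredMatchSketch, Result, and matchAll are hypothetical names, and "/" is assumed as the path delimiter.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; Result is a local stand-in, not Beam's MatchResult. */
final class DeferredMatchSketch {
  /** Holds either a normalized resource name or the failure recorded for one spec. */
  static final class Result {
    private final String resource;      // null when the match failed
    private final IOException failure;  // null when the match succeeded

    private Result(String resource, IOException failure) {
      this.resource = resource;
      this.failure = failure;
    }

    /** Rethrows the stored failure only when the caller asks for the metadata. */
    String metadata() throws IOException {
      if (failure != null) {
        throw failure;
      }
      return resource;
    }
  }

  /** Matches each spec on the local file system and returns results in input order. */
  static List<Result> matchAll(List<String> specs) {
    if (specs.isEmpty()) {
      throw new IllegalArgumentException("specs must not be empty");
    }
    List<Result> results = new ArrayList<>();
    for (String spec : specs) {
      Path path = Paths.get(spec);
      if (Files.isDirectory(path)) {
        // Directory given without the trailing delimiter: return the normalized form.
        results.add(new Result(spec.endsWith("/") ? spec : spec + "/", null));
      } else if (Files.exists(path)) {
        results.add(new Result(spec, null));
      } else {
        // Do not throw here; the failure surfaces later from metadata().
        results.add(new Result(null, new FileNotFoundException("No match for " + spec)));
      }
    }
    return results;
  }
}
```

Deferring the exception lets a caller that passes many specs inspect the successful results even when some specs fail.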

match
public static List<MatchResult> match(List<String> specs, EmptyMatchTreatment emptyMatchTreatment) throws IOException
Like match(List), but with a configurable EmptyMatchTreatment.
Throws:
IOException
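
How a configurable empty-match policy might be applied can be sketched as follows. Only DISALLOW is named on this page; the ALLOW and ALLOW_IF_WILDCARD values, the wildcard character set, and the names EmptyMatchSketch, Treatment, and checkEmpty are assumptions made for illustration, and the enum is a local stand-in rather than Beam's class.

```java
import java.io.FileNotFoundException;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative sketch only; Treatment is a local stand-in, not Beam's enum. */
final class EmptyMatchSketch {
  /** DISALLOW is named on this page; ALLOW and ALLOW_IF_WILDCARD are assumed. */
  enum Treatment { ALLOW, DISALLOW, ALLOW_IF_WILDCARD }

  // Assumed wildcard characters, used only for the ALLOW_IF_WILDCARD case.
  private static final Pattern GLOB_CHARS = Pattern.compile("[*?{}\\[]");

  /** Returns the matches unchanged, or throws if an empty result is not permitted. */
  static <T> List<T> checkEmpty(String spec, List<T> matches, Treatment treatment)
      throws FileNotFoundException {
    if (!matches.isEmpty()) {
      return matches;
    }
    boolean hasWildcard = GLOB_CHARS.matcher(spec).find();
    boolean allowed =
        treatment == Treatment.ALLOW
            || (treatment == Treatment.ALLOW_IF_WILDCARD && hasWildcard);
    if (!allowed) {
      throw new FileNotFoundException("No resources matched spec: " + spec);
    }
    return matches;
  }
}
```

Under these assumptions, an empty result for a literal path is an error unless the caller explicitly allows it, while an empty result for a glob can be tolerated.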

match
Like match(List), but for a single resource specification.

The function match(List) is preferred when matching multiple patterns, as it allows for bulk API calls to remote filesystems.
Throws:
IOException

match
public static MatchResult match(String spec, EmptyMatchTreatment emptyMatchTreatment) throws IOException
Like match(String), but with a configurable EmptyMatchTreatment.
Throws:
IOException

matchSingleFileSpec
Returns the MatchResult.Metadata for a single file resource. Expects a resource specification spec that matches a single result.
Parameters:
spec - a resource specification that matches exactly one result.
Returns:
the MatchResult.Metadata for the specified resource.
Throws:
FileNotFoundException - if the file resource is not found.
IOException - in the event of an error in the inner call to match(java.util.List<java.lang.String>), or if the given spec does not match exactly 1 result.
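
The "exactly one result" contract above maps onto a simple check over a list of matches. The following is a sketch under that reading, not Beam's code; SingleMatchSketch and exactlyOne are hypothetical names, and the generic list stands in for whatever metadata type a real match returns.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.List;

/** Illustrative sketch only; not Beam's implementation. */
final class SingleMatchSketch {
  /** Returns the sole match, or throws as the contract above describes. */
  static <T> T exactlyOne(String spec, List<T> matches) throws IOException {
    if (matches.isEmpty()) {
      // Zero results: the resource was not found.
      throw new FileNotFoundException("No resource found for spec: " + spec);
    }
    if (matches.size() > 1) {
      // More than one result: the spec was not specific enough.
      throw new IOException(
          "Expected exactly 1 result for spec " + spec + " but found " + matches.size());
    }
    return matches.get(0);
  }
}
```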

matchResources
Returns MatchResults for the given resourceIds.
Parameters:
resourceIds - resourceIds that might be derived from match(java.util.List<java.lang.String>), ResourceId.resolve(java.lang.String, org.apache.beam.sdk.io.fs.ResolveOptions), or ResourceId.getCurrentDirectory().
Throws:
IOException - if all resourceIds failed to match due to issues like: network connection, authorization. The exception for an individual ResourceId is deferred until callers retrieve metadata with MatchResult.metadata().

create
Returns a write channel for the given ResourceId.

The resource is not expanded; it is used verbatim.
Parameters:
resourceId - the reference of the file-like resource to create
mimeType - the MIME type of the file-like resource to create
Throws:
IOException

create
public static WritableByteChannel create(ResourceId resourceId, CreateOptions createOptions) throws IOException
Returns a write channel for the given ResourceId with CreateOptions.

The resource is not expanded; it is used verbatim.
Parameters:
resourceId - the reference of the file-like resource to create
createOptions - the configuration of the create operation
Throws:
IOException

open
Returns a read channel for the given ResourceId.

The resource is not expanded; it is used verbatim.

If seeking is supported, then this returns a SeekableByteChannel.
Parameters:
resourceId - the reference of the file-like resource to open
Throws:
IOException
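
Because create and open hand back plain java.nio channels, caller code looks the same as for any other channel. The sketch below uses local files through java.nio.file.Files in place of FileSystems.create and FileSystems.open, so that it runs with the JDK alone. It shows one way a caller might take advantage of a SeekableByteChannel when one is returned and fall back to reading and discarding bytes otherwise. ChannelSketch and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative sketch only; local files stand in for Beam resources. */
final class ChannelSketch {
  /** Writes text through a WritableByteChannel, as a caller of create(...) might. */
  static void write(Path file, String text) throws IOException {
    try (WritableByteChannel out =
        Files.newByteChannel(
            file,
            StandardOpenOption.CREATE,
            StandardOpenOption.WRITE,
            StandardOpenOption.TRUNCATE_EXISTING)) {
      ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
      while (buf.hasRemaining()) {
        out.write(buf); // a single write may be partial, so loop until drained
      }
    }
  }

  /** Reads from offset to the end; seeks when possible, otherwise skips by reading. */
  static String readFrom(ReadableByteChannel in, long offset) throws IOException {
    if (in instanceof SeekableByteChannel) {
      ((SeekableByteChannel) in).position(offset);
    } else {
      ByteBuffer skip = ByteBuffer.allocate(1);
      long skipped = 0;
      while (skipped < offset) {
        skip.clear();
        int n = in.read(skip);
        if (n < 0) {
          break; // reached the end before the offset
        }
        skipped += n;
      }
    }
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ByteBuffer buf = ByteBuffer.allocate(4096);
    while (in.read(buf) != -1) {
      buf.flip();
      bytes.write(buf.array(), 0, buf.limit());
      buf.clear();
    }
    return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
  }
}
```

The instanceof check is the only part that depends on the seeking note above; everything else works against the ReadableByteChannel interface.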

copy
public static void copy(List<ResourceId> srcResourceIds, List<ResourceId> destResourceIds, MoveOptions... moveOptions) throws IOException
Copies a List of file-like resources from one location to another.

The number of source resources must equal the number of destination resources. Destination resources will be created recursively.

srcResourceIds and destResourceIds must have the same scheme.

Copying globs is not supported.
Parameters:
srcResourceIds - the references of the source resources
destResourceIds - the references of the destination resources
Throws:
IOException
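
The preconditions above (equal list sizes, a single scheme) and the pairwise copy can be sketched on local paths with the JDK. This is not Beam's implementation: CopySketch and copyAll are hypothetical names, java.nio.file.Path stands in for ResourceId, and "created recursively" is read here as "missing parent directories are created", which is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

/** Illustrative sketch only; Path stands in for Beam's ResourceId. */
final class CopySketch {
  /** Copies sources.get(i) to destinations.get(i) after validating both lists. */
  static void copyAll(List<Path> sources, List<Path> destinations) throws IOException {
    if (sources.size() != destinations.size()) {
      throw new IllegalArgumentException(
          "Expected " + sources.size() + " destinations but got " + destinations.size());
    }
    if (sources.isEmpty()) {
      return;
    }
    // Validate everything up front so no copy starts if the input is inconsistent.
    String scheme = sources.get(0).getFileSystem().provider().getScheme();
    for (int i = 0; i < sources.size(); i++) {
      String srcScheme = sources.get(i).getFileSystem().provider().getScheme();
      String destScheme = destinations.get(i).getFileSystem().provider().getScheme();
      if (!scheme.equals(srcScheme) || !scheme.equals(destScheme)) {
        throw new IllegalArgumentException("All resources must share the scheme " + scheme);
      }
    }
    for (int i = 0; i < sources.size(); i++) {
      Path parent = destinations.get(i).getParent();
      if (parent != null) {
        Files.createDirectories(parent); // assumed meaning of "created recursively"
      }
      Files.copy(sources.get(i), destinations.get(i), StandardCopyOption.REPLACE_EXISTING);
    }
  }
}
```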

rename
public static void rename(List<ResourceId> srcResourceIds, List<ResourceId> destResourceIds, MoveOptions... moveOptions) throws IOException
Renames a List of file-like resources from one location to another.

The number of source resources must equal the number of destination resources. Destination resources will be created recursively.

srcResourceIds and destResourceIds must have the same scheme.

Renaming globs is not supported.

Source files will be removed, even if the copy is skipped due to the specified move options.
Parameters:
srcResourceIds - the references of the source resources
destResourceIds - the references of the destination resources
Throws:
IOException

delete
public static void delete(Collection<ResourceId> resourceIds, MoveOptions... moveOptions) throws IOException
Deletes a collection of resources.

resourceIds must have the same scheme.
Parameters:
resourceIds - the references of the resources to delete.
Throws:
IOException

reportSourceLineage
Report source Lineage metrics for resource id.

reportSinkLineage
Report sink Lineage metrics for resource id.

reportSourceLineage
Report source Lineage metrics for resource id at given level.

Internal API, no backward compatibility guaranteed.

reportSinkLineage
Report sink Lineage metrics for resource id at given level.

Internal API, no backward compatibility guaranteed.

setDefaultPipelineOptions
Sets the default configuration in workers.

It will be used in FileSystemRegistrars for all schemes.

Outside of workers where the Beam FileSystem API is used (e.g. test methods, user code executed during pipeline submission), consider using registerFileSystemsOnce(org.apache.beam.sdk.options.PipelineOptions) if initializing the FileSystems for the supported schemes is the main goal.

registerFileSystemsOnce
Register file systems once if never done before.

This method executes setDefaultPipelineOptions(org.apache.beam.sdk.options.PipelineOptions) only if it has never been run; otherwise it returns immediately.

It is used internally by test setup to avoid repeated filesystem registrations (which involve expensive ServiceLoader calls) when multiple pipeline and PipelineOptions objects are initialized, as is common in test execution.
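
The run-once behaviour described here is a common guard pattern. A minimal sketch with an AtomicBoolean follows. It is not Beam's implementation; RegisterOnceSketch, setDefaults, and the String parameter standing in for PipelineOptions are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; not Beam's implementation. */
final class RegisterOnceSketch {
  private static final AtomicBoolean REGISTERED = new AtomicBoolean(false);

  /** Counts how often the expensive step actually ran (for demonstration only). */
  static final AtomicInteger REGISTRATIONS = new AtomicInteger();

  /** Stand-in for the expensive registration step, e.g. ServiceLoader scanning. */
  static void setDefaults(String options) {
    REGISTRATIONS.incrementAndGet();
  }

  /** Runs setDefaults on the first call only; later calls return immediately. */
  static void registerOnce(String options) {
    // compareAndSet lets exactly one caller through. A concurrent second caller
    // may return before the first has finished registering, which this sketch accepts.
    if (REGISTERED.compareAndSet(false, true)) {
      setDefaults(options);
    }
  }
}
```

Note that under this pattern the options passed by later callers are ignored, which is why it suits test setup better than cases where configuration must change.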

matchNewResource
Returns a new ResourceId that represents the named resource of a type corresponding to the resource type.

The supplied singleResourceSpec is expected to be in a proper format, including any necessary escaping, for the underlying FileSystem.

This function may throw an IllegalArgumentException if given an invalid argument, such as when the specified singleResourceSpec is not a valid resource name.

matchNewDirectory
Returns a new ResourceId that represents the named directory resource.
Parameters:
singleResourceSpec - the root directory, for example "/abc"
baseNames - a list of directory names, for example ["d", "e", "f"]
Returns:
the ResourceId for the resolved directory. In the example above, it corresponds to "/abc/d/e/f".
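
The resolution in the example above amounts to appending each base name to the root in order. A string-based sketch follows, assuming "/" as the delimiter. It is not Beam's implementation, which works on ResourceId objects; NewDirectorySketch and resolveDirectory are hypothetical names.

```java
/** Illustrative sketch only; plain strings stand in for Beam's ResourceId. */
final class NewDirectorySketch {
  /** Appends each base name to the root, inserting "/" where one is missing. */
  static String resolveDirectory(String root, String... baseNames) {
    if (root.isEmpty()) {
      throw new IllegalArgumentException("root must not be empty");
    }
    StringBuilder resolved = new StringBuilder(root);
    for (String name : baseNames) {
      if (resolved.charAt(resolved.length() - 1) != '/') {
        resolved.append('/');
      }
      resolved.append(name);
    }
    return resolved.toString();
  }

  public static void main(String[] args) {
    System.out.println(resolveDirectory("/abc", "d", "e", "f")); // prints /abc/d/e/f
  }
}
```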