java.lang.Object
org.apache.beam.sdk.extensions.gcp.util.gcsfs.GcsPath
All Implemented Interfaces:
Serializable, Comparable<Path>, Iterable<Path>, Path, Watchable

public class GcsPath extends Object implements Path, Serializable
Implements the Java NIO Path API for Google Cloud Storage paths.

GcsPath uses a slash ('/') as a directory separator. Below is a summary of how slashes are treated:

  • A GCS bucket may not contain a slash. An object may contain zero or more slashes.
  • A trailing slash always indicates a directory, which is compliant with POSIX.1-2008.
  • Slashes separate components of a path. Empty components are allowed and are represented as repeated slashes. An empty component always refers to a directory, and always ends in a slash.
  • getParent() always returns a path ending in a slash, as the parent of a GcsPath is always a directory.
  • Use resolve(String) to append elements to a GcsPath -- this applies the rules consistently and is highly recommended over any custom string concatenation.
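
The slash rules above can be illustrated on plain object-name strings. The following is a minimal sketch, not GcsPath's own code: the class SlashRulesSketch and its helpers isDirectory and parentOf are hypothetical names, and edge cases (for example, object names that begin with a slash) may be handled differently by the real class.

```java
class SlashRulesSketch {
    /** A trailing slash marks a directory; an empty name is treated as a directory too. */
    static boolean isDirectory(String object) {
        return object.isEmpty() || object.endsWith("/");
    }

    /**
     * Returns the parent prefix of an object name, always ending in '/',
     * or "" when no slash-delimited parent remains inside the name.
     */
    static String parentOf(String object) {
        // Start the search one character early so that a trailing slash is skipped.
        int i = object.lastIndexOf('/', object.length() - 2);
        return i < 0 ? "" : object.substring(0, i + 1);
    }

    public static void main(String[] args) {
        System.out.println(parentOf("dir/file.txt")); // dir/
        System.out.println(parentOf("dir/sub/"));     // dir/
        System.out.println(parentOf("a//b"));         // a// (empty component, still a directory)
        System.out.println(isDirectory("dir/"));      // true
    }
}
```

When working with real paths, resolve(String) and getParent() remain the recommended entry points; the sketch only shows why hand-rolled string concatenation is easy to get wrong.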

GcsPath treats all GCS objects and buckets as belonging to the same filesystem, so the root of a GcsPath is the GcsPath bucket="", object="".

Relative paths are not associated with any bucket. This matches common treatment of Path in which relative paths can be constructed from one filesystem and appended to another filesystem.

  • Field Details

    • SCHEME

      public static final String SCHEME
    • GCS_URI

      public static final Pattern GCS_URI
      Pattern that is used to parse a GCS URI.

      This is used to separate the components. Verification is handled separately.
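
      As a rough illustration of separating components without validating them, the sketch below splits a gs:// string with a regular expression. The pattern URI_SHAPE and the class GcsUriSplitSketch are hypothetical stand-ins written for this example; they are not the value of GCS_URI, which may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcsUriSplitSketch {
    // Illustrative pattern only: a bucket without slashes, then an optional object.
    static final Pattern URI_SHAPE = Pattern.compile("gs://([^/]+)(?:/(.*))?");

    /** Returns {bucket, object}; the object is "" when the URI names only a bucket. */
    static String[] split(String uri) {
        Matcher m = URI_SHAPE.matcher(uri);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a gs:// URI: " + uri);
        }
        String object = m.group(2) == null ? "" : m.group(2);
        // No bucket or object name validation happens here; this only separates components.
        return new String[] {m.group(1), object};
    }

    public static void main(String[] args) {
        String[] parts = split("gs://my-bucket/dir/file.txt");
        System.out.println(parts[0]); // my-bucket
        System.out.println(parts[1]); // dir/file.txt
    }
}
```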

  • Constructor Details

    • GcsPath

      public GcsPath(@Nullable FileSystem fs, @Nullable String bucket, @Nullable String object)
      Constructs a GcsPath.
      Parameters:
      fs - the associated FileSystem, if any
      bucket - the associated bucket, or none (null or an empty string) for a relative path component
      object - the object, which is a fully-qualified object name if bucket was also provided, or none (null or an empty string) for no object
      Throws:
      IllegalArgumentException - if the bucket or object names are invalid.
  • Method Details

    • fromUri

      public static GcsPath fromUri(URI uri)
      Creates a GcsPath from a URI.

      The URI must be in the form gs://[bucket]/[path], and may not contain a port, user info, a query, or a fragment.
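
      The restrictions above can be checked with java.net.URI accessors alone. This is a sketch under assumptions: hasOnlyBucketAndPath and GcsUriCheckSketch are hypothetical names, and the case-insensitive scheme comparison follows general URI rules rather than anything this page states about fromUri.

```java
import java.net.URI;

class GcsUriCheckSketch {
    /** True when the URI uses the gs scheme and has no port, user info, query, or fragment. */
    static boolean hasOnlyBucketAndPath(URI uri) {
        return "gs".equalsIgnoreCase(uri.getScheme()) // URI schemes are case-insensitive
            && uri.getPort() == -1                    // -1 means no port was given
            && uri.getUserInfo() == null
            && uri.getQuery() == null
            && uri.getFragment() == null;
    }

    public static void main(String[] args) {
        System.out.println(hasOnlyBucketAndPath(URI.create("gs://bucket/path/to/obj")));  // true
        System.out.println(hasOnlyBucketAndPath(URI.create("gs://bucket:8080/obj")));     // false
        System.out.println(hasOnlyBucketAndPath(URI.create("gs://bucket/obj?x=1")));      // false
    }
}
```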

    • fromUri

      public static GcsPath fromUri(String uri)
      Creates a GcsPath from a URI in string form.

      This does not use URI parsing, which means it may accept patterns that the URI parser would not accept.

    • fromResourceName

      public static GcsPath fromResourceName(String name)
      Creates a GcsPath from a OnePlatform resource name in string form.
    • fromObject

      public static GcsPath fromObject(StorageObject object)
      Creates a GcsPath from a StorageObject.
    • fromComponents

      public static GcsPath fromComponents(@Nullable String bucket, @Nullable String object)
      Creates a GcsPath from bucket and object components.

      A GcsPath without a bucket name is treated as a relative path, which is a path component with no linkage to the root element. This is similar to a Unix path that does not begin with the root marker (a slash). GCS has different naming constraints and APIs for working with buckets and objects, so these two concepts are kept separate to avoid accidental attempts to treat objects as buckets, or vice versa, as much as possible.

      A GcsPath without an object name is a bucket reference. A bucket is always a directory, which could be used to look up or add files to a bucket, but could not be opened as a file.

      A GcsPath containing neither bucket nor object names is treated as the root of the GCS filesystem. A listing on the root element would return the buckets available to the user.

      If null is passed as either parameter, it is converted to an empty string internally for consistency. There is no distinction between an empty string and a null, as neither is allowed by GCS.
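
      The four cases described above (root, bucket reference, relative path, and fully-qualified object) can be summarized as a small decision table. The sketch below is illustrative only: the Kind enum, the classify helper, and the class GcsComponentsSketch are hypothetical and are not part of GcsPath.

```java
class GcsComponentsSketch {
    enum Kind { ROOT, BUCKET, RELATIVE, OBJECT }

    /** Classifies a (bucket, object) pair, treating null and "" identically. */
    static Kind classify(String bucket, String object) {
        // Normalize null to the empty string, mirroring the rule described above.
        String b = bucket == null ? "" : bucket;
        String o = object == null ? "" : object;
        if (b.isEmpty()) {
            // No bucket: either the filesystem root or a relative path component.
            return o.isEmpty() ? Kind.ROOT : Kind.RELATIVE;
        }
        // A bucket alone is a directory; a bucket plus an object is a full object name.
        return o.isEmpty() ? Kind.BUCKET : Kind.OBJECT;
    }

    public static void main(String[] args) {
        System.out.println(classify(null, null));          // ROOT
        System.out.println(classify("my-bucket", ""));     // BUCKET
        System.out.println(classify("", "dir/file.txt"));  // RELATIVE
        System.out.println(classify("my-bucket", "dir/")); // OBJECT
    }
}
```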

      Parameters:
      bucket - a GCS bucket name, or none (null or an empty string) if the object is not associated with a bucket (e.g. relative paths or the root node).
      object - a GCS object path, or none (null or an empty string) for no object.
    • getBucket

      public String getBucket()
      Returns the bucket name associated with this GCS path, or an empty string if this is a relative path component.
    • getObject

      public String getObject()
      Returns the object name associated with this GCS path, or an empty string if no object is specified.
    • setFileSystem

      public void setFileSystem(FileSystem fs)
    • getFileSystem

      public FileSystem getFileSystem()
      Specified by:
      getFileSystem in interface Path
    • isAbsolute

      public boolean isAbsolute()
      Specified by:
      isAbsolute in interface Path
    • getRoot

      public GcsPath getRoot()
      Specified by:
      getRoot in interface Path
    • getFileName

      public GcsPath getFileName()
      Specified by:
      getFileName in interface Path
    • getParent

      public GcsPath getParent()
      Returns the parent path, or null if this path does not have a parent.

      Returns a path that ends in '/', as the parent path always refers to a directory.

      Specified by:
      getParent in interface Path
    • getNameCount

      public int getNameCount()
      Specified by:
      getNameCount in interface Path
    • getName

      public GcsPath getName(int count)
      Specified by:
      getName in interface Path
    • subpath

      public GcsPath subpath(int beginIndex, int endIndex)
      Specified by:
      subpath in interface Path
    • startsWith

      public boolean startsWith(Path other)
      Specified by:
      startsWith in interface Path
    • startsWith

      public boolean startsWith(String prefix)
      Specified by:
      startsWith in interface Path
    • endsWith

      public boolean endsWith(Path other)
      Specified by:
      endsWith in interface Path
    • endsWith

      public boolean endsWith(String suffix)
      Specified by:
      endsWith in interface Path
    • normalize

      public GcsPath normalize()
      Specified by:
      normalize in interface Path
    • resolve

      public GcsPath resolve(Path other)
      Specified by:
      resolve in interface Path
    • resolve

      public GcsPath resolve(String other)
      Specified by:
      resolve in interface Path
    • resolveSibling

      public Path resolveSibling(Path other)
      Specified by:
      resolveSibling in interface Path
    • resolveSibling

      public Path resolveSibling(String other)
      Specified by:
      resolveSibling in interface Path
    • relativize

      public Path relativize(Path other)
      Specified by:
      relativize in interface Path
    • toAbsolutePath

      public GcsPath toAbsolutePath()
      Specified by:
      toAbsolutePath in interface Path
    • toRealPath

      public GcsPath toRealPath(LinkOption... options) throws IOException
      Specified by:
      toRealPath in interface Path
      Throws:
      IOException
    • toFile

      public File toFile()
      Specified by:
      toFile in interface Path
    • register

      public WatchKey register(WatchService watcher, WatchEvent.Kind<?>[] events, WatchEvent.Modifier... modifiers) throws IOException
      Specified by:
      register in interface Path
      Specified by:
      register in interface Watchable
      Throws:
      IOException
    • register

      public WatchKey register(WatchService watcher, WatchEvent.Kind<?>... events) throws IOException
      Specified by:
      register in interface Path
      Specified by:
      register in interface Watchable
      Throws:
      IOException
    • iterator

      public Iterator<Path> iterator()
      Specified by:
      iterator in interface Iterable<Path>
      Specified by:
      iterator in interface Path
    • compareTo

      public int compareTo(Path other)
      Specified by:
      compareTo in interface Comparable<Path>
      Specified by:
      compareTo in interface Path
    • equals

      public boolean equals(@Nullable Object o)
      Specified by:
      equals in interface Path
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Specified by:
      hashCode in interface Path
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Specified by:
      toString in interface Path
      Overrides:
      toString in class Object
    • toResourceName

      public String toResourceName()
    • toUri

      public URI toUri()
      Specified by:
      toUri in interface Path