apache_beam.dataframe.partitionings module
class apache_beam.dataframe.partitionings.Partitioning
Bases: object
A class representing a (consistent) partitioning of dataframe objects.
is_subpartitioning_of(other)
Returns whether self is a subpartitioning of other.
Specifically, returns whether something partitioned by self is necessarily also partitioned by other.
class apache_beam.dataframe.partitionings.Index(levels=None)
Bases: apache_beam.dataframe.partitionings.Partitioning
A partitioning by index (either fully or partially).
If the set of “levels” of the index to consider is not specified, the entire index is used.
These form a partial order, given by
Singleton() < Index([i]) < Index([i, j]) < … < Index() < Arbitrary()
The ordering is implemented via the is_subpartitioning_of method: in the chain above, each partitioning on the right is a subpartitioning of every partitioning to its left.
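The chain above can be illustrated with a small standalone model. This is only a sketch written for this page, not Beam's source: the `Toy*` classes are hypothetical stand-ins that do not import `apache_beam`, and they assume that index levels can be compared as a plain set.

```python
# Toy model of the partial order (hypothetical classes, not apache_beam code).

class ToyPartitioning:
    def is_subpartitioning_of(self, other):
        raise NotImplementedError


class ToySingleton(ToyPartitioning):
    # Coarsest element: only a subpartitioning of another singleton.
    def is_subpartitioning_of(self, other):
        return isinstance(other, ToySingleton)


class ToyIndex(ToyPartitioning):
    def __init__(self, levels=None):
        # levels=None stands for "the entire index".
        self.levels = None if levels is None else frozenset(levels)

    def is_subpartitioning_of(self, other):
        if isinstance(other, ToySingleton):
            return True
        if isinstance(other, ToyIndex):
            if self.levels is None:
                return True   # the full index refines any partial index
            if other.levels is None:
                return False  # a partial index does not refine the full index
            # Assumption: self refines other when it uses at least other's levels.
            return other.levels <= self.levels
        return False


class ToyArbitrary(ToyPartitioning):
    # Finest element: a subpartitioning of everything.
    def is_subpartitioning_of(self, other):
        return True


chain = [
    ToySingleton(),
    ToyIndex(["i"]),
    ToyIndex(["i", "j"]),
    ToyIndex(),
    ToyArbitrary(),
]

# Every element on the right is a subpartitioning of every element to its left,
# and never the other way round.
forward_ok = all(
    chain[b].is_subpartitioning_of(chain[a])
    for a in range(len(chain))
    for b in range(a + 1, len(chain))
)
backward_any = any(
    chain[a].is_subpartitioning_of(chain[b])
    for a in range(len(chain))
    for b in range(a + 1, len(chain))
)
```

In this model `forward_ok` is true and `backward_any` is false, which matches the strict chain written above.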
class apache_beam.dataframe.partitionings.Singleton(reason=None)
Bases: apache_beam.dataframe.partitionings.Partitioning
A partitioning of all the data into a single partition.
reason
class apache_beam.dataframe.partitionings.JoinIndex(ancestor=None)
Bases: apache_beam.dataframe.partitionings.Partitioning
A partitioning that lets two frames be joined. This can be either a hash partitioning on the full index or a common ancestor with no intervening re-indexing or re-partitioning.
It fits into the partial ordering as
Index() < JoinIndex(x) < JoinIndex() < Arbitrary()
with JoinIndex(x) and JoinIndex(y) being incomparable for nontrivial x != y.
Expressions desiring to make use of this index should simply declare a requirement of JoinIndex().
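The position of JoinIndex in the ordering, including the incomparability of two distinct ancestors, can be sketched the same way. Again this is a standalone illustration under stated assumptions, not Beam's implementation: the `Toy*` classes are hypothetical, and the ancestors are modelled as plain strings rather than real expression objects.

```python
# Toy model of where JoinIndex sits in the ordering (hypothetical classes).

class ToyFullIndex:
    # Stands in for Index() with no levels given.
    def is_subpartitioning_of(self, other):
        return isinstance(other, ToyFullIndex)


class ToyJoinIndex:
    def __init__(self, ancestor=None):
        # Assumption: an ancestor is any value that supports ==.
        self.ancestor = ancestor

    def is_subpartitioning_of(self, other):
        if isinstance(other, ToyArbitrary):
            return False
        if isinstance(other, ToyJoinIndex):
            # JoinIndex() refines every JoinIndex(x); JoinIndex(x) only itself.
            return self.ancestor is None or self.ancestor == other.ancestor
        return True  # refines the full index


class ToyArbitrary:
    def is_subpartitioning_of(self, other):
        return True


jx = ToyJoinIndex("x")  # "x" and "y" are placeholder ancestors
jy = ToyJoinIndex("y")
j_any = ToyJoinIndex()

# Index() < JoinIndex(x) < JoinIndex() < Arbitrary()
in_order = (
    jx.is_subpartitioning_of(ToyFullIndex())
    and j_any.is_subpartitioning_of(jx)
    and ToyArbitrary().is_subpartitioning_of(j_any)
)

# Two different ancestors: neither is a subpartitioning of the other.
incomparable = not jx.is_subpartitioning_of(jy) and not jy.is_subpartitioning_of(jx)
```

Both `in_order` and `incomparable` are true in this model. Because `ToyJoinIndex()` is a subpartitioning of `ToyJoinIndex("x")` for any ancestor, anything partitioned by the ancestor-free form is also partitioned by every ancestor-specific form in the sense defined by is_subpartitioning_of.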