apache_beam.dataframe.expressions module¶
-
class
apache_beam.dataframe.expressions.
Session
(bindings=None)[source]¶ Bases:
object
A session represents a mapping of expressions to concrete values.
The bindings typically include required placeholders, but may be any intermediate expression as well.
-
class
apache_beam.dataframe.expressions.
PartitioningSession
(bindings=None)[source]¶ Bases:
apache_beam.dataframe.expressions.Session
An extension of Session that enforces actual partitioning of inputs.
Each expression is evaluated multiple times for various supported partitionings determined by its requires_partition_by specification. For each tested partitioning, the input is partitioned and the expression is evaluated on each partition separately, as if this were actually executed in a parallel manner.
For each input partitioning, the results are verified to be partitioned appropriately according to the expression’s preserves_partition_by specification.
For testing only.
-
class
apache_beam.dataframe.expressions.
Expression
(name, proxy, _id=None)[source]¶ Bases:
object
An expression is an operation bound to a set of arguments.
An expression represents a deferred tree of operations, which can be evaluated at a specific bindings of root expressions to values.
-
class
apache_beam.dataframe.expressions.
PlaceholderExpression
(proxy, reference=None)[source]¶ Bases:
apache_beam.dataframe.expressions.Expression
An expression whose value must be explicitly bound in the session.
Initialize a placeholder expression.
Parameters: proxy – A proxy object with the type expected to be bound to this expression. Used for type checking at pipeline construction time.
-
class
apache_beam.dataframe.expressions.
ConstantExpression
(value, proxy=None)[source]¶ Bases:
apache_beam.dataframe.expressions.Expression
An expression whose value is known at pipeline construction time.
Initialize a constant expression.
Parameters: - value – The constant value to be produced by this expression.
- proxy – (Optional) a proxy object with same type as value to use for rapid type checking at pipeline construction time. If not provided, value will be used directly.
-
class
apache_beam.dataframe.expressions.
ComputedExpression
(name, func, args, proxy=None, _id=None, requires_partition_by=Index, preserves_partition_by=Nothing)[source]¶ Bases:
apache_beam.dataframe.expressions.Expression
An expression whose value must be computed at pipeline execution time.
Initialize a computed expression.
Parameters: - name – The name of this expression.
- func – The function that will be used to compute the value of this expression. Should accept arguments of the types returned when evaluating the args expressions.
- args – The list of expressions that will be used to produce inputs to func.
- proxy – (Optional) a proxy object with same type as the objects that this ComputedExpression will produce at execution time. If not provided, a proxy will be generated using func and the proxies of args.
- _id – (Optional) a string to uniquely identify this expression.
- requires_partition_by – The required (common) partitioning of the args.
- preserves_partition_by – The level of partitioning preserved.