apache_beam.transforms.enrichment module

apache_beam.transforms.enrichment.cross_join(left: Dict[str, Any], right: Dict[str, Any]) → apache_beam.pvalue.Row[source]

cross_join performs a cross join on two dict objects.

Joins the columns of the right row onto the left row.

Parameters:
  • left (Dict[str, Any]) – input request dictionary.
  • right (Dict[str, Any]) – response dictionary from the API.
Returns:

beam.Row containing the merged columns.

class apache_beam.transforms.enrichment.EnrichmentSourceHandler[source]

Bases: apache_beam.io.requestresponse.Caller

Wrapper class for apache_beam.io.requestresponse.Caller.

Ensure that the implementation of __call__ method returns a tuple of beam.Row objects.

class apache_beam.transforms.enrichment.Enrichment(source_handler: apache_beam.transforms.enrichment.EnrichmentSourceHandler, join_fn: Callable[[Dict[str, Any], Dict[str, Any]], apache_beam.pvalue.Row] = <function cross_join>, timeout: Optional[float] = 30, repeater: apache_beam.io.requestresponse.Repeater = <apache_beam.io.requestresponse.ExponentialBackOffRepeater object>, throttler: apache_beam.io.requestresponse.PreCallThrottler = <apache_beam.io.requestresponse.DefaultThrottler object>)[source]

Bases: apache_beam.transforms.ptransform.PTransform

A apache_beam.transforms.enrichment.Enrichment transform to enrich elements in a PCollection. NOTE: This transform and its implementation are under development and do not provide backward compatibility guarantees. Uses the apache_beam.transforms.enrichment.EnrichmentSourceHandler to enrich elements by joining the metadata from external source.

Processes an input PCollection of beam.Row by applying a apache_beam.transforms.enrichment.EnrichmentSourceHandler to each element and returning the enriched PCollection.

Parameters:
expand(input_row: apache_beam.pvalue.PCollection[~InputT][InputT]) → apache_beam.pvalue.PCollection[~OutputT][OutputT][source]