apache_beam.transforms.enrichment module

apache_beam.transforms.enrichment.cross_join(left: Dict[str, Any], right: Dict[str, Any]) → apache_beam.pvalue.Row

cross_join performs a cross join on two dict objects. It joins the columns of the right row onto the left row.

Parameters:
- left (Dict[str, Any]) – columns of the original element, as a dict.
- right (Dict[str, Any]) – columns of the lookup result, as a dict.

Returns:
- beam.Row containing the merged columns.
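
The column merge described above can be sketched in plain Python. This is an illustration only, not the library's implementation: `merge_columns` is a hypothetical helper, a plain dict stands in for beam.Row so the snippet runs without Beam installed, and keeping the left value on a name clash is an assumption that the text above does not state.

```python
from typing import Any, Dict


def merge_columns(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: join the columns of `right` onto `left`."""
    merged = dict(left)  # copy, so the caller's dict is not mutated
    for key, value in right.items():
        # Assumption: on a column-name clash the left (original) value wins.
        merged.setdefault(key, value)
    return merged


left = {"id": 1, "name": "widget"}
right = {"id": 1, "price": 9.99}
merged = merge_columns(left, right)
print(merged)  # {'id': 1, 'name': 'widget', 'price': 9.99}
```

Inside a pipeline, `beam.Row(**merged)` would turn such a dict into the Row type that cross_join is documented to return.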

class apache_beam.transforms.enrichment.EnrichmentSourceHandler

Bases: apache_beam.io.requestresponse.Caller

Wrapper class for apache_beam.io.requestresponse.Caller. Ensure that the implementation of the __call__ method returns a tuple of beam.Row objects.
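
The shape of such a handler can be sketched as follows. This is a stand-in that runs without Beam installed: `InMemorySourceHandler`, its lookup table, and `key_field` are hypothetical, plain dicts replace beam.Row, and the returned tuple is assumed to be (request, response). A real handler would subclass EnrichmentSourceHandler and return beam.Row objects.

```python
from typing import Any, Dict, Tuple

Record = Dict[str, Any]  # stand-in for beam.Row in this sketch


class InMemorySourceHandler:
    """Hypothetical handler-shaped class backed by an in-memory table."""

    def __init__(self, table: Dict[Any, Record], key_field: str):
        self._table = table
        self._key_field = key_field

    def __call__(self, request: Record, *args, **kwargs) -> Tuple[Record, Record]:
        # Look up metadata for the request; an unknown key yields no columns.
        response = self._table.get(request[self._key_field], {})
        # Assumption: the tuple pairs the request with its lookup response.
        return request, response


handler = InMemorySourceHandler({1: {"price": 9.99}}, key_field="id")
found = handler({"id": 1})
missing = handler({"id": 7})
```

Returning the request alongside the response keeps both halves available to whatever join function runs next.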

class apache_beam.transforms.enrichment.Enrichment(source_handler: apache_beam.transforms.enrichment.EnrichmentSourceHandler, join_fn: Callable[[Dict[str, Any], Dict[str, Any]], apache_beam.pvalue.Row] = <function cross_join>, timeout: Optional[float] = 30, repeater: apache_beam.io.requestresponse.Repeater = <apache_beam.io.requestresponse.ExponentialBackOffRepeater object>, throttler: apache_beam.io.requestresponse.PreCallThrottler = <apache_beam.io.requestresponse.DefaultThrottler object>)

Bases: apache_beam.transforms.ptransform.PTransform

A transform that enriches the elements of a PCollection. It processes an input PCollection of beam.Row elements by applying an apache_beam.transforms.enrichment.EnrichmentSourceHandler to each element, joins the metadata fetched from the external source onto that element, and returns the enriched PCollection.

NOTE: This transform and its implementation are under development and do not provide backward compatibility guarantees.

Parameters:
- source_handler – Handles source lookup and metadata retrieval. Implements apache_beam.transforms.enrichment.EnrichmentSourceHandler.
- join_fn – A function (for example, a lambda) that joins the original element with the lookup metadata. Defaults to cross_join.
- timeout – (Optional) Timeout for source requests. Defaults to 30 seconds.
- repeater (Repeater) – Provides a method to repeat requests to the API that failed due to service errors. Defaults to apache_beam.io.requestresponse.ExponentialBackOffRepeater, which repeats requests with exponential backoff.
- throttler (PreCallThrottler) – Provides methods to pre-throttle a request. Defaults to apache_beam.io.requestresponse.DefaultThrottler for client-side adaptive throttling using apache_beam.io.components.adaptive_throttler.AdaptiveThrottler.
 
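
How source_handler and join_fn fit together can be sketched without Beam. The sketch assumes the transform calls the handler once per element and hands the request and the response to join_fn as dicts. `lookup`, `nested_join`, and `enrich` are hypothetical names, plain dicts stand in for beam.Row, and timeouts, repeating, and throttling are left out.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Record = Dict[str, Any]  # stand-in for beam.Row in this sketch


def lookup(request: Record) -> Tuple[Record, Record]:
    """Hypothetical source lookup returning (request, response)."""
    catalog = {1: {"price": 9.99}, 2: {"price": 4.5}}
    return request, catalog.get(request["id"], {})


def nested_join(left: Record, right: Record) -> Record:
    """Hypothetical custom join_fn: keep the metadata under one column."""
    return {**left, "metadata": dict(right)}


def enrich(
    elements: Iterable[Record],
    handler: Callable[[Record], Tuple[Record, Record]],
    join_fn: Callable[[Record, Record], Record],
) -> List[Record]:
    # One lookup per element, then join the request with its response.
    return [join_fn(*handler(element)) for element in elements]


enriched = enrich([{"id": 1}, {"id": 3}], lookup, nested_join)
```

A custom join_fn like this one avoids column-name clashes by nesting the lookup result instead of flattening it onto the element. In a real pipeline it would need to return a beam.Row, as the join_fn signature above requires.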