apache_beam.io.requestresponse module

PTransform for reading from and writing to Web APIs.

exception apache_beam.io.requestresponse.UserCodeExecutionException[source]

Bases: Exception

Base class for errors related to calling Web APIs.

exception apache_beam.io.requestresponse.UserCodeQuotaException[source]

Bases: apache_beam.io.requestresponse.UserCodeExecutionException

Extends UserCodeExecutionException to signal specifically that the Web API client encountered a Quota or API overuse related error.

exception apache_beam.io.requestresponse.UserCodeTimeoutException[source]

Bases: apache_beam.io.requestresponse.UserCodeExecutionException

Extends UserCodeExecutionException to signal a user code timeout.

apache_beam.io.requestresponse.retry_on_exception(exception: Exception)[source]

retry on exceptions caused by unavailability of the remote server.

class apache_beam.io.requestresponse.Caller[source]

Bases: contextlib.AbstractContextManager, abc.ABC, typing.Generic

Interface for user custom code intended for API calls. For setup and teardown of clients when applicable, implement the __enter__ and __exit__ methods respectively.

class apache_beam.io.requestresponse.ShouldBackOff[source]

Bases: abc.ABC

ShouldBackOff provides mechanism to apply adaptive throttling.

class apache_beam.io.requestresponse.Repeater[source]

Bases: abc.ABC

Repeater provides mechanism to repeat requests for a configurable condition.

repeat(caller: apache_beam.io.requestresponse.Caller[~RequestT, ~ResponseT][RequestT, ResponseT], request: RequestT, timeout: float, metrics_collector: Optional[apache_beam.io.requestresponse._MetricsCollector]) → ResponseT[source]

repeat method is called from the RequestResponseIO when a repeater is enabled.

Parameters:
  • callerapache_beam.io.requestresponse.Caller object that calls the API.
  • request – input request to repeat.
  • timeout – time to wait for the request to complete.
  • metrics_collector – (Optional) a :class:`apache_beam.io.requestresponse._MetricsCollector` object to collect the metrics for RequestResponseIO.
class apache_beam.io.requestresponse.ExponentialBackOffRepeater[source]

Bases: apache_beam.io.requestresponse.Repeater

Exponential BackOff Repeater uses exponential backoff retry strategy for exceptions due to the remote service such as TooManyRequests (HTTP 429), UserCodeTimeoutException, UserCodeQuotaException.

It utilizes the decorator apache_beam.utils.retry.with_exponential_backoff().

repeat(caller: apache_beam.io.requestresponse.Caller[~RequestT, ~ResponseT][RequestT, ResponseT], request: RequestT, timeout: float, metrics_collector: Optional[apache_beam.io.requestresponse._MetricsCollector] = None) → ResponseT[source]

repeat method is called from the RequestResponseIO when a repeater is enabled.

Parameters:
  • callerapache_beam.io.requestresponse.Caller object that calls the API.
  • request – input request to repeat.
  • timeout – time to wait for the request to complete.
  • metrics_collector – (Optional) a :class:`apache_beam.io.requestresponse._MetricsCollector` object to collect the metrics for RequestResponseIO.
class apache_beam.io.requestresponse.NoOpsRepeater[source]

Bases: apache_beam.io.requestresponse.Repeater

NoOpsRepeater executes a request just once irrespective of any exception.

repeat(caller: apache_beam.io.requestresponse.Caller[~RequestT, ~ResponseT][RequestT, ResponseT], request: RequestT, timeout: float, metrics_collector: Optional[apache_beam.io.requestresponse._MetricsCollector]) → ResponseT[source]
class apache_beam.io.requestresponse.CacheReader[source]

Bases: abc.ABC

CacheReader provides mechanism to read from the cache.

class apache_beam.io.requestresponse.CacheWriter[source]

Bases: abc.ABC

CacheWriter provides mechanism to write to the cache.

class apache_beam.io.requestresponse.PreCallThrottler[source]

Bases: abc.ABC

PreCallThrottler provides a throttle mechanism before sending request.

class apache_beam.io.requestresponse.DefaultThrottler(window_ms: int = 1, bucket_ms: int = 1, overload_ratio: float = 2, delay_secs: int = 5)[source]

Bases: apache_beam.io.requestresponse.PreCallThrottler

Default throttler that uses apache_beam.io.components.adaptive_throttler.AdaptiveThrottler

Parameters:
  • window_ms (int) – length of history to consider, in ms, to set throttling.
  • bucket_ms (int) – granularity of time buckets that we store data in, in ms.
  • overload_ratio (float) – the target ratio between requests sent and successful requests. This is “K” in the formula in https://landing.google.com/sre/book/chapters/handling-overload.html.
  • delay_secs (int) – minimum number of seconds to throttle a request.
class apache_beam.io.requestresponse.RequestResponseIO(caller: apache_beam.io.requestresponse.Caller[~RequestT, ~ResponseT][RequestT, ResponseT], timeout: Optional[float] = 30, should_backoff: Optional[apache_beam.io.requestresponse.ShouldBackOff] = None, repeater: apache_beam.io.requestresponse.Repeater = <apache_beam.io.requestresponse.ExponentialBackOffRepeater object>, cache_reader: Optional[apache_beam.io.requestresponse.CacheReader] = None, cache_writer: Optional[apache_beam.io.requestresponse.CacheWriter] = None, throttler: apache_beam.io.requestresponse.PreCallThrottler = <apache_beam.io.requestresponse.DefaultThrottler object>)[source]

Bases: apache_beam.transforms.ptransform.PTransform

A RequestResponseIO transform to read and write to APIs.

Processes an input PCollection of requests by making a call to the API as defined in Caller’s __call__ and returns a PCollection of responses.

Instantiates a RequestResponseIO transform.

Parameters:
expand(requests: apache_beam.pvalue.PCollection[~RequestT][RequestT]) → apache_beam.pvalue.PCollection[~ResponseT][ResponseT][source]