apache_beam.tools.utils module

Utility functions for all microbenchmarks.

apache_beam.tools.utils.check_compiled(module)[source]

Check whether the given module has been compiled.

Parameters:
  • module – string, module name.
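
For example, a microbenchmark might verify that a Cython-compilable module is actually compiled before timing it. A minimal usage sketch (the module name apache_beam.coders.coder_impl is illustrative, and the exact failure behaviour for uncompiled code is not specified here):

from apache_beam.tools import utils

# Verify that the coder implementation is running as compiled code
# before benchmarking it; behaviour when it is not compiled is an
# assumption of this sketch, not documented above.
utils.check_compiled('apache_beam.coders.coder_impl')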

class apache_beam.tools.utils.BenchmarkConfig[source]

Bases: apache_beam.tools.utils.BenchmarkConfig

benchmark

a callable that takes an int argument (the benchmark size) and returns a callable. The returned callable must run the code being benchmarked on an input of the specified size.

For example, one can implement a benchmark as:

class MyBenchmark(object):
  def __init__(self, size):
    # do the necessary initialization
    pass

  def __call__(self):
    # run the code in question
    pass

size

int, the size of the input. Aggregated per-element metrics are computed based on this size.

num_runs

int, number of times to run each benchmark.

Create new instance of BenchmarkConfig(benchmark, size, num_runs)
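
As a sketch of how such a config might be constructed (the MyBenchmark body and the numbers below are illustrative, not part of this API):

from apache_beam.tools import utils

class MyBenchmark(object):
  def __init__(self, size):
    # Illustrative setup: build an input of the requested size.
    self._data = list(range(size))

  def __call__(self):
    # Illustrative workload standing in for the code being benchmarked.
    sum(self._data)

# Run MyBenchmark on an input of 10000 elements, 10 times
# (the numbers are illustrative).
config = utils.BenchmarkConfig(benchmark=MyBenchmark, size=10000, num_runs=10)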

class apache_beam.tools.utils.LinearRegressionBenchmarkConfig[source]

Bases: apache_beam.tools.utils.LinearRegressionBenchmarkConfig

benchmark

a callable that takes an int argument (the benchmark size) and returns a callable. The returned callable must run the code being benchmarked on an input of the specified size.

For example, one can implement a benchmark as:

class MyBenchmark(object):
  def __init__(self, size):
    # do the necessary initialization
    pass

  def __call__(self):
    # run the code in question
    pass

starting_point

int, the initial size of the input. Regression results are calculated based on the input sizes across runs.

increment

int, the rate of growth of the input for each run of the benchmark.

num_runs

int, number of times to run each benchmark.

Create new instance of LinearRegressionBenchmarkConfig(benchmark, starting_point, increment, num_runs)
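
A factory function also satisfies the benchmark contract (an int size in, a callable out). A minimal sketch of constructing this config (the make_benchmark helper and the numbers are illustrative):

from apache_beam.tools import utils

def make_benchmark(size):
  # Takes the input size and returns a callable that runs the
  # (illustrative) code under test.
  data = list(range(size))
  return lambda: sum(data)

# Start at 1000 elements, grow the input by 1000 elements on each run,
# and run 20 times in total (the numbers are illustrative).
config = utils.LinearRegressionBenchmarkConfig(
    benchmark=make_benchmark, starting_point=1000, increment=1000, num_runs=20)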

apache_beam.tools.utils.run_benchmarks(benchmark_suite, verbose=True)[source]

Runs benchmarks and collects execution times.

A simple instrumentation that runs a callable several times, then collects and prints its execution times.

Parameters:
  • benchmark_suite – A list of BenchmarkConfig.
  • verbose – bool, whether to print benchmark results to stdout.
Returns:

A dictionary of the form string -> list of floats. Keys of the dictionary are benchmark names, values are execution times in seconds for each run.
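
Tying the pieces together, a minimal end-to-end sketch assuming only the names documented above (the make_benchmark helper, the sizes, and the run counts are illustrative):

from apache_beam.tools import utils

def make_benchmark(size):
  data = list(range(size))
  return lambda: sum(data)   # illustrative code being benchmarked

suite = [
    utils.BenchmarkConfig(make_benchmark, 10000, 10),
    utils.BenchmarkConfig(make_benchmark, 100000, 10),
]

# With verbose=True the results are printed to stdout; the return value
# maps each benchmark name to its per-run execution times in seconds.
results = utils.run_benchmarks(suite, verbose=True)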