apache_beam.tools.teststream_microbenchmark module

A microbenchmark for measuring changes in the performance of TestStream running locally. It attempts to measure the overhead of the main data paths for TestStream: new elements, watermark changes, and processing time advances.

This runs a series of N parallel pipelines with M parallel stages each. Each stage does the following:

  1. Put all the PCollection elements in a window.
  2. Wait until the watermark advances past the end of the window.
  3. When the watermark passes, change the key and output all the elements.
  4. Go back to step 1 until all elements in the stream have been consumed.
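
The per-stage loop above can be illustrated with a small pure-Python simulation. This is only a sketch and does not use Beam or reflect how the benchmark is implemented: the event tuples, the `WINDOW_SIZE` constant, the `window_start` and `run_stage` helpers, and the "key + 1" rekeying rule are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SIZE = 10  # hypothetical fixed-window length, in seconds


def window_start(timestamp, size=WINDOW_SIZE):
    """Return the start of the fixed window that contains `timestamp`."""
    return timestamp - (timestamp % size)


def run_stage(events, size=WINDOW_SIZE):
    """Simulate one stage over a list of stream events.

    Each event is either ("element", (key, value), timestamp)
    or ("watermark", new_watermark). Both shapes are assumptions
    of this sketch, not Beam types.
    """
    buffered = defaultdict(list)  # window start -> buffered (key, value) pairs
    output = []
    for event in events:
        if event[0] == "element":
            # Step 1: put the element in its window.
            _, keyed_value, timestamp = event
            buffered[window_start(timestamp, size)].append(keyed_value)
        elif event[0] == "watermark":
            # Steps 2 and 3: once the watermark reaches the end of a
            # window, rekey its elements and emit them.
            watermark = event[1]
            for start in sorted(buffered):
                if start + size <= watermark:
                    for key, value in buffered.pop(start):
                        output.append((key + 1, value))  # illustrative rekey
        # Step 4: continue with the next event until the stream is consumed.
    return output


events = [
    ("element", (0, "a"), 1),
    ("element", (0, "b"), 4),
    ("watermark", 5),    # window [0, 10) is still open
    ("element", (0, "c"), 12),
    ("watermark", 10),   # closes window [0, 10)
    ("watermark", 20),   # closes window [10, 20)
]
print(run_stage(events))
```

In this sketch nothing is emitted at watermark 5, since no window has ended yet. Buffered elements are released only when a later watermark reaches the end of their window.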

This executes the same codepaths that are run on the Fn API (and Dataflow) workers, but is generally easier to run (locally) and more stable.

Run as

python -m apache_beam.tools.teststream_microbenchmark
class apache_beam.tools.teststream_microbenchmark.RekeyElements(*unused_args, **unused_kwargs)

Bases: apache_beam.transforms.core.DoFn

process(element)
apache_beam.tools.teststream_microbenchmark.run_single_pipeline(size)
apache_beam.tools.teststream_microbenchmark.run_benchmark(starting_point=1, num_runs=10, num_elements_step=300, verbose=True)