apache_beam.testing.benchmarks.nexmark.nexmark_util module

Utilities for the Nexmark suite.

The Nexmark suite is a series of queries (streaming pipelines) performed on a simulation of auction events. This util includes:

  • A Command class used to terminate the streaming jobs launched in nexmark_launcher.py by the DirectRunner.
  • A ParseEventFn DoFn to parse events received from PubSub.

Usage:

To run a process for a certain duration, define in the code:
command = Command(process_to_terminate, args) command.run(timeout=duration)
class apache_beam.testing.benchmarks.nexmark.nexmark_util.Command(cmd, args)[source]

Bases: object

run(timeout)[source]
class apache_beam.testing.benchmarks.nexmark.nexmark_util.ParseEventFn(*unused_args, **unused_kwargs)[source]

Bases: apache_beam.transforms.core.DoFn

Parses the raw event info into a Python objects.

Each event line has the following format:

person: <id starting with ‘p’>,name,email,credit_card,city, state,timestamp,extra auction: <id starting with ‘a’>,item_name, description,initial_bid, reserve_price,timestamp,expires,seller,category,extra bid: <auction starting with ‘b’>,bidder,price,timestamp,extra

For example:

‘p12345,maria,maria@maria.com,1234-5678-9012-3456, sunnyvale,CA,1528098831536’ ‘a12345,car67,2012 hyundai elantra,15000,20000, 1528098831536,20180630,maria,vehicle’ ‘b12345,maria,20000,1528098831536’
process(elem)[source]
apache_beam.testing.benchmarks.nexmark.nexmark_util.display(elm)[source]