apache_beam.testing.benchmarks.nexmark.nexmark_util module¶
Utilities for the Nexmark suite.
The Nexmark suite is a series of queries (streaming pipelines) performed on a simulation of auction events. This util includes:
- A Command class used to terminate the streaming jobs launched in nexmark_launcher.py by the DirectRunner.
- A ParseEventFn DoFn to parse events received from PubSub.
Usage:
- To run a process for a certain duration, define in the code:
- command = Command(process_to_terminate, args) command.run(timeout=duration)
-
class
apache_beam.testing.benchmarks.nexmark.nexmark_util.
ParseEventFn
(*unused_args, **unused_kwargs)[source]¶ Bases:
apache_beam.transforms.core.DoFn
Original parser for parsing raw events info into a Python objects.
Each event line has the following format:
person: <id starting with ‘p’>,name,email,credit_card,city, state,timestamp,extra auction: <id starting with ‘a’>,item_name, description,initial_bid, reserve_price,timestamp,expires,seller,category,extra bid: <auction starting with ‘b’>,bidder,price,timestamp,extraFor example:
‘p12345,maria,maria@maria.com,1234-5678-9012-3456, sunnyvale,CA,1528098831536’ ‘a12345,car67,2012 hyundai elantra,15000,20000, 1528098831536,20180630,maria,vehicle’ ‘b12345,maria,20000,1528098831536’
-
class
apache_beam.testing.benchmarks.nexmark.nexmark_util.
ParseJsonEventFn
(*unused_args, **unused_kwargs)[source]¶ Bases:
apache_beam.transforms.core.DoFn
Parses the raw event info into a Python objects.
Each event line has the following format:
person: {id,name,email,credit_card,city, state,timestamp,extra} auction: {id,item_name, description,initial_bid, reserve_price,timestamp,expires,seller,category,extra} bid: {auction,bidder,price,timestamp,extra}For example:
{“id”:1000,”name”:”Peter Jones”,”emailAddress”:”nhd@xcat.com”, “creditCard”:”7241 7320 9143 4888”,”city”:”Portland”,”state”:”WY”, “dateTime”:1528098831026,”extra”:”WN_HS_bnpVQ[[“}
{“id”:1000,”itemName”:”wkx mgee”,”description”:”eszpqxtdxrvwmmywkmogoahf”, “initialBid”:28873,”reserve”:29448,”dateTime”:1528098831036, “expires”:1528098840451,”seller”:1000,”category”:13,”extra”:”zcuupiz”}
{“auction”:1000,”bidder”:1001,”price”:32530001,”dateTime”:1528098831066, “extra”:”fdiysaV^]NLVsbolvyqwgticfdrwdyiyofWPYTOuwogvszlxjrcNOORM”}
-
apache_beam.testing.benchmarks.nexmark.nexmark_util.
millis_to_timestamp
(millis: int) → apache_beam.utils.timestamp.Timestamp[source]¶
-
apache_beam.testing.benchmarks.nexmark.nexmark_util.
get_counter_metric
(result: apache_beam.runners.runner.PipelineResult, namespace: str, name: str) → int[source]¶ get specific counter metric from pipeline result
Parameters: - result – the PipelineResult which metrics are read from
- namespace – a string representing the namespace of wanted metric
- name – a string representing the name of the wanted metric
Returns: the result of the wanted metric if it exist, else -1
-
apache_beam.testing.benchmarks.nexmark.nexmark_util.
get_start_time_metric
(result: apache_beam.runners.runner.PipelineResult, namespace: str, name: str) → int[source]¶ get the start time out of all times recorded by the specified distribution metric
Parameters: - result – the PipelineResult which metrics are read from
- namespace – a string representing the namespace of wanted metric
- name – a string representing the name of the wanted metric
Returns: the smallest time in the metric or -1 if it doesn’t exist
-
apache_beam.testing.benchmarks.nexmark.nexmark_util.
get_end_time_metric
(result: apache_beam.runners.runner.PipelineResult, namespace: str, name: str) → int[source]¶ get the end time out of all times recorded by the specified distribution metric
Parameters: - result – the PipelineResult which metrics are read from
- namespace – a string representing the namespace of wanted metric
- name – a string representing the name of the wanted metric
Returns: the largest time in the metric or -1 if it doesn’t exist