apache_beam.testing.benchmarks.nexmark.queries.query4 module

Query 4, ‘Average Price for a Category’. Select the average of the wining bid prices for all closed auctions in each category. In CQL syntax:

SELECT Istream(AVG(Q.final))
FROM Category C, (SELECT Rstream(MAX(B.price) AS final, A.category)
  FROM Auction A [ROWS UNBOUNDED], Bid B [ROWS UNBOUNDED]
  WHERE A.id=B.auction
    AND B.datetime < A.expires AND A.expires < CURRENT_TIME
  GROUP BY A.id, A.category) Q
WHERE Q.category = C.id
GROUP BY C.id;

For extra spiciness our implementation differs slightly from the above:

  • We select both the average winning price and the category.
  • We don’t bother joining with a static category table, since it’s contents are never used.
  • We only consider bids which are above the auction’s reserve price.
  • We accept the highest-price, earliest valid bid as the winner.
  • We calculate the averages oven a sliding window of size window_size_sec and period window_period_sec.
apache_beam.testing.benchmarks.nexmark.queries.query4.load(events, metadata=None, pipeline_options=None)[source]
class apache_beam.testing.benchmarks.nexmark.queries.query4.ProjectToCategoryPriceFn(*unused_args, **unused_kwargs)[source]

Bases: apache_beam.transforms.core.DoFn

process(element, pane_info=PaneInfoParam)[source]