apache_beam.testing package¶
Submodules¶
apache_beam.testing.pipeline_verifiers module¶
End-to-end test result verifiers
A set of verifiers that are used in end-to-end tests to verify state/output of test pipeline job. Customized verifier should extend hamcrest.core.base_matcher.BaseMatcher and override _matches.
-
class
apache_beam.testing.pipeline_verifiers.
PipelineStateMatcher
(expected_state='DONE')[source]¶ Bases:
hamcrest.core.base_matcher.BaseMatcher
Matcher that verify pipeline job terminated in expected state
Matcher compares the actual pipeline terminate state with expected. By default, PipelineState.DONE is used as expected state.
-
class
apache_beam.testing.pipeline_verifiers.
FileChecksumMatcher
(file_path, expected_checksum, sleep_secs=None)[source]¶ Bases:
hamcrest.core.base_matcher.BaseMatcher
Matcher that verifies file(s) content by comparing file checksum.
Use apache_beam.io.filebasedsink to fetch file(s) from given path. File checksum is a hash string computed from content of file(s).
apache_beam.testing.test_pipeline module¶
Test Pipeline, a wrapper of Pipeline for test purpose
-
class
apache_beam.testing.test_pipeline.
TestPipeline
(runner=None, options=None, argv=None, is_integration_test=False, blocking=True)[source]¶ Bases:
apache_beam.pipeline.Pipeline
TestPipeline class is used inside of Beam tests that can be configured to run against pipeline runner.
It has a functionality to parse arguments from command line and build pipeline options for tests who runs against a pipeline runner and utilizes resources of the pipeline runner. Those test functions are recommended to be tagged by @attr(“ValidatesRunner”) annotation.
In order to configure the test with customized pipeline options from command line, system argument ‘test-pipeline-options’ can be used to obtains a list of pipeline options. If no options specified, default value will be used.
For example, use following command line to execute all ValidatesRunner tests:
python setup.py nosetests -a ValidatesRunner --test-pipeline-options="--runner=DirectRunner --job_name=myJobName --num_workers=1"
For example, use assert_that for test validation:
pipeline = TestPipeline() pcoll = ... assert_that(pcoll, equal_to(...)) pipeline.run()
-
get_full_options_as_args
(**extra_opts)[source]¶ Get full pipeline options as an argument list.
Append extra pipeline options to existing option list if provided. Test verifier (if contains in extra options) should be pickled before appending, and will be unpickled later in the TestRunner.
-
apache_beam.testing.test_stream module¶
Provides TestStream for verifying streaming runner semantics.
For internal use only; no backwards-compatibility guarantees.
-
class
apache_beam.testing.test_stream.
Event
[source]¶ Bases:
object
Test stream event to be emitted during execution of a TestStream.
-
class
apache_beam.testing.test_stream.
ElementEvent
(timestamped_values)[source]¶ Bases:
apache_beam.testing.test_stream.Event
Element-producing test stream event.
-
class
apache_beam.testing.test_stream.
WatermarkEvent
(new_watermark)[source]¶ Bases:
apache_beam.testing.test_stream.Event
Watermark-advancing test stream event.
-
class
apache_beam.testing.test_stream.
ProcessingTimeEvent
(advance_by)[source]¶ Bases:
apache_beam.testing.test_stream.Event
Processing time-advancing test stream event.
-
class
apache_beam.testing.test_stream.
TestStream
(coder=<class 'apache_beam.coders.coders.FastPrimitivesCoder'>)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
Test stream that generates events on an unbounded PCollection of elements.
Each event emits elements, advances the watermark or advances the processing time. After all of the specified elements are emitted, ceases to produce output.
-
add_elements
(elements)[source]¶ Add elements to the TestStream.
Elements added to the TestStream will be produced during pipeline execution. These elements can be TimestampedValue, WindowedValue or raw unwrapped elements that are serializable using the TestStream’s specified Coder. When a TimestampedValue or a WindowedValue element is used, the timestamp of the TimestampedValue or WindowedValue will be the timestamp of the produced element; otherwise, the current watermark timestamp will be used for that element. The windows of a given WindowedValue are ignored by the TestStream.
-
advance_processing_time
(advance_by)[source]¶ Advance the current processing time by a given duration in seconds.
The duration must be a positive second duration and should be given as an int, float or utils.timestamp.Duration object.
-
apache_beam.testing.test_utils module¶
Utility methods for testing
For internal use only; no backwards-compatibility guarantees.
-
apache_beam.testing.test_utils.
compute_hash
(content, hashing_alg='sha1')[source]¶ Compute a hash value from a list of string.
-
apache_beam.testing.test_utils.
patch_retry
(testcase, module)[source]¶ A function to patch retry module to use mock clock and logger.
Clock and logger that defined in retry decorator will be replaced in test in order to skip sleep phase when retry happens.
Parameters: - testcase – An instance of unittest.TestCase that calls this function to patch retry module.
- module – The module that uses retry and need to be replaced with mock clock and logger in test.
apache_beam.testing.util module¶
Utilities for testing Beam pipelines.
-
apache_beam.testing.util.
assert_that
(actual, matcher, label='assert_that')[source]¶ A PTransform that checks a PCollection has an expected value.
Note that assert_that should be used only for testing pipelines since the check relies on materializing the entire PCollection being checked.
Parameters: - actual – A PCollection.
- matcher – A matcher function taking as argument the actual value of a materialized PCollection. The matcher validates this actual value against expectations and raises BeamAssertException if they are not met.
- label – Optional string label. This is needed in case several assert_that transforms are introduced in the same pipeline.
Returns: Ignored.