Beam SDK for Python dependencies

The Beam SDKs depend on common third-party components, which in turn import additional dependencies. Version collisions can result in unexpected behavior in the service. If your code uses any of these packages, be aware that some libraries are not forward-compatible, and you may need to pin them to the listed versions that will be in scope during execution.

Dependencies for your Beam SDK version are listed in setup.py in the Beam repository. To view them, perform the following steps:

  1. Open setup.py.

    https://raw.githubusercontent.com/apache/beam/v<VERSION_NUMBER>/sdks/python/setup.py

    Replace <VERSION_NUMBER> with the major.minor.patch version of the SDK. For example, https://raw.githubusercontent.com/apache/beam/v2.19.0/sdks/python/setup.py lists the dependencies for the 2.19.0 release.

  2. Review the core dependency list under REQUIRED_PACKAGES.

    Note: If you require extra features such as gcp, test, or interactive, also review the lists under GCP_REQUIREMENTS, REQUIRED_TEST_PACKAGES, or INTERACTIVE_BEAM for additional dependencies.
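The steps above can also be scripted. The following is a minimal sketch, not an official Beam tool: it builds the setup.py URL for a given release and reads a named list out of the file's source text. The function names (setup_py_url, read_package_list, fetch_package_list) are hypothetical, and the parser assumes the list is assigned at module level as a plain literal list of strings. If a release builds the list dynamically, the sketch returns None rather than guessing.

```python
import ast
from urllib.request import urlopen

# URL pattern taken from step 1 above.
SETUP_PY_URL = (
    "https://raw.githubusercontent.com/apache/beam/"
    "v{version}/sdks/python/setup.py"
)


def setup_py_url(version):
    """Return the raw setup.py URL for a major.minor.patch SDK version."""
    return SETUP_PY_URL.format(version=version)


def read_package_list(setup_py_source, name="REQUIRED_PACKAGES"):
    """Return the list assigned to `name` in setup.py source text.

    Assumption: the list is a top-level assignment of a literal list.
    Returns None if the name is missing or the value is not a literal.
    """
    tree = ast.parse(setup_py_source)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == name:
                try:
                    # literal_eval only evaluates literals; it never runs code.
                    return ast.literal_eval(node.value)
                except (ValueError, TypeError):
                    return None
    return None


def fetch_package_list(version, name="REQUIRED_PACKAGES"):
    """Download setup.py for `version` and return the named list.

    Requires network access; not called automatically.
    """
    with urlopen(setup_py_url(version)) as response:
        source = response.read().decode("utf-8")
    return read_package_list(source, name)
```

For example, fetch_package_list("2.19.0") would download the 2.19.0 setup.py and return its REQUIRED_PACKAGES entries, provided the assumption about a literal list holds for that release. Passing name="GCP_REQUIREMENTS" reads the gcp extras in the same way.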

You can also retrieve the dependency list from the command line using the following process:

  1. Create a clean virtual environment on your local machine.

    Python 3:

    $ python3 -m venv env && source env/bin/activate
    

    Python 2:

    $ pip install virtualenv && virtualenv env && source env/bin/activate
    
  2. Install the Beam Python SDK.

    $ pip install apache-beam

  3. Retrieve the list of dependencies.

     $ pip install pipdeptree && pipdeptree -p apache-beam
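If you prefer to inspect the environment from Python instead of pipdeptree, the standard library can report what an installed distribution declares. This is a minimal sketch under two assumptions: it runs on Python 3.8 or later (where importlib.metadata is available), and it runs inside the virtual environment created above. The helper names are hypothetical. Unlike pipdeptree, it lists only the direct requirements declared by the package, not the full transitive tree.

```python
from importlib import metadata  # standard library in Python 3.8+


def declared_requirements(dist_name="apache-beam"):
    """Return the requirement strings an installed distribution declares.

    Returns an empty list if the distribution is not installed
    or declares no requirements.
    """
    try:
        requirements = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # requires() returns None when nothing is declared.
    return sorted(requirements or [])


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("apache-beam version:", installed_version("apache-beam"))
    for requirement in declared_requirements("apache-beam"):
        print(requirement)
```

Comparing installed_version for a package you import against the range printed for it is one way to spot a potential version collision before running a pipeline.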