Apache Beam Go SDK Quickstart

This quickstart walks you through running your first Beam pipeline, the WordCount example written with Beam’s Go SDK, on a runner of your choice.

Set up your environment

The Beam SDK for Go requires Go version 1.10 or newer, which can be downloaded from the official Go downloads page. Check that you have version 1.10 or newer by running:

$ go version
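
If you would rather check the requirement from code, for instance in a setup script, the comparison can be sketched in Go as below. This is a minimal sketch, not part of the Beam SDK: the helper name atLeast is hypothetical, and it assumes version strings of the form "go1.10.3" as reported by runtime.Version(). Anything it cannot parse, such as a development build, is treated as not meeting the requirement.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"strings"
)

// atLeast (hypothetical helper) reports whether a Go version string such as
// "go1.10.3" is at least major.minor. Strings it cannot parse, for example
// development builds, yield false.
func atLeast(version string, major, minor int) bool {
	v := strings.TrimPrefix(version, "go")
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return false
	}
	gotMajor, err := strconv.Atoi(parts[0])
	if err != nil {
		return false
	}
	// The minor component may carry a suffix such as "10beta1" or "10rc2",
	// so keep only its leading digits.
	digits := parts[1]
	for i, r := range digits {
		if r < '0' || r > '9' {
			digits = digits[:i]
			break
		}
	}
	gotMinor, err := strconv.Atoi(digits)
	if err != nil {
		return false
	}
	return gotMajor > major || (gotMajor == major && gotMinor >= minor)
}

func main() {
	v := runtime.Version()
	fmt.Printf("%s meets the Go 1.10 requirement: %v\n", v, atLeast(v, 1, 10))
}
```

Running this with the same toolchain you plan to use for Beam reports whether that toolchain is new enough.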

Get the SDK and the examples

The easiest way to obtain the Apache Beam Go SDK is via go get:

$ go get -u github.com/apache/beam/sdks/go/...

For development of the Go SDK itself, see BUILD.md for details.

Run wordcount

The Apache Beam examples directory contains many examples. Each one can be run by passing the required arguments described in that example.

For example, to install wordcount and run it on the direct runner, which executes the pipeline locally:

$ go install github.com/apache/beam/sdks/go/examples/wordcount
$ wordcount --input <PATH_TO_INPUT_FILE> --output counts

To run the same binary on the Dataflow runner instead:

$ wordcount --input gs://dataflow-samples/shakespeare/kinglear.txt \
            --output gs://<your-gcs-bucket>/counts \
            --runner dataflow \
            --project your-gcp-project \
            --temp_location gs://<your-gcs-bucket>/tmp/ \
            --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
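
To make the result of the commands above easier to interpret, here is a rough sketch of the computation a word count performs, written in plain Go without the Beam SDK. It is an illustration only, not the example's actual source: the function names countWords and formatCounts are hypothetical, and both the tokenizing pattern and the "word: count" output format are assumptions about what the example produces.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// wordRE matches runs of letters with an optional apostrophe suffix.
// Assumption: this approximates how the wordcount example splits text.
var wordRE = regexp.MustCompile(`[a-zA-Z]+('[a-z])?`)

// countWords (hypothetical) returns how often each word occurs across lines.
// In a Beam pipeline the lines would come from the --input file and the
// counting would be distributed by the runner.
func countWords(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		for _, w := range wordRE.FindAllString(line, -1) {
			counts[w]++
		}
	}
	return counts
}

// formatCounts (hypothetical) renders the counts as sorted "word: n" lines.
// The sorting is only for readability here.
func formatCounts(counts map[string]int) []string {
	out := make([]string, 0, len(counts))
	for w, n := range counts {
		out = append(out, fmt.Sprintf("%s: %d", w, n))
	}
	sort.Strings(out)
	return out
}

func main() {
	lines := []string{"to be or not to be", "that is the question"}
	fmt.Println(strings.Join(formatCounts(countWords(lines)), "\n"))
}
```

The Beam version expresses the same steps as pipeline transforms so that the chosen runner can parallelize them. This sketch does everything in a single process.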

Next Steps

Please don’t hesitate to reach out if you encounter any issues!