WordCount quickstart for Go
This quickstart walks you through running your first Beam pipeline, the WordCount example written with Beam’s Go SDK, on a runner of your choice.
If you’re interested in contributing to the Apache Beam Go codebase, see the Contribution Guide.
Set up your environment
The Beam SDK for Go requires Go version 1.19 or newer, which can be downloaded from the official Go website. Check that you have version 1.19 or newer by running go version.
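The version check can also be scripted, for example in a setup script. The sketch below is one possible approach and not part of the Beam documentation: go_version_ok is a hypothetical helper name, and it assumes the usual "go version goX.Y[.Z] os/arch" output format of release builds (development builds are rejected).

```shell
# Hypothetical helper: succeeds when a `go version` output line reports
# Go 1.19 or newer. Example input: "go version go1.21.5 linux/amd64".
go_version_ok() {
  # Extract "MAJOR MINOR" from the version line; empty if it does not match.
  ver=$(printf '%s\n' "$1" |
    sed -n 's/^go version go\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
  [ -n "$ver" ] || return 1
  major=${ver% *}
  minor=${ver#* }
  # Accept any major version above 1, or 1.x with x >= 19.
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 19 ]; }
}

# Usage: only runs the check when `go` is on PATH.
if command -v go >/dev/null 2>&1; then
  if go_version_ok "$(go version)"; then
    echo "Go version OK"
  else
    echo "Go 1.19 or newer is required"
  fi
fi
```

The helper takes the version line as an argument rather than calling go itself, so the parsing can be exercised without a Go installation.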
Get the SDK and the examples
The easiest way to obtain the Apache Beam Go SDK is with go get.
For development of the Go SDK itself, see BUILD.md for details.
Run wordcount
The Apache Beam examples directory has many examples. All examples can be run by passing the required arguments described in the examples.
For example, to run wordcount, run:
$ go install github.com/apache/beam/sdks/v2/go/examples/wordcount
# As part of the initial setup, non-Linux users must install the unix package before the first run.
$ go get -u golang.org/x/sys/unix
$ wordcount --input gs://dataflow-samples/shakespeare/kinglear.txt \
--output gs://<your-gcs-bucket>/counts \
--runner dataflow \
--project your-gcp-project \
--region your-gcp-region \
--temp_location gs://<your-gcs-bucket>/tmp/ \
--staging_location gs://<your-gcs-bucket>/binaries/ \
--worker_harness_container_image=apache/beam_go_sdk:latest
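The Dataflow command above repeats the bucket name in three flags, which makes typos easy. The sketch below is one way to derive those flags from three values; dataflow_flags is a hypothetical helper, not part of Beam, and it assumes the flag names shown above and the --flag=value form accepted by Go's flag parsing.

```shell
# Hypothetical helper: print the Dataflow flags from the command above,
# one per line, built from a project, a region, and a bucket name.
dataflow_flags() {
  project=$1 region=$2 bucket=$3
  [ -n "$project" ] && [ -n "$region" ] && [ -n "$bucket" ] || {
    echo "usage: dataflow_flags PROJECT REGION BUCKET" >&2
    return 1
  }
  printf '%s\n' \
    "--output=gs://$bucket/counts" \
    "--runner=dataflow" \
    "--project=$project" \
    "--region=$region" \
    "--temp_location=gs://$bucket/tmp/" \
    "--staging_location=gs://$bucket/binaries/" \
    "--worker_harness_container_image=apache/beam_go_sdk:latest"
}

# Usage (not executed here; the unquoted substitution relies on the values
# containing no spaces):
#   wordcount --input gs://dataflow-samples/shakespeare/kinglear.txt \
#     $(dataflow_flags your-gcp-project your-gcp-region your-gcs-bucket)
```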
# Build and run the Spark job server from Beam source.
# -PsparkMasterUrl is optional. If it is unset, the job runs inside an embedded Spark cluster.
$ ./gradlew :runners:spark:3:job-server:runShadow -PsparkMasterUrl=spark://localhost:7077
# In a separate terminal, run:
$ go install github.com/apache/beam/sdks/v2/go/examples/wordcount
$ wordcount --input <PATH_TO_INPUT_FILE> \
--output counts \
--runner spark \
--endpoint localhost:8099
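Once a run finishes, the results can be inspected with standard tools. The sketch below is a minimal example under two assumptions that you should check against your own run: each output line has the form "word: count", and the output was written to a single local file such as counts. top_words is a hypothetical helper, not part of Beam.

```shell
# Hypothetical helper: print the N most frequent words (default 10) from a
# wordcount output file whose lines are assumed to look like "word: count".
top_words() {
  file=$1
  n=${2:-10}
  # Split on ':' and sort numerically, descending, on the count field.
  sort -t: -k2,2nr "$file" | head -n "$n"
}

# Usage (not executed here):
#   top_words counts 5
```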
Next Steps
- Learn more about the Beam SDK for Go and look through the godoc.
- Walk through these WordCount examples in the WordCount Example Walkthrough.
- Take a self-paced tour through our Learning Resources.
- Dive into some of our favorite Videos and Podcasts.
- Join the Beam users@ mailing list.
Please don’t hesitate to reach out if you encounter any issues!
Last updated on 2023/05/31