Overview
This page contains technical details for users starting Go SDK pipelines on machines that are not running a linux operating system on an amd64 architecture.
Go is a statically compiled language. To execute a Go binary on a machine, it must be compiled for the matching operating system and processor architecture. This has implications for how Go SDK pipelines execute on workers.
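As a minimal sketch of what "matching" means here, the standard library's runtime package reports the operating system and architecture a binary was compiled for. The helper name needsCrossCompile is hypothetical, and the linux/amd64 target is an assumption taken from the worker platform discussed on this page.

```go
package main

import (
	"fmt"
	"runtime"
)

// needsCrossCompile is a hypothetical helper. It reports whether a binary
// built for goos/goarch differs from the linux/amd64 worker platform that
// this sketch assumes.
func needsCrossCompile(goos, goarch string) bool {
	return goos != "linux" || goarch != "amd64"
}

func main() {
	// runtime.GOOS and runtime.GOARCH describe the platform this binary targets.
	fmt.Printf("built for %s/%s\n", runtime.GOOS, runtime.GOARCH)
	fmt.Println("differs from linux/amd64:", needsCrossCompile(runtime.GOOS, runtime.GOARCH))
}
```

Running this on the launching machine shows whether a separately compiled worker binary is likely to be needed.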
Development: Using go run
When starting your in-development pipeline against a remote runner, you can use go run from your development environment.
The Go SDK will cross-compile your pipeline for linux-amd64 and use that as the pipeline’s worker binary.
Alternatively, some local runners support Loopback execution.
Setting the flag --environment_type=LOOPBACK can cause the runner to connect back to the local binary to serve as a worker.
This can simplify development and debugging, because log output is no longer hidden inside a container.
Production: Overriding the Worker Binary
Go SDK pipeline binaries have a --worker_binary flag to set the path to the desired worker binary.
This section will teach you how to use this flag for robust Go pipelines.
In production settings, it’s common to only have access to compiled artifacts. For Go SDK pipelines, you may need to have two: one for the launching platform, and one for the worker platform.
To run a Go program on a specific platform, that program must be built targeting that platform’s operating system and architecture.
The Go compiler can cross-compile for a target platform when you set the $GOOS and $GOARCH environment variables for your build.
For example, you may be launching a pipeline from an M1 MacBook, but running the jobs on a Flink cluster executing on linux VMs with amd64 processors.
In this situation, you would need to compile your pipeline binary for both darwin-arm64, for launching, and linux-amd64, for the workers.
# Build binary for the launching platform.
# This uses the defaults for your machine, so no new environment variables are needed.
# Build flags such as -o must come before the package path.
$ go build -o output/launcher path/to/my/pipeline
# Build binary for the worker platform, linux-amd64.
$ GOOS=linux GOARCH=amd64 go build -o output/worker path/to/my/pipeline
Execute the pipeline with the --worker_binary flag set to the desired binary.
# Launch the pipeline specifying the worker binary.
$ ./output/launcher --worker_binary=output/worker --runner=flink --endpoint=... <...other flags...>
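The two build steps above could also be scripted. The following is a rough sketch, not part of the Beam SDK: the helper crossBuild, the program name buildboth, and the output paths are illustrative assumptions. It relies only on the standard os/exec package and on the go tool being on your PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"runtime"
	"strings"
)

// crossBuild is a hypothetical helper. It returns a command that builds pkg
// into out for the given target platform. The -o flag is placed before the
// package path, as the go tool requires.
func crossBuild(goos, goarch, out, pkg string) *exec.Cmd {
	cmd := exec.Command("go", "build", "-o", out, pkg)
	// When Env contains duplicate keys, os/exec uses the last value, so these
	// entries win over any GOOS/GOARCH inherited from the environment.
	cmd.Env = append(os.Environ(), "GOOS="+goos, "GOARCH="+goarch)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: buildboth <package path>")
		return
	}
	pkg := os.Args[1]
	targets := []struct{ goos, goarch, out string }{
		// Launching platform: the platform this script itself was built for.
		{runtime.GOOS, runtime.GOARCH, "output/launcher"},
		// Worker platform assumed in this sketch.
		{"linux", "amd64", "output/worker"},
	}
	for _, t := range targets {
		cmd := crossBuild(t.goos, t.goarch, t.out, pkg)
		fmt.Printf("GOOS=%s GOARCH=%s %s\n", t.goos, t.goarch, strings.Join(cmd.Args, " "))
		if err := cmd.Run(); err != nil {
			fmt.Fprintln(os.Stderr, "build failed:", err)
			os.Exit(1)
		}
	}
}
```

Run from the module that contains the pipeline, this would leave both binaries under output/, ready to be passed to the launcher and to --worker_binary.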
SDK Containers
Apache Beam releases SDK-specific containers for runners to use to launch workers. These containers provision and initialize the worker binary as appropriate for the SDK.
At present, Go SDK worker containers are only built for the linux-amd64 platform.
See Issue 20807 for the current state of ARM64 container support.
Because Go is statically compiled, there are no runtime dependencies on a specific Go version for a container. The Go release used to compile your binary will be what your workers execute. Be sure to update to a recent Go release for best performance.