Custom I/O patterns
This page describes common patterns in pipelines with custom I/O connectors. Custom I/O connectors connect pipelines to databases that aren’t supported by Beam’s built-in I/O transforms.
Choosing between built-in and custom connectors
Built-in I/O connectors are tested and hardened, so use them whenever possible. Only use custom I/O connectors when:
- No built-in options exist
- Your pipeline pulls in a small subset of source data
For instance, use a custom I/O connector to enrich pipeline elements with a small subset of source data. If you’re processing a sales order and adding information to each purchase, you can use a custom I/O connector to pull the small subset of data into your pipeline (instead of processing the entire source).
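Beam distributes work across many threads, so custom I/O connectors can increase your data source’s load average. You can reduce the load with the start bundle and finish bundle annotations (`@StartBundle` and `@FinishBundle` in the Java SDK, the `start_bundle` and `finish_bundle` methods in the Python SDK), which let you open and close connections once per bundle instead of once per element.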
Last updated on 2024/11/18