Class ClickHouseIO
Writing to ClickHouse
To write to ClickHouse, use write(String, String, String), which writes
elements from input PCollection. It's required that your ClickHouse cluster already has
table you are going to insert into.
// New way (recommended):
Properties props = new Properties();
props.setProperty("user", "admin");
props.setProperty("password", "secret");
pipeline
.apply(...)
.apply(
ClickHouseIO.<POJO>write("http://localhost:8123", "default", "my_table")
.withProperties(props));
// Old way (deprecated):
pipeline
.apply(...)
.apply(
ClickHouseIO.<POJO>write("jdbc:clickhouse://localhost:8123/default", "my_table"));
Optionally, you can provide connection settings, for instance, specify insert block size with
ClickHouseIO.Write.withMaxInsertBlockSize(long), or configure number of retries with ClickHouseIO.Write.withMaxRetries(int).
Deduplication
Deduplication is performed by ClickHouse if inserting to ReplicatedMergeTree
or Distributed
table on top of ReplicatedMergeTree. Without replication, inserting into regular MergeTree can
produce duplicates, if insert fails, and then successfully retries. However, each block is
inserted atomically, and you can configure block size with ClickHouseIO.Write.withMaxInsertBlockSize(long).
Deduplication is performed using checksums of inserted blocks. For SharedMergeTree tables in ClickHouse Cloud, deduplication behavior is similar to ReplicatedMergeTree. For more information about deduplication, please visit the Deduplication strategies documentation
Mapping between Beam and ClickHouse types
Nullable row columns are supported through Nullable type in ClickHouse. Low cardinality hint is supported through LowCardinality DataType in ClickHouse.
Nested rows should be unnested using Select.flattenedSchema(). Type casting should be
done using Cast before ClickHouseIO.
-
Nested Class Summary
Nested Classes -
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionstatic TableSchemagetTableSchema(String jdbcUrl, String table) Deprecated.static TableSchemagetTableSchema(String clickHouseUrl, String database, String table, Properties properties) ReturnsTableSchemafor a given table using ClickHouse Java Client.static <T> ClickHouseIO.Write<T> Deprecated.Usewrite(String, String, String)with explicit URL, database, and tablestatic <T> ClickHouseIO.Write<T>
-
Field Details
-
DEFAULT_MAX_INSERT_BLOCK_SIZE
public static final long DEFAULT_MAX_INSERT_BLOCK_SIZE- See Also:
-
DEFAULT_MAX_RETRIES
public static final int DEFAULT_MAX_RETRIES- See Also:
-
DEFAULT_MAX_CUMULATIVE_BACKOFF
-
DEFAULT_INITIAL_BACKOFF
-
-
Constructor Details
-
ClickHouseIO
public ClickHouseIO()
-
-
Method Details
-
write
Deprecated.Usewrite(String, String, String)with explicit URL, database, and tableCreates a write transform using a JDBC URL format.Deprecated: Use
write(String, String, String)instead with separate URL, database, and table parameters.This method is provided for backward compatibility. It parses the JDBC URL to extract the connection URL, database name, and any connection properties specified in the query string. Properties can be overridden later using
ClickHouseIO.Write.withProperties(Properties).Example:
// Old way (deprecated): ClickHouseIO.write("jdbc:clickhouse://localhost:8123/mydb?user=admin&password=secret", "table") // New way: ClickHouseIO.write("http://localhost:8123", "mydb", "table") .withProperties(props)Property Precedence: Properties from the JDBC URL can be overridden by calling
ClickHouseIO.Write.withProperties(Properties). Later calls to withProperties() override earlier settings.- Parameters:
jdbcUrl- JDBC connection URL (e.g., jdbc:clickhouse://host:port/database?param=value)table- table name- Returns:
- a
PTransformwriting data to ClickHouse
-
write
-
getTableSchema
Deprecated.UsegetTableSchema(String, String, String, Properties)with explicit parametersReturnsTableSchemafor a given table using JDBC URL format.Deprecated: Use
getTableSchema(String, String, String, Properties)instead with separate URL, database, table, and properties parameters.This method parses the JDBC URL to extract connection details and properties. For new code, use the explicit parameter version for better clarity and control.
Example migration:
// Old way (deprecated): TableSchema schema = ClickHouseIO.getTableSchema( "jdbc:clickhouse://localhost:8123/mydb?user=admin", "my_table"); // New way: Properties props = new Properties(); props.setProperty("user", "admin"); TableSchema schema = ClickHouseIO.getTableSchema( "http://localhost:8123", "mydb", "my_table", props);- Parameters:
jdbcUrl- JDBC connection URL (e.g., jdbc:clickhouse://host:port/database?param=value)table- table name- Returns:
- table schema
-
getTableSchema
public static TableSchema getTableSchema(String clickHouseUrl, String database, String table, Properties properties) ReturnsTableSchemafor a given table using ClickHouse Java Client.- Parameters:
clickHouseUrl- ClickHouse connection urldatabase- ClickHouse databasetable- table nameproperties- connection properties- Returns:
- table schema
- Since:
- 2.72.0
-
getTableSchema(String, String, String, Properties)with explicit parameters