public class JdbcIO
extends java.lang.Object
JdbcIO source returns a bounded collection of T as a PCollection<T>. T is the
 type returned by the provided JdbcIO.RowMapper.
 
To configure the JDBC source, you have to provide a JdbcIO.DataSourceConfiguration using
 1. JdbcIO.DataSourceConfiguration.create(DataSource)(which must be Serializable);
 2. or JdbcIO.DataSourceConfiguration.create(String, String)(driver class name and url).
 Optionally, JdbcIO.DataSourceConfiguration.withUsername(String) and JdbcIO.DataSourceConfiguration.withPassword(String) allows you to define username and password.
 
For example:
 pipeline.apply(JdbcIO.<KV<Integer, String>>read()
   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
          "com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb")
        .withUsername("username")
        .withPassword("password"))
   .withQuery("select id,name from Person")
   .withRowMapper(new JdbcIO.RowMapper<KV<Integer, String>>() {
     public KV<Integer, String> mapRow(ResultSet resultSet) throws Exception {
       return KV.of(resultSet.getInt(1), resultSet.getString(2));
     }
   })
 );
 Query parameters can be configured using a user-provided JdbcIO.StatementPreparator. For
 example:
 
 pipeline.apply(JdbcIO.<KV<Integer, String>>read()
   .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
       "com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb",
       "username", "password"))
   .withQuery("select id,name from Person where name = ?")
   .withStatementPreparator(new JdbcIO.StatementPreparator() {
     public void setParameters(PreparedStatement preparedStatement) throws Exception {
       preparedStatement.setString(1, "Darwin");
     }
   })
   .withRowMapper(new JdbcIO.RowMapper<KV<Integer, String>>() {
     public KV<Integer, String> mapRow(ResultSet resultSet) throws Exception {
       return KV.of(resultSet.getInt(1), resultSet.getString(2));
     }
   })
 );
 To customize the building of the DataSource we can provide a SerializableFunction. For example if you need to provide a PoolingDataSource from an
 existing JdbcIO.DataSourceConfiguration: you can use a JdbcIO.PoolableDataSourceProvider:
 
 pipeline.apply(JdbcIO.<KV<Integer, String>>read()
   .withDataSourceProviderFn(JdbcIO.PoolableDataSourceProvider.of(
       JdbcIO.DataSourceConfiguration.create(
           "com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb",
           "username", "password")))
    // ...
 );
 By default, the provided function requests a DataSource per execution thread. In some
 circumstances this can quickly overwhelm the database by requesting too many connections. In that
 case you should look into sharing a single instance of a PoolingDataSource across all the
 execution threads. For example:
 
 private static class MyDataSourceProviderFn implements SerializableFunction<Void, DataSource> {
   private static transient DataSource dataSource;
   @Override
   public synchronized DataSource apply(Void input) {
     if (dataSource == null) {
       dataSource = ... build data source ...
     }
     return dataSource;
   }
 }
 
 pipeline.apply(JdbcIO.<KV<Integer, String>>read()
   .withDataSourceProviderFn(new MyDataSourceProviderFn())
   // ...
 );
 Beam supports partitioned reading of all data from a table. Automatic partitioning is
 supported for a few data types: Long, DateTime, String. To
 enable this, use readWithPartitions(TypeDescriptor).
 
The partitioning scheme depends on these parameters, which can be user-provided, or automatically inferred by Beam (for the supported types):
Math.floor(Math.sqrt(numRows) / 5)).
 To trigger auto-inference of these parameters, the user just needs to not provide them. To infer them automatically, Beam runs either of these statements:
SELECT min(column), max(column), COUNT(*) from table when none of the parameters is
       passed to the transform.
   SELECT min(column), max(column) from table when only number of partitions is
       provided, but not upper or lower bounds.
 Should I use this transform? Consider using this transform in the following situations:
The following example shows usage of auto-inferred ranges, number of partitions, and schema
 pipeline.apply(JdbcIO.<Row>readWithPartitions()
  .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
         "com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb")
       .withUsername("username")
       .withPassword("password"))
  .withTable("Person")
  .withPartitionColumn("id")
  .withRowOutput()
 );
 Instead of a full table you could also use a subquery in parentheses. The subquery can be specified using Table option instead and partition columns can be qualified using the subquery alias provided as part of Table. Note that a subquery may not perform as well with auto-inferred ranges and partitions, because it may not rely on indices to speed up the partitioning.
 pipeline.apply(JdbcIO.<KV<Integer, String>>readWithPartitions()
  .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
         "com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb")
       .withUsername("username")
       .withPassword("password"))
  .withTable("(select id, name from Person) as subq")
  .withPartitionColumn("id")
  .withLowerBound(0)
  .withUpperBound(1000)
  .withNumPartitions(5)
  .withRowMapper(new JdbcIO.RowMapper<KV<Integer, String>>() {
    public KV<Integer, String> mapRow(ResultSet resultSet) throws Exception {
      return KV.of(resultSet.getInt(1), resultSet.getString(2));
    }
  })
 );
 JDBC sink supports writing records into a database. It writes a PCollection to the
 database by converting each T into a PreparedStatement via a user-provided JdbcIO.PreparedStatementSetter.
 
Like the source, to configure the sink, you have to provide a JdbcIO.DataSourceConfiguration.
 
 pipeline
   .apply(...)
   .apply(JdbcIO.<KV<Integer, String>>write()
      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
            "com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb")
          .withUsername("username")
          .withPassword("password"))
      .withStatement("insert into Person values(?, ?)")
      .withPreparedStatementSetter(new JdbcIO.PreparedStatementSetter<KV<Integer, String>>() {
        public void setParameters(KV<Integer, String> element, PreparedStatement query)
          throws SQLException {
          query.setInt(1, element.getKey());
          query.setString(2, element.getValue());
        }
      })
    );
 NB: in case of transient failures, Beam runners may execute parts of JdbcIO.Write multiple
 times for fault tolerance. Because of that, you should avoid using INSERT statements,
 since that risks duplicating records in the database, or failing due to primary key conflicts.
 Consider using MERGE ("upsert")
 statements supported by your database instead.
| Modifier and Type | Class and Description | 
|---|---|
| static class  | JdbcIO.DataSourceConfigurationA POJO describing a  DataSource, either providing directly aDataSourceor all
 properties allowing to create aDataSource. | 
| static class  | JdbcIO.DataSourceProviderFromDataSourceConfigurationWraps a  JdbcIO.DataSourceConfigurationto provide aDataSource. | 
| static class  | JdbcIO.DefaultRetryStrategyThis is the default  Predicatewe use to detect DeadLock. | 
| static class  | JdbcIO.PoolableDataSourceProviderWraps a  JdbcIO.DataSourceConfigurationto provide aPoolingDataSource. | 
| static interface  | JdbcIO.PreparedStatementSetter<T>An interface used by the JdbcIO Write to set the parameters of the  PreparedStatementused to setParameters into the database. | 
| static class  | JdbcIO.Read<T>Implementation of  read(). | 
| static class  | JdbcIO.ReadAll<ParameterT,OutputT>Implementation of  readAll(). | 
| static class  | JdbcIO.ReadRowsImplementation of  readRows(). | 
| static class  | JdbcIO.ReadWithPartitions<T,PartitionColumnT> | 
| static class  | JdbcIO.RetryConfigurationBuilder used to help with retry configuration for  JdbcIO. | 
| static interface  | JdbcIO.RetryStrategyAn interface used to control if we retry the statements when a  SQLExceptionoccurs. | 
| static interface  | JdbcIO.RowMapper<T>An interface used by  JdbcIO.Readfor converting each row of theResultSetinto
 an element of the resultingPCollection. | 
| static interface  | JdbcIO.StatementPreparatorAn interface used by the JdbcIO Write to set the parameters of the  PreparedStatementused to setParameters into the database. | 
| static class  | JdbcIO.Write<T>This class is used as the default return value of  write(). | 
| static class  | JdbcIO.WriteVoid<T>A  PTransformto write to a JDBC datasource. | 
| static class  | JdbcIO.WriteWithResults<T,V extends JdbcWriteResult>A  PTransformto write to a JDBC datasource. | 
| Modifier and Type | Method and Description | 
|---|---|
| static <T> JdbcIO.Read<T> | read()Read data from a JDBC datasource. | 
| static <ParameterT,OutputT> | readAll()Like  read(), but executes multiple instances of the query substituting each element of aPCollectionas query parameters. | 
| static JdbcIO.ReadRows | readRows()Read Beam  Rows from a JDBC data source. | 
| static <T> JdbcIO.ReadWithPartitions<T,java.lang.Long> | readWithPartitions() | 
| static <T,PartitionColumnT> | readWithPartitions(TypeDescriptor<PartitionColumnT> partitioningColumnType)Like  readAll(), but executes multiple instances of the query on the same table
 (subquery) using ranges. | 
| static <T> JdbcIO.Write<T> | write()Write data to a JDBC datasource. | 
| static <T> JdbcIO.WriteVoid<T> | writeVoid() | 
public static <T> JdbcIO.Read<T> read()
T - Type of the data to be read.public static JdbcIO.ReadRows readRows()
Rows from a JDBC data source.public static <ParameterT,OutputT> JdbcIO.ReadAll<ParameterT,OutputT> readAll()
read(), but executes multiple instances of the query substituting each element of a
 PCollection as query parameters.ParameterT - Type of the data representing query parameters.OutputT - Type of the data to be read.public static <T,PartitionColumnT> JdbcIO.ReadWithPartitions<T,PartitionColumnT> readWithPartitions(TypeDescriptor<PartitionColumnT> partitioningColumnType)
readAll(), but executes multiple instances of the query on the same table
 (subquery) using ranges.T - Type of the data to be read.public static <T> JdbcIO.ReadWithPartitions<T,java.lang.Long> readWithPartitions()
public static <T> JdbcIO.Write<T> write()
T - Type of the data to be written.public static <T> JdbcIO.WriteVoid<T> writeVoid()