Class AvroUtils
java.lang.Object
org.apache.beam.sdk.extensions.avro.schemas.utils.AvroUtils
Utils to convert AVRO records to Beam rows. Imposes a mapping between common avro types and Beam
portable schemas (https://s.apache.org/beam-schemas):
Avro Beam Field Type INT invalid input: '<'-----> INT32 LONG invalid input: '<'-----> INT64 FLOAT invalid input: '<'-----> FLOAT DOUBLE invalid input: '<'-----> DOUBLE BOOLEAN invalid input: '<'-----> BOOLEAN STRING invalid input: '<'-----> STRING BYTES invalid input: '<'-----> BYTES invalid input: '<'------ LogicalType(urn="beam:logical_type:var_bytes:v1") FIXED invalid input: '<'-----> LogicalType(urn="beam:logical_type:fixed_bytes:v1") ARRAY invalid input: '<'-----> ARRAY ENUM invalid input: '<'-----> LogicalType(EnumerationType) MAP invalid input: '<'-----> MAP RECORD invalid input: '<'-----> ROW UNION invalid input: '<'-----> LogicalType(OneOfType) LogicalTypes.Date invalid input: '<'-----> LogicalType(DATE) invalid input: '<'------ LogicalType(urn="beam:logical_type:date:v1") LogicalTypes.TimestampMillis invalid input: '<'-----> DATETIME LogicalTypes.Decimal invalid input: '<'-----> DECIMALFor SQL CHAR/VARCHAR types, an Avro schema
LogicalType({"type":"string","logicalType":"char","maxLength":MAX_LENGTH}) or
LogicalType({"type":"string","logicalType":"varchar","maxLength":MAX_LENGTH})
is used.-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic classstatic classstatic classstatic classWrapper for fixed byte fields.static class -
Method Summary
Modifier and TypeMethodDescriptionstatic voidconvertAvroFieldStrict(@PolyNull Object value, Schema avroSchema, Schema.FieldType fieldType) Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion.convertAvroFieldStrict(@PolyNull Object value, Schema avroSchema, Schema.FieldType fieldType, GenericData genericData) Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion.static SimpleFunction<byte[], Row> getAvroBytesToRowFunction(Schema beamSchema) Returns a function mapping encoded AVROGenericRecords to BeamRows.static <T> SchemaUserTypeCreatorgetCreator(TypeDescriptor<T> typeDescriptor, Schema schema) Get an object creator for an AVRO-generated SpecificRecord.static <T> List<FieldValueTypeInformation> getFieldTypes(TypeDescriptor<T> typeDescriptor, Schema schema) Get field types for an AVRO-generated SpecificRecord or a POJO.static <T> SerializableFunction<Row, T> getFromRowFunction(Class<T> clazz) static SerializableFunction<GenericRecord, Row> Returns a function mapping AVROGenericRecords to BeamRows for use inPCollection.setSchema(org.apache.beam.sdk.schemas.Schema, org.apache.beam.sdk.values.TypeDescriptor<T>, org.apache.beam.sdk.transforms.SerializableFunction<T, org.apache.beam.sdk.values.Row>, org.apache.beam.sdk.transforms.SerializableFunction<org.apache.beam.sdk.values.Row, T>).static <T> List<FieldValueGetter<@NonNull T, Object>> getGetters(TypeDescriptor<T> typeDescriptor, Schema schema) Get generated getters for an AVRO-generated SpecificRecord or a POJO.static SimpleFunction<Row, byte[]> getRowToAvroBytesFunction(Schema beamSchema) Returns a function mapping BeamRows to encoded AVROGenericRecords.static SerializableFunction<Row, GenericRecord> getRowToGenericRecordFunction(@Nullable Schema avroSchema) Returns a function mapping BeamRows to AVROGenericRecords for use inPCollection.setSchema(org.apache.beam.sdk.schemas.Schema, org.apache.beam.sdk.values.TypeDescriptor<T>, org.apache.beam.sdk.transforms.SerializableFunction<T, org.apache.beam.sdk.values.Row>, org.apache.beam.sdk.transforms.SerializableFunction<org.apache.beam.sdk.values.Row, T>).static <T> SerializableFunction<T, Row> getToRowFunction(Class<T> clazz, Schema schema) static <T> SchemaCoder<T> schemaCoder(Class<T> clazz) Returns anSchemaCoderinstance for the provided element class.static <T> SchemaCoder<T> schemaCoder(Class<T> clazz, Schema schema) Returns anSchemaCoderinstance for the provided element type using the provided Avro schema.static SchemaCoder<GenericRecord> schemaCoder(Schema schema) Returns anSchemaCoderinstance for the Avro schema.static <T> SchemaCoder<T> schemaCoder(AvroCoder<T> avroCoder) Returns anSchemaCoderinstance based on the provided AvroCoder for the element type.static <T> SchemaCoder<T> schemaCoder(TypeDescriptor<T> type) Returns anSchemaCoderinstance for the provided element type.static Schema.FieldtoAvroField(Schema.Field field, String namespace) Get Avro Field from Beam Field.static SchematoAvroSchema(Schema beamSchema) static SchemaConverts a Beam Schema into an AVRO schema.static Schema.FieldtoBeamField(Schema.Field field) Get Beam Field from avro Field.static RowtoBeamRowStrict(GenericRecord record, @Nullable Schema schema) Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion.static RowtoBeamRowStrict(GenericRecord record, @Nullable Schema schema, @Nullable GenericData genericData) Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion.static SchematoBeamSchema(Class<?> clazz) Converts AVRO schema to Beam row schema.static SchematoBeamSchema(Schema schema) Converts AVRO schema to Beam row schema.static GenericRecordtoGenericRecord(Row row) Convert from a Beam Row to an AVRO GenericRecord.static GenericRecordtoGenericRecord(Row row, @Nullable Schema avroSchema) Convert from a Beam Row to an AVRO GenericRecord.
-
Method Details
-
addLogicalTypeConversions
-
toBeamField
Get Beam Field from avro Field. -
toAvroField
Get Avro Field from Beam Field. -
toBeamSchema
Converts AVRO schema to Beam row schema.- Parameters:
clazz- avro class
-
toBeamSchema
Converts AVRO schema to Beam row schema.- Parameters:
schema- schema of type RECORD
-
toAvroSchema
public static Schema toAvroSchema(Schema beamSchema, @Nullable String name, @Nullable String namespace) Converts a Beam Schema into an AVRO schema. -
toAvroSchema
-
toBeamRowStrict
public static Row toBeamRowStrict(GenericRecord record, @Nullable Schema schema, @Nullable GenericData genericData) Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion. If Schema is not provided, one is inferred from the AVRO schema. -
toBeamRowStrict
Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion. If Schema is not provided, one is inferred from the AVRO schema. -
toGenericRecord
Convert from a Beam Row to an AVRO GenericRecord. The Avro Schema is inferred from the Beam schema on the row. -
toGenericRecord
Convert from a Beam Row to an AVRO GenericRecord. If a Schema is not provided, one is inferred from the Beam schema on the row. -
getToRowFunction
-
getFromRowFunction
-
getSchema
-
getAvroBytesToRowFunction
Returns a function mapping encoded AVROGenericRecords to BeamRows. -
getRowToAvroBytesFunction
Returns a function mapping BeamRows to encoded AVROGenericRecords. -
getGenericRecordToRowFunction
public static SerializableFunction<GenericRecord,Row> getGenericRecordToRowFunction(@Nullable Schema schema) Returns a function mapping AVROGenericRecords to BeamRows for use inPCollection.setSchema(org.apache.beam.sdk.schemas.Schema, org.apache.beam.sdk.values.TypeDescriptor<T>, org.apache.beam.sdk.transforms.SerializableFunction<T, org.apache.beam.sdk.values.Row>, org.apache.beam.sdk.transforms.SerializableFunction<org.apache.beam.sdk.values.Row, T>). -
getRowToGenericRecordFunction
public static SerializableFunction<Row,GenericRecord> getRowToGenericRecordFunction(@Nullable Schema avroSchema) Returns a function mapping BeamRows to AVROGenericRecords for use inPCollection.setSchema(org.apache.beam.sdk.schemas.Schema, org.apache.beam.sdk.values.TypeDescriptor<T>, org.apache.beam.sdk.transforms.SerializableFunction<T, org.apache.beam.sdk.values.Row>, org.apache.beam.sdk.transforms.SerializableFunction<org.apache.beam.sdk.values.Row, T>). -
schemaCoder
Returns anSchemaCoderinstance for the provided element type.- Type Parameters:
T- the element type
-
schemaCoder
Returns anSchemaCoderinstance for the provided element class.- Type Parameters:
T- the element type
-
schemaCoder
Returns anSchemaCoderinstance for the Avro schema. The implicit type is GenericRecord. -
schemaCoder
Returns anSchemaCoderinstance for the provided element type using the provided Avro schema.If the type argument is GenericRecord, the schema may be arbitrary. Otherwise, the schema must correspond to the type provided.
- Type Parameters:
T- the element type
-
schemaCoder
Returns anSchemaCoderinstance based on the provided AvroCoder for the element type.- Type Parameters:
T- the element type
-
getFieldTypes
public static <T> List<FieldValueTypeInformation> getFieldTypes(TypeDescriptor<T> typeDescriptor, Schema schema) Get field types for an AVRO-generated SpecificRecord or a POJO. -
getGetters
public static <T> List<FieldValueGetter<@NonNull T,Object>> getGetters(TypeDescriptor<T> typeDescriptor, Schema schema) Get generated getters for an AVRO-generated SpecificRecord or a POJO. -
getCreator
Get an object creator for an AVRO-generated SpecificRecord. -
convertAvroFieldStrict
public static @PolyNull Object convertAvroFieldStrict(@PolyNull Object value, @Nonnull Schema avroSchema, @Nonnull Schema.FieldType fieldType, @Nonnull GenericData genericData) Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion.- Parameters:
value-GenericRecordor any nested valueavroSchema- schema for valuefieldType- target beam field typegenericData-GenericDatainstance to use for conversions- Returns:
- value converted for
Row
-
convertAvroFieldStrict
public static @PolyNull Object convertAvroFieldStrict(@PolyNull Object value, @Nonnull Schema avroSchema, @Nonnull Schema.FieldType fieldType) Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion.- Parameters:
value-GenericRecordor any nested valueavroSchema- schema for valuefieldType- target beam field type- Returns:
- value converted for
Row
-