Class GroupByEncryptedKey<K,V>
- All Implemented Interfaces:
Serializable, HasDisplayData
PTransform that provides a secure alternative to GroupByKey.
This transform encrypts the keys of the input PCollection, performs a GroupByKey on the encrypted keys, and then decrypts the keys in
the output. This is useful when the keys contain sensitive data that should not be stored at rest
by the runner.
The transform requires a Secret that returns a base64-encoded 32-byte secret, which is
used to construct a SecretKeySpec object for the HmacSHA256 algorithm.
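As a rough illustration of the secret format described above, the following JDK-only sketch generates 32 random bytes, base64-encodes them, and turns the decoded bytes into a SecretKeySpec for HmacSHA256. The class and method names (HmacSecretSketch, newBase64Secret, toHmacKey, hmac) are hypothetical and not part of Beam, and the use of the standard base64 alphabet is an assumption; check which variant your Secret implementation expects.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not part of Beam: shows the shape of the secret only.
final class HmacSecretSketch {
  /** Generates 32 random bytes and returns them base64 encoded. */
  public static String newBase64Secret() {
    byte[] raw = new byte[32];
    new SecureRandom().nextBytes(raw);
    // Assumption: standard base64 alphabet (not the URL-safe variant).
    return Base64.getEncoder().encodeToString(raw);
  }

  /** Decodes a base64 secret and wraps it as an HmacSHA256 key. */
  public static SecretKeySpec toHmacKey(String base64Secret) {
    byte[] raw = Base64.getDecoder().decode(base64Secret);
    if (raw.length != 32) {
      throw new IllegalArgumentException("expected 32 bytes, got " + raw.length);
    }
    return new SecretKeySpec(raw, "HmacSHA256");
  }

  /** Computes an HMAC-SHA256 tag over the given data. */
  public static byte[] hmac(SecretKeySpec key, byte[] data) {
    try {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(key);
      return mac.doFinal(data);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    SecretKeySpec key = toHmacKey(newBase64Secret());
    byte[] tag = hmac(key, "user-1".getBytes(StandardCharsets.UTF_8));
    System.out.println("tag length in bytes: " + tag.length);
  }
}
```

In practice the encoded string would be stored in a secret manager and surfaced through a Secret implementation rather than generated inside the pipeline.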
Note the following caveats:
1) Runners can implement arbitrary materialization steps, so this transform alone does not guarantee that the pipeline never holds unencrypted data at rest.
2) In streaming mode, this transform may not properly handle update compatibility checks around coders. An improper update could therefore lead to invalid coders, causing pipeline failure or data corruption. If you need to update, make sure that the input type passed into this transform does not change.
-
Field Summary
Fields inherited from class org.apache.beam.sdk.transforms.PTransform
annotations, displayData, name, resourceHints
-
Method Summary
Modifier and Type, Method, Description:
- static <K,V> GroupByEncryptedKey<K,V> create(org.apache.beam.sdk.util.Secret hmacKey)
Creates a GroupByEncryptedKey transform.
- static <K,V> GroupByEncryptedKey<K,V> createWithCustomGbk(org.apache.beam.sdk.util.Secret hmacKey, PTransform<PCollection<KV<byte[], KV<byte[], byte[]>>>, PCollection<KV<byte[], Iterable<KV<byte[], byte[]>>>>> gbk)
Creates a GroupByEncryptedKey transform with a custom GBK in the middle.
- PCollection<KV<K, Iterable<V>>> expand(PCollection<KV<K, V>> input)
Override this method to specify how this PTransform should be expanded on the given InputT.
Methods inherited from class org.apache.beam.sdk.transforms.PTransform
addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, populateDisplayData, setDisplayData, setResourceHints, toString, validate, validate
-
Method Details
-
create
Creates a GroupByEncryptedKey transform.
- Type Parameters:
K - The type of the keys in the input PCollection.
V - The type of the values in the input PCollection.
- Parameters:
hmacKey - The Secret key to use for encryption.
- Returns:
A GroupByEncryptedKey transform.
-
createWithCustomGbk
public static <K,V> GroupByEncryptedKey<K,V> createWithCustomGbk(org.apache.beam.sdk.util.Secret hmacKey, PTransform<PCollection<KV<byte[], KV<byte[], byte[]>>>, PCollection<KV<byte[], Iterable<KV<byte[], byte[]>>>>> gbk)
Creates a GroupByEncryptedKey transform with a custom GBK in the middle.
- Type Parameters:
K - The type of the keys in the input PCollection.
V - The type of the values in the input PCollection.
- Parameters:
hmacKey - The Secret key to use for encryption.
gbk - The custom GBK transform to use in the middle of the GBEK.
- Returns:
A GroupByEncryptedKey transform.
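The intermediate type accepted by the custom GBK, KV<byte[], KV<byte[], byte[]>>, suggests that elements are grouped on an opaque byte tag while the real key and value travel as encrypted bytes. The following plain-Java sketch illustrates that idea without using Beam at all. It is a guess at the general technique, not the transform's actual implementation: the class EncryptedGroupingSketch, its methods, the choice of AES-GCM for the payload, and the reuse of one secret for both HMAC and AES are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical, Beam-free illustration of grouping on an HMAC tag.
final class EncryptedGroupingSketch {
  private static final SecureRandom RANDOM = new SecureRandom();
  private static final int IV_BYTES = 12;

  // Deterministic tag: equal keys always produce the same grouping key.
  public static byte[] hmac(byte[] secret, byte[] data) {
    try {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(new SecretKeySpec(secret, "HmacSHA256"));
      return mac.doFinal(data);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  // Randomized AES-GCM (an assumption); output is the IV followed by the ciphertext.
  public static byte[] encrypt(byte[] secret, byte[] plain) {
    try {
      byte[] iv = new byte[IV_BYTES];
      RANDOM.nextBytes(iv);
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(secret, "AES"),
          new GCMParameterSpec(128, iv));
      byte[] ct = cipher.doFinal(plain);
      byte[] out = Arrays.copyOf(iv, iv.length + ct.length);
      System.arraycopy(ct, 0, out, iv.length, ct.length);
      return out;
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  public static byte[] decrypt(byte[] secret, byte[] blob) {
    try {
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(secret, "AES"),
          new GCMParameterSpec(128, blob, 0, IV_BYTES));
      return cipher.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Groups {key, value} string pairs; only tags and ciphertext are held while grouping. */
  public static Map<String, List<String>> group(byte[] secret, List<String[]> input) {
    // Tag each element with HMAC(key) and group on the tag. The payload
    // {enc(key), enc(value)} mirrors the KV<byte[], KV<byte[], byte[]>> shape.
    Map<ByteBuffer, List<byte[][]>> grouped = new LinkedHashMap<>();
    for (String[] kv : input) {
      byte[] key = kv[0].getBytes(StandardCharsets.UTF_8);
      byte[] value = kv[1].getBytes(StandardCharsets.UTF_8);
      byte[][] payload = {encrypt(secret, key), encrypt(secret, value)};
      grouped.computeIfAbsent(ByteBuffer.wrap(hmac(secret, key)), t -> new ArrayList<>())
          .add(payload);
    }
    // After grouping, decrypt one copy of the key and every value in the group.
    Map<String, List<String>> out = new LinkedHashMap<>();
    for (List<byte[][]> members : grouped.values()) {
      String key = new String(decrypt(secret, members.get(0)[0]), StandardCharsets.UTF_8);
      List<String> values = new ArrayList<>();
      for (byte[][] p : members) {
        values.add(new String(decrypt(secret, p[1]), StandardCharsets.UTF_8));
      }
      out.put(key, values);
    }
    return out;
  }
}
```

The tag has to be deterministic so that equal keys land in the same group, which is why the sketch uses an HMAC for grouping instead of the randomized ciphertext. A production design would normally derive separate keys for the HMAC and the cipher.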
-
expand
Description copied from class: PTransform
Override this method to specify how this PTransform should be expanded on the given InputT.
NOTE: This method should not be called directly. Instead, the PTransform should be applied to the InputT using the apply method.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
- Specified by:
expand in class PTransform<PCollection<KV<K,V>>, PCollection<KV<K, Iterable<V>>>>
-