java.lang.Object
org.apache.beam.sdk.io.gcp.healthcare.FhirIO

public class FhirIO extends Object
FhirIO provides an API for reading resources from and writing resources to the Google Cloud Healthcare FHIR API.

Reading

FHIR resources can be read with FhirIO.Read, which supports use cases where you have a PCollection of FHIR resource names in the format projects/{p}/locations/{l}/datasets/{d}/fhirStores/{f}/fhir/{resourceType}/{id}. This is appropriate for reading FHIR notifications from a Pub/Sub subscription with PubsubIO.readStrings(), or for cases where you have a manually prepared list of resources to process (e.g. in a text file read with TextIO).
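The resource name format above can be illustrated with a small helper. This is a minimal sketch using only the Java standard library; the class FhirResourceNames and its methods are hypothetical and are not part of FhirIO, and the sketch assumes that no path segment contains a '/' character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for the FHIR resource name format; not part of FhirIO. */
final class FhirResourceNames {
  // Assumption: every segment is non-empty and contains no '/' characters.
  private static final Pattern NAME =
      Pattern.compile(
          "projects/([^/]+)/locations/([^/]+)/datasets/([^/]+)"
              + "/fhirStores/([^/]+)/fhir/([^/]+)/([^/]+)");

  private FhirResourceNames() {}

  /** Assembles a resource name from its six components. */
  static String build(
      String project,
      String location,
      String dataset,
      String fhirStore,
      String resourceType,
      String id) {
    return String.format(
        "projects/%s/locations/%s/datasets/%s/fhirStores/%s/fhir/%s/%s",
        project, location, dataset, fhirStore, resourceType, id);
  }

  /** Returns true if the whole string matches the expected format. */
  static boolean isValid(String name) {
    return NAME.matcher(name).matches();
  }

  /** Extracts the {resourceType} segment, e.g. "Patient". */
  static String resourceType(String name) {
    Matcher m = NAME.matcher(name);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not a FHIR resource name: " + name);
    }
    return m.group(5);
  }

  public static void main(String[] args) {
    String name = build("my-project", "us-central1", "my-dataset", "my-store", "Patient", "123");
    System.out.println(name);
    System.out.println(resourceType(name));
  }
}
```

A check like isValid could be applied to a manually prepared list before it is handed to the read transform, so that malformed names are caught early instead of surfacing later as failed reads.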

FhirIO.Read fetches resource contents from the FHIR store based on the PCollection of FHIR resource name strings and returns a FhirIO.Read.Result. On this result you can call FhirIO.Read.Result.getResources() to retrieve a PCollection containing the successfully fetched JSON resources as Strings, and/or FhirIO.Read.Result.getFailedReads() to retrieve a PCollection of HealthcareIOError containing the resources that could not be fetched together with the exception. The failed reads can be written to the dead letter storage system of your choosing. This error handling mainly serves to surface errors transparently when the upstream PCollection contains FHIR resources that are not valid or are not reachable due to permissions issues.

Additionally, you can query an entire FHIR Patient resource's compartment (resources that refer to the patient, and are referred to by the patient) by calling getPatientEverything() to execute a FHIR GetPatientEverythingRequest.

Writing

Resources can be written to a FHIR store with either of two methods: Import or Execute Bundle. The list below also covers the Export, Deidentify, and Search operations.

  • Execute Bundle

    This is best for use cases where you are writing to a non-empty FHIR store with other clients or otherwise need referential integrity (e.g. a streaming HL7v2 to FHIR ETL pipeline).

  • Import

    This is best for use cases where you are populating an empty FHIR store with no other clients (e.g. a historical backfill on a new FHIR store). It is faster than the execute bundles method but does not respect referential integrity, and the resources are not written transactionally. It requires each resource to contain a client-provided ID. When using import, you must grant the appropriate permissions to the Google Cloud Healthcare Service Agent.

  • Export

    This exports FHIR resources from a FHIR store to Google Cloud Storage or BigQuery. The output is ndjson (newline-delimited JSON) of FHIR resources. When using export, you must grant the appropriate permissions to the Google Cloud Healthcare Service Agent.

  • Deidentify

    This de-identifies FHIR resources from a source FHIR store and writes the result to a destination FHIR store. The destination store must already exist.

  • Search

    This searches FHIR resources within a given FHIR store. The inputs are individual FHIR search queries, represented by the FhirSearchParameter class. The output of each search is a JSON array of FHIR resources in string form, with pagination handled, along with the optional input key.
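As an illustration of the ndjson output described under Export, the following minimal sketch splits newline-delimited JSON text into one string per resource. The class NdjsonLines is hypothetical and not part of FhirIO. It assumes each non-empty line holds exactly one complete JSON resource and does not validate the JSON itself.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper for newline-delimited JSON; not part of FhirIO. */
final class NdjsonLines {
  private NdjsonLines() {}

  /** Returns one trimmed string per non-empty line of the input. */
  static List<String> split(String ndjson) {
    return ndjson
        .lines() // Java 11+: handles \n, \r and \r\n line terminators
        .map(String::trim)
        .filter(line -> !line.isEmpty())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    String exported =
        "{\"resourceType\":\"Patient\",\"id\":\"1\"}\n"
            + "{\"resourceType\":\"Observation\",\"id\":\"2\"}\n";
    split(exported).forEach(System.out::println);
  }
}
```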

A PCollection of String can be ingested into a FHIR store using FhirIO.Write.fhirStoresImport(String, String, String, FhirIO.Import.ContentStructure). This returns a FhirIO.Write.Result on which you can call FhirIO.Write.Result.getFailedBodies() to retrieve a PCollection of HealthcareIOError containing the Strings that failed to be ingested and the exception.

Example


 Pipeline pipeline = ...;

 // Tail the FHIR store by retrieving resources based on Pub/Sub notifications.
 FhirIO.Read.Result readResult = pipeline
   .apply("Read FHIR notifications",
     PubsubIO.readStrings().fromSubscription(options.getNotificationSubscription()))
   .apply(FhirIO.readResources());

 // successfully retrieved resources
 PCollection<String> resources = readResult.getResources();
 // resource paths that couldn't be retrieved + error context
 PCollection<HealthcareIOError<String>> failedReads = readResult.getFailedReads();

 failedReads.apply("Write Resources / Stacktrace for Failed Reads to BigQuery",
     BigQueryIO
         .write()
         .to(options.getBQFhirExecuteBundlesDeadLetterTable())
         .withFormatFunction(new HealthcareIOErrorToTableRow()));

 PCollection<String> output = resources.apply("Happy path transformations", ...);
 FhirIO.Write.Result writeResult =
     output.apply("Execute FHIR Bundles", FhirIO.executeBundles(options.getExistingFhirStore()));

 PCollection<HealthcareIOError<String>> failedBundles = writeResult.getFailedInsertsWithErr();

 failedBundles.apply("Write failed bundles to BigQuery",
     BigQueryIO
         .write()
         .to(options.getBQFhirExecuteBundlesDeadLetterTable())
         .withFormatFunction(new HealthcareIOErrorToTableRow()));

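 // Alternatively you could use import for high throughput to a new store.
 // The temp and dead letter directory getters are placeholder pipeline options.
 output.apply("Import FHIR Resources",
     FhirIO.importResources(
         options.getNewFhirStore(),
         options.getImportTempDir(),
         options.getImportDeadLetterDir(),
         null)); // contentStructure is @Nullable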

 // Export FHIR resources to Google Cloud Storage or BigQuery.
 String fhirStoreName = ...;
 String exportUri = ...; // "gs://..." or "bq://..."
 PCollection<String> exportedResources =
     pipeline.apply(FhirIO.exportResources(fhirStoreName, exportUri));

 // De-identify FHIR resources.
 String sourceFhirStoreName = ...;
 String destinationFhirStoreName = ...;
 DeidentifyConfig deidConfig = new DeidentifyConfig(); // use default DeidentifyConfig
 pipeline.apply(FhirIO.deidentify(sourceFhirStoreName, destinationFhirStoreName, deidConfig));

 // Search FHIR resources using an "OR" query.
 Map<String, String> queries = new HashMap<>();
 queries.put("name", "Alice,Bob");
 FhirSearchParameter<String> searchParameter = FhirSearchParameter.of("Patient", queries);
 PCollection<FhirSearchParameter<String>> searchQueries =
     pipeline.apply(
         Create.of(searchParameter)
             .withCoder(FhirSearchParameterCoder.of(StringUtf8Coder.of())));
 FhirIO.Search.Result searchResult =
      searchQueries.apply(FhirIO.searchResources(options.getFhirStore()));
 PCollection<JsonArray> searchResults = searchResult.getResources(); // JsonArray of results

 // Search FHIR resources using an "AND" query with a key.
 Map<String, List<String>> listQueries = new HashMap<>();
 listQueries.put("name", Arrays.asList("Alice", "Bob"));
 FhirSearchParameter<List<String>> listSearchParameter =
      FhirSearchParameter.of("Patient", "Alice-Bob-Search", listQueries);
 PCollection<FhirSearchParameter<List<String>>> listSearchQueries =
     pipeline.apply(
         Create.of(listSearchParameter)
             .withCoder(FhirSearchParameterCoder.of(ListCoder.of(StringUtf8Coder.of()))));
 FhirIO.Search.Result listSearchResult =
      listSearchQueries.apply(
          FhirIO.searchResourcesWithGenericParameters(options.getFhirStore()));
 PCollection<KV<String, JsonArray>> listResource =
      listSearchResult.getKeyedResources(); // KV<"Alice-Bob-Search", JsonArray of results>
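
The "OR" and "AND" search examples above follow the FHIR search convention in which comma-separated values of one parameter are combined with OR, while repeating the parameter combines the values with AND. The minimal sketch below renders both forms as query strings to make the difference visible. The class FhirSearchQueryStrings is hypothetical and not part of FhirIO; how FhirIO itself transmits the parameters is not shown here, and the values are assumed to need no URL encoding.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of FHIR search OR/AND semantics; not part of FhirIO. */
final class FhirSearchQueryStrings {
  private FhirSearchQueryStrings() {}

  /** OR: one parameter whose values are joined with commas. */
  static String orQuery(String resourceType, String param, List<String> values) {
    return resourceType + "?" + param + "=" + String.join(",", values);
  }

  /** AND: the parameter is repeated once per value. */
  static String andQuery(String resourceType, String param, List<String> values) {
    return resourceType
        + "?"
        + values.stream().map(v -> param + "=" + v).collect(Collectors.joining("&"));
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("Alice", "Bob");
    System.out.println(orQuery("Patient", "name", names)); // Patient?name=Alice,Bob
    System.out.println(andQuery("Patient", "name", names)); // Patient?name=Alice&name=Bob
  }
}
```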

 

Updates to the I/O connector code

For any significant updates to this I/O connector, please consider involving corresponding code reviewers mentioned here.
  • Constructor Details

    • FhirIO

      public FhirIO()
  • Method Details

    • readResources

      public static FhirIO.Read readResources()
      Read resources from a PCollection of resource names (e.g. when subscribing to Pub/Sub notifications).
      Returns:
      the read
      See Also:
    • searchResources

      public static FhirIO.Search<String> searchResources(String fhirStore)
      Search resources from a FHIR store with String parameter values.
      Returns:
      the search
      See Also:
    • searchResourcesWithGenericParameters

      public static FhirIO.Search<?> searchResourcesWithGenericParameters(String fhirStore)
      Search resources from a FHIR store with any type of parameter values.
      Returns:
      the search
      See Also:
    • importResources

      public static FhirIO.Import importResources(String fhirStore, String tempDir, String deadLetterDir, @Nullable FhirIO.Import.ContentStructure contentStructure)
      Import resources. Intended for use on empty FHIR stores.
      Parameters:
      fhirStore - the fhir store
      tempDir - the temp dir
      deadLetterDir - the dead letter dir
      contentStructure - the content structure
      Returns:
      the import
      See Also:
    • importResources

      public static FhirIO.Import importResources(ValueProvider<String> fhirStore, ValueProvider<String> tempDir, ValueProvider<String> deadLetterDir, @Nullable FhirIO.Import.ContentStructure contentStructure)
      Import resources. Intended for use on empty FHIR stores.
      Parameters:
      fhirStore - the fhir store
      tempDir - the temp dir
      deadLetterDir - the dead letter dir
      contentStructure - the content structure
      Returns:
      the import
      See Also:
    • exportResources

      public static FhirIO.Export exportResources(String fhirStore, String exportUri)
      Export resources to GCS or BigQuery. Intended for use on non-empty FHIR stores.
      Parameters:
      fhirStore - the fhir store, in the format: projects/project_id/locations/location_id/datasets/dataset_id/fhirStores/fhir_store_id
      exportUri - the destination GCS dir or BigQuery dataset, in the format: gs://YOUR_BUCKET_NAME/path/to/a/dir | bq://PROJECT_ID.BIGQUERY_DATASET_ID
      Returns:
      the export
      See Also:
    • exportResources

      public static FhirIO.Export exportResources(ValueProvider<String> fhirStore, ValueProvider<String> exportUri)
      See Also:
    • deidentify

      public static FhirIO.Deidentify deidentify(String sourceFhirStore, String destinationFhirStore, DeidentifyConfig deidConfig)
      De-identify FHIR resources. Intended for use on non-empty FHIR stores.
      Parameters:
      sourceFhirStore - the source fhir store, in the format: projects/project_id/locations/location_id/datasets/dataset_id/fhirStores/fhir_store_id
      destinationFhirStore - the destination fhir store to write de-identified resources, in the format: projects/project_id/locations/location_id/datasets/dataset_id/fhirStores/fhir_store_id
      deidConfig - the DeidentifyConfig
      Returns:
      the deidentify
      See Also:
    • deidentify

      public static FhirIO.Deidentify deidentify(ValueProvider<String> sourceFhirStore, ValueProvider<String> destinationFhirStore, ValueProvider<DeidentifyConfig> deidConfig)
      De-identify FHIR resources. Intended for use on non-empty FHIR stores.
      Parameters:
      sourceFhirStore - the source fhir store, in the format: projects/project_id/locations/location_id/datasets/dataset_id/fhirStores/fhir_store_id
      destinationFhirStore - the destination fhir store to write de-identified resources, in the format: projects/project_id/locations/location_id/datasets/dataset_id/fhirStores/fhir_store_id
      deidConfig - the DeidentifyConfig
      Returns:
      the deidentify
      See Also:
    • getPatientEverything

      public static FhirIOPatientEverything getPatientEverything()
      Get the patient compartment for a FHIR Patient using the GetPatientEverything/$everything API.
      Returns:
      the patient everything
      See Also: