Class ElasticsearchIO.DocToBulk
- All Implemented Interfaces:
Serializable,HasDisplayData
- Enclosing class:
ElasticsearchIO
PTransform converting docs to their Bulk API counterparts.- See Also:
-
Field Summary
Fields inherited from class org.apache.beam.sdk.transforms.PTransform
annotations, displayData, name, resourceHints -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionexpand(PCollection<String> docs) Override this method to specify how thisPTransformshould be expanded on the givenInputT.withAppendOnly(boolean appendOnly) Provide an instruction to control whether the target index should be considered append-only.withBackendVersion(int backendVersion) Use to set explicitly which version of Elasticsearch the destination cluster is running.withConnectionConfiguration(ElasticsearchIO.ConnectionConfiguration connectionConfiguration) Provide the Elasticsearch connection configuration object.withDocVersionFn(ElasticsearchIO.Write.FieldValueExtractFn docVersionFn) Provide a function to extract the doc version from the document.withDocVersionType(String docVersionType) Provide a function to extract the doc version from the document.Provide a function to extract the id from the document.Provide a function to extract the target index from the document allowing for dynamic document routing.Provide a function to extract the target operation either upsert or delete from the document fields allowing dynamic bulk operation decision.Provide a function to extract the target routing from the document allowing for dynamic document routing.Provide a function to extract the target type from the document allowing for dynamic document routing.withUpsertScript(String source) Whether to use scripted updates and what script to use.withUsePartialUpdate(boolean usePartialUpdate) Provide an instruction to control whether partial updates or inserts (default) are issued to Elasticsearch.Methods inherited from class org.apache.beam.sdk.transforms.PTransform
addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, populateDisplayData, setDisplayData, setResourceHints, toString, validate, validate
-
Constructor Details
-
DocToBulk
public DocToBulk()
-
-
Method Details
-
withConnectionConfiguration
public ElasticsearchIO.DocToBulk withConnectionConfiguration(ElasticsearchIO.ConnectionConfiguration connectionConfiguration) Provide the Elasticsearch connection configuration object. Only required if withBackendVersion was not used i.e. getBackendVersion() returns null.- Parameters:
connectionConfiguration- the ElasticsearchElasticsearchIO.ConnectionConfigurationobject- Returns:
- the
ElasticsearchIO.DocToBulkwith connection configuration set
-
withIdFn
Provide a function to extract the id from the document. This id will be used as the document id in Elasticsearch. Should the function throw an Exception then the batch will fail and the exception propagated.- Parameters:
idFn- to extract the document ID- Returns:
- the
ElasticsearchIO.DocToBulkwith the function set
-
withIndexFn
Provide a function to extract the target index from the document allowing for dynamic document routing. Should the function throw an Exception then the batch will fail and the exception propagated.- Parameters:
indexFn- to extract the destination index from- Returns:
- the
ElasticsearchIO.DocToBulkwith the function set
-
withRoutingFn
Provide a function to extract the target routing from the document allowing for dynamic document routing. Should the function throw an Exception then the batch will fail and the exception propagated.- Parameters:
routingFn- to extract the destination index from- Returns:
- the
ElasticsearchIO.DocToBulkwith the function set
-
withTypeFn
Provide a function to extract the target type from the document allowing for dynamic document routing. Should the function throw an Exception then the batch will fail and the exception propagated. Users are encouraged to consider carefully if multipe types are a sensible model as discussed in this blog.- Parameters:
typeFn- to extract the destination index from- Returns:
- the
ElasticsearchIO.DocToBulkwith the function set
-
withUsePartialUpdate
Provide an instruction to control whether partial updates or inserts (default) are issued to Elasticsearch.- Parameters:
usePartialUpdate- set to true to issue partial updates- Returns:
- the
ElasticsearchIO.DocToBulkwith the partial update control set
-
withAppendOnly
Provide an instruction to control whether the target index should be considered append-only. For append-only indexes and/or data streams, onlycreateoperations will be issued, instead ofindex, which is the default.createfails if a document with the same ID already exists in the target,indexadds or replaces a document as necessary. If no ID is provided, both operations are equivalent, unless you are writing to a data stream. Data streams only support thecreateoperation. For more information see the Elasticsearch documentationUpdates and deletions are not allowed, so related options will be ignored.
When the documents contain
- Parameters:
appendOnly- set to true to allow only document appending- Returns:
- the
ElasticsearchIO.DocToBulkwith the-append only control set
-
withUpsertScript
Whether to use scripted updates and what script to use.- Parameters:
source- set to the value of the script source, painless lang- Returns:
- the
ElasticsearchIO.DocToBulkwith the scripted updates set
-
withDocVersionFn
public ElasticsearchIO.DocToBulk withDocVersionFn(ElasticsearchIO.Write.FieldValueExtractFn docVersionFn) Provide a function to extract the doc version from the document. This version number will be used as the document version in Elasticsearch. Should the function throw an Exception then the batch will fail and the exception propagated. Incompatible with update operations and should only be used with withUsePartialUpdate(false)- Parameters:
docVersionFn- to extract the document version- Returns:
- the
ElasticsearchIO.DocToBulkwith the function set
-
withIsDeleteFn
public ElasticsearchIO.DocToBulk withIsDeleteFn(ElasticsearchIO.Write.BooleanFieldValueExtractFn isDeleteFn) Provide a function to extract the target operation either upsert or delete from the document fields allowing dynamic bulk operation decision. While using withIsDeleteFn, it should be taken care that the document's id extraction is defined using the withIdFn function or else IllegalArgumentException is thrown. Should the function throw an Exception then the batch will fail and the exception propagated.- Parameters:
isDeleteFn- set to true for deleting the specific document- Returns:
- the
ElasticsearchIO.Writewith the function set
-
withDocVersionType
Provide a function to extract the doc version from the document. This version number will be used as the document version in Elasticsearch. Should the function throw an Exception then the batch will fail and the exception propagated. Incompatible with update operations and should only be used with withUsePartialUpdate(false)- Parameters:
docVersionType- the version type to use, one ofElasticsearchIO.VERSION_TYPES- Returns:
- the
ElasticsearchIO.DocToBulkwith the doc version type set
-
withBackendVersion
Use to set explicitly which version of Elasticsearch the destination cluster is running. Providing this hint means there is no need for settingwithConnectionConfiguration(org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO.ConnectionConfiguration). This can also be very useful for testing purposes.Note: if the value of @param backendVersion differs from the version the destination cluster is running, behavior is undefined and likely to yield errors.
- Parameters:
backendVersion- the major version number of the version of Elasticsearch being run in the cluster where documents will be indexed.- Returns:
- the
ElasticsearchIO.DocToBulkwith the Elasticsearch major version number set
-
expand
Description copied from class:PTransformOverride this method to specify how thisPTransformshould be expanded on the givenInputT.NOTE: This method should not be called directly. Instead apply the
PTransformshould be applied to theInputTusing theapplymethod.Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
- Specified by:
expandin classPTransform<PCollection<String>,PCollection<ElasticsearchIO.Document>>
-