public class WithTimestamps<T> extends PTransform<PCollection<T>,PCollection<T>>
A PTransform for assigning timestamps to all the elements of a PCollection.
Timestamps are used to assign Windows to elements within the Window.into(org.apache.beam.sdk.transforms.windowing.WindowFn) PTransform. Assigning timestamps is useful when the input data set comes from a Source without implicit timestamps (such as TextIO).
Fields inherited from class PTransform: name
Modifier and Type | Method and Description |
---|---|
PCollection<T> | expand(PCollection<T> input): Applies this PTransform on the given InputT, and returns its Output. |
Duration | getAllowedTimestampSkew(): Deprecated. This method permits elements to be emitted behind the watermark. These elements are considered late, and, if behind the allowed lateness of a downstream PCollection, may be silently dropped. See https://issues.apache.org/jira/browse/BEAM-644 for details on a replacement. |
static <T> WithTimestamps<T> | of(SerializableFunction<T,Instant> fn): For a SerializableFunction fn from T to Instant, outputs a PTransform that takes an input PCollection<T> and outputs a PCollection<T> containing every element v in the input where each element is output with a timestamp obtained as the result of fn.apply(v). |
WithTimestamps<T> | withAllowedTimestampSkew(Duration allowedTimestampSkew): Deprecated. This method permits elements to be emitted behind the watermark. These elements are considered late, and, if behind the allowed lateness of a downstream PCollection, may be silently dropped. See https://issues.apache.org/jira/browse/BEAM-644 for details on a replacement. |
Methods inherited from class PTransform: getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, populateDisplayData, toString, validate
public static <T> WithTimestamps<T> of(SerializableFunction<T,Instant> fn)
For a SerializableFunction fn from T to Instant, outputs a PTransform that takes an input PCollection<T> and outputs a PCollection<T> containing every element v in the input where each element is output with a timestamp obtained as the result of fn.apply(v).
If the input PCollection elements have timestamps, the output timestamp for each element must not be before the input element's timestamp minus the value of getAllowedTimestampSkew(). If an output timestamp is before this time, the transform will throw an IllegalArgumentException when executed. Use withAllowedTimestampSkew(Duration) to update the allowed skew.
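The rule above can be sketched in plain Java. This is only a model of the described check, not Beam's implementation: the class name SkewCheckSketch and the method checkOutputTimestamp are hypothetical, and java.time.Instant and java.time.Duration stand in for the Instant and Duration types that appear in the signatures on this page.

```java
import java.time.Duration;
import java.time.Instant;

/** Plain-Java model of the skew rule described above; not Beam's implementation. */
final class SkewCheckSketch {

    /**
     * Returns outputTimestamp if it is not earlier than
     * (inputTimestamp - allowedSkew); otherwise throws IllegalArgumentException.
     */
    static Instant checkOutputTimestamp(
            Instant inputTimestamp, Instant outputTimestamp, Duration allowedSkew) {
        Instant earliestAllowed = inputTimestamp.minus(allowedSkew);
        if (outputTimestamp.isBefore(earliestAllowed)) {
            throw new IllegalArgumentException(
                "Output timestamp " + outputTimestamp
                    + " is before the earliest allowed timestamp " + earliestAllowed);
        }
        return outputTimestamp;
    }

    public static void main(String[] args) {
        Instant input = Instant.parse("2020-01-01T00:00:10Z");

        // With zero skew, shifting a timestamp into the future is accepted.
        checkOutputTimestamp(input, input.plusSeconds(5), Duration.ZERO);

        // With a 10-second skew, shifting 5 seconds into the past is accepted.
        checkOutputTimestamp(input, input.minusSeconds(5), Duration.ofSeconds(10));

        // With zero skew, any shift into the past is rejected.
        try {
            checkOutputTimestamp(input, input.minusSeconds(5), Duration.ZERO);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The comparison is strict, so in this model an output timestamp exactly equal to the input timestamp minus the skew is still accepted, matching the "must not be before" wording.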
CAUTION: Use of withAllowedTimestampSkew(Duration) permits elements to be emitted behind the watermark. These elements are considered late, and, if behind the allowed lateness of a downstream PCollection, may be silently dropped. See https://issues.apache.org/jira/browse/BEAM-644 for details on a replacement.
Each output element will be in the same windows as the input element. If a new window based on the new output timestamp is desired, apply a new instance of Window.into(WindowFn).
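To illustrate why re-windowing matters, here is a toy model in plain Java. It assumes fixed one-minute windows and uses a hypothetical helper, fixedWindowStart, with java.time types as stand-ins; it does not use or reproduce any Beam windowing class.

```java
import java.time.Duration;
import java.time.Instant;

/** Toy model of window retention; it does not use Beam's windowing classes. */
final class WindowRetentionSketch {

    /** Hypothetical helper: start of the fixed-size window that contains the timestamp. */
    static Instant fixedWindowStart(Instant timestamp, Duration size) {
        long sizeMillis = size.toMillis();
        long millis = timestamp.toEpochMilli();
        // floorMod keeps the remainder non-negative for timestamps before the epoch.
        return Instant.ofEpochMilli(millis - Math.floorMod(millis, sizeMillis));
    }

    public static void main(String[] args) {
        Duration oneMinute = Duration.ofMinutes(1);
        Instant originalTimestamp = Instant.parse("2020-01-01T00:00:30Z");
        Instant newTimestamp = Instant.parse("2020-01-01T00:02:15Z");

        // Window assigned before the timestamp was changed; the element keeps it.
        Instant retainedWindow = fixedWindowStart(originalTimestamp, oneMinute);
        // Window the element would receive only after windowing is applied again.
        Instant reassignedWindow = fixedWindowStart(newTimestamp, oneMinute);

        System.out.println("retained window start:    " + retainedWindow);
        System.out.println("re-windowed window start: " + reassignedWindow);
    }
}
```

In this model the two window starts differ, which is the situation where applying windowing again after the timestamp change would be needed.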
This transform will fail at execution time with a NullPointerException if for any input element the result of fn.apply(v) is null.
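One way to avoid this failure is to make the timestamp function itself null-safe. The sketch below models the null check in plain Java and shows a hypothetical wrapper, withFallback, that substitutes a default timestamp. The names are illustrative, and java.util.function.Function and java.time.Instant stand in for SerializableFunction and Instant.

```java
import java.time.Instant;
import java.util.Objects;
import java.util.function.Function;

/** Plain-Java model of the null check described above; not Beam's implementation. */
final class NullTimestampSketch {

    /** Applies fn to the element and fails with NullPointerException on a null result. */
    static <T> Instant timestampFor(T element, Function<T, Instant> fn) {
        return Objects.requireNonNull(
            fn.apply(element), "timestamp function returned null for " + element);
    }

    /** Hypothetical wrapper: replaces a null result of fn with the given fallback. */
    static <T> Function<T, Instant> withFallback(Function<T, Instant> fn, Instant fallback) {
        return element -> {
            Instant timestamp = fn.apply(element);
            return timestamp != null ? timestamp : fallback;
        };
    }

    public static void main(String[] args) {
        // Illustrative timestamp function: empty strings carry no timestamp.
        Function<String, Instant> parse = s -> s.isEmpty() ? null : Instant.parse(s);

        // The guarded function never yields null, so the check passes.
        Function<String, Instant> guarded = withFallback(parse, Instant.EPOCH);
        System.out.println(timestampFor("", guarded));

        // The unguarded function yields null for "", so the check fails.
        try {
            timestampFor("", parse);
        } catch (NullPointerException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Whether a fallback timestamp is appropriate depends on the pipeline; filtering out such elements before assigning timestamps is another option.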
Example of use in Java 8:
PCollection<Record> timestampedRecords = records.apply(
    WithTimestamps.of((Record rec) -> rec.getInstant()));
@Deprecated public WithTimestamps<T> withAllowedTimestampSkew(Duration allowedTimestampSkew)
Deprecated. This method permits elements to be emitted behind the watermark. These elements are considered late, and, if behind the allowed lateness of a downstream PCollection, may be silently dropped. See https://issues.apache.org/jira/browse/BEAM-644 for details on a replacement.
The default value is Duration.ZERO, allowing timestamps to only be shifted into the future. For infinite skew, use new Duration(Long.MAX_VALUE).
@Deprecated public Duration getAllowedTimestampSkew()
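Deprecated. This method permits elements to be emitted behind the watermark. These elements are considered late, and, if behind the allowed lateness of a downstream PCollection, may be silently dropped. See https://issues.apache.org/jira/browse/BEAM-644 for details on a replacement.
See also: DoFn.getAllowedTimestampSkew()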
public PCollection<T> expand(PCollection<T> input)
Description copied from class: PTransform
Applies this PTransform on the given InputT, and returns its Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
Specified by: expand in class PTransform<PCollection<T>,PCollection<T>>