T - Type of elements read by the source.
@Experimental(value=SOURCE_SINK) public abstract class Source<T> extends java.lang.Object implements java.io.Serializable, HasDisplayData
Base class for defining input formats and creating a Source for reading the input.
This class is not intended to be subclassed directly. Instead, to define a bounded source (a source which produces a finite amount of input), subclass BoundedSource; to define an unbounded source, subclass UnboundedSource.
A Source passed to a Read transform must be Serializable. This allows the Source instance created in this "main program" to be sent (in serialized form) to remote worker machines and reconstituted for each batch of elements of the input PCollection being processed or for each source splitting operation. A Source can have instance variable state, and non-transient instance variable state will be serialized in the main program and then deserialized on remote worker machines.
Source classes MUST be effectively immutable. The only acceptable use of mutable fields is to cache the results of expensive operations, and such fields MUST be marked transient.
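The serialization and immutability rules above can be illustrated with a minimal sketch. To stay runnable with only the JDK, the class below is a hypothetical stand-in (LineCountSourceSketch, with assumed fields path and linesPerRecord) and does not extend the real Source class; it only shows final configuration fields surviving a serialization round trip while a transient cache field is dropped and lazily rebuilt.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in for a Source subclass; it does NOT extend the real Source. */
final class LineCountSourceSketch implements Serializable {
  private static final long serialVersionUID = 1L;

  // Immutable configuration: serialized in the main program, restored on workers.
  private final String path;
  private final int linesPerRecord;

  // Cache for an (assumed) expensive computation. Transient, so it is never serialized.
  private transient String cachedDescription;

  LineCountSourceSketch(String path, int linesPerRecord) {
    this.path = path;
    this.linesPerRecord = linesPerRecord;
  }

  /** Lazily computes the cached value; recomputed after deserialization. */
  String describe() {
    if (cachedDescription == null) {
      cachedDescription = path + " (" + linesPerRecord + " lines per record)";
    }
    return cachedDescription;
  }

  boolean isCached() {
    return cachedDescription != null;
  }

  /** Simulates shipping the object to a worker: serialize, then deserialize. */
  static LineCountSourceSketch roundTrip(LineCountSourceSketch source) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(source);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (LineCountSourceSketch) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Serialization round trip failed", e);
    }
  }
}
```

After roundTrip, the copy carries the same path and linesPerRecord, but its cache is empty until describe() is called again.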
Source objects should override Object.toString(), as it will be used in important error and debugging messages.
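As a rough illustration of the toString() recommendation, the sketch below uses a hypothetical class (TextFileSourceSketch, with assumed fields filePattern, startOffset and endOffset) that does not extend the real Source class. It simply includes the identifying configuration in the string so that an error message names the exact input being read.

```java
import java.io.Serializable;

/** Hypothetical stand-in for a Source subclass; names and fields are assumptions. */
final class TextFileSourceSketch implements Serializable {
  private static final long serialVersionUID = 1L;

  private final String filePattern;
  private final long startOffset;
  private final long endOffset;

  TextFileSourceSketch(String filePattern, long startOffset, long endOffset) {
    this.filePattern = filePattern;
    this.startOffset = startOffset;
    this.endOffset = endOffset;
  }

  // Include everything needed to identify this source in logs and error messages.
  @Override
  public String toString() {
    return "TextFileSourceSketch{filePattern=" + filePattern
        + ", range=[" + startOffset + ", " + endOffset + ")}";
  }
}
```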
Modifier and Type | Class and Description |
---|---|
static class | Source.Reader<T>: The interface that readers of custom input sources must implement. |
Constructor and Description |
---|
Source() |
Modifier and Type | Method and Description |
---|---|
Coder<T> | getDefaultOutputCoder(): Deprecated. Override getOutputCoder() instead. |
Coder<T> | getOutputCoder(): Returns the Coder to use for the data read from this source. |
void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
void | validate(): Checks that this source is valid, before it can be used in a pipeline. |
public void validate()
Checks that this source is valid, before it can be used in a pipeline.
It is recommended to use Preconditions for implementing this method.
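A validate() implementation along these lines might look like the sketch below. It is only an illustration: OffsetRangeSourceSketch and its fields are hypothetical, the class does not extend the real Source, and the local checkArgument helper is an assumption modeled on the style of a Preconditions utility so that the example runs with the JDK alone.

```java
/** Hypothetical stand-in for a Source subclass; names and fields are assumptions. */
final class OffsetRangeSourceSketch {
  private final String path;
  private final long startOffset;
  private final long endOffset;

  OffsetRangeSourceSketch(String path, long startOffset, long endOffset) {
    this.path = path;
    this.startOffset = startOffset;
    this.endOffset = endOffset;
  }

  // Local helper written for this sketch; throws if the condition is false.
  private static void checkArgument(boolean condition, String message) {
    if (!condition) {
      throw new IllegalArgumentException(message);
    }
  }

  /** Fails fast with a descriptive message if the configuration is unusable. */
  public void validate() {
    checkArgument(path != null && !path.isEmpty(), "path must be non-empty");
    checkArgument(startOffset >= 0, "startOffset must be >= 0, got " + startOffset);
    checkArgument(
        endOffset >= startOffset,
        "endOffset (" + endOffset + ") must be >= startOffset (" + startOffset + ")");
  }
}
```

Checking the configuration this way surfaces mistakes before the source is used, and each message names the offending value.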
@Deprecated public Coder<T> getDefaultOutputCoder()
Deprecated. Override getOutputCoder() instead.
public Coder<T> getOutputCoder()
Returns the Coder to use for the data read from this source.
public void populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Parameters: builder - The builder to populate with display data.
See Also: HasDisplayData