T - Type of elements read by the source.
@Experimental(value=SOURCE_SINK) public abstract class Source<T> extends java.lang.Object implements java.io.Serializable, HasDisplayData
Source for reading the input.
This class is not intended to be subclassed directly. Instead, to define
a bounded source (a source which produces a finite amount of input), subclass
BoundedSource; to define an unbounded source, subclass UnboundedSource.
A Source passed to a Read transform must be
Serializable. This allows the Source instance
created in this "main program" to be sent (in serialized form) to
remote worker machines and reconstituted for each batch of elements
of the input PCollection being processed or for each source splitting
operation. A Source can have instance variable state, and
non-transient instance variable state will be serialized in the main program
and then deserialized on remote worker machines.
Source classes MUST be effectively immutable. The only acceptable use of
mutable fields is to cache the results of expensive operations, and such fields MUST be
marked transient.
Source objects should override Object.toString(), as it will be
used in important error and debugging messages.
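The serialization, immutability, and toString() guidance above can be illustrated with a minimal, self-contained sketch. It deliberately does not extend the real Source class: InMemoryLinesSource, lineCount(), isCached(), and the roundTrip helper are hypothetical names invented for this example, and plain Java serialization stands in for whatever mechanism a runner actually uses to ship a Source to workers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a Source subclass (it does not extend the real Source).
final class InMemoryLinesSource implements Serializable {
  private static final long serialVersionUID = 1L;

  // Immutable configuration: serialized in the main program, deserialized on workers.
  private final String text;

  // Cache of an expensive result: mutable, so it is marked transient and never serialized.
  private transient Integer cachedLineCount;

  InMemoryLinesSource(String text) {
    this.text = text;
  }

  // Recomputes the cached value lazily, for example after deserialization.
  int lineCount() {
    if (cachedLineCount == null) {
      cachedLineCount = text.isEmpty() ? 0 : text.split("\n", -1).length;
    }
    return cachedLineCount;
  }

  boolean isCached() {
    return cachedLineCount != null;
  }

  // A descriptive toString() makes error and debugging messages readable.
  @Override
  public String toString() {
    return "InMemoryLinesSource{chars=" + text.length() + "}";
  }
}

final class SourceSerializationSketch {
  // Serializes and deserializes a value, mimicking the trip to a remote worker.
  static <S extends Serializable> S roundTrip(S value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        S copy = (S) in.readObject();
        return copy;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Serialization round trip failed", e);
    }
  }

  public static void main(String[] args) {
    InMemoryLinesSource original = new InMemoryLinesSource("a\nb\nc");
    original.lineCount(); // populates the transient cache in the "main program"
    InMemoryLinesSource copy = roundTrip(original);
    System.out.println(copy + " cached=" + copy.isCached() + " lines=" + copy.lineCount());
  }
}
```

The copy arrives with an empty cache and rebuilds it on first use, which is why a transient cache is safe: dropping it never changes what the source reads.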
| Modifier and Type | Class and Description |
|---|---|
| static class | Source.Reader<T>: The interface that readers of custom input sources must implement. |

| Constructor and Description |
|---|
| Source() |

| Modifier and Type | Method and Description |
|---|---|
| abstract Coder<T> | getDefaultOutputCoder(): Returns the default Coder to use for the data read from this source. |
| void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
| abstract void | validate(): Checks that this source is valid, before it can be used in a pipeline. |
public abstract void validate()
Checks that this source is valid, before it can be used in a pipeline.
It is recommended to use Preconditions for implementing this method.
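As a rough illustration of what a validate() implementation might check, here is a self-contained sketch. It assumes that "Preconditions" refers to Guava's com.google.common.base.Preconditions; to avoid a library dependency, the sketch uses java.util.Objects.requireNonNull and explicit checks, which throw the same exception types as Guava's checkNotNull (NullPointerException) and checkArgument (IllegalArgumentException). FilePatternSourceSketch and its fields are hypothetical and do not extend the real Source class.

```java
import java.util.Objects;

// Hypothetical source-like class used only to illustrate validate().
final class FilePatternSourceSketch {
  private final String filePattern;
  private final long minBundleSizeBytes;

  FilePatternSourceSketch(String filePattern, long minBundleSizeBytes) {
    this.filePattern = filePattern;
    this.minBundleSizeBytes = minBundleSizeBytes;
  }

  // Fails fast on misconfiguration, before the source is used in a pipeline.
  void validate() {
    // With Guava: Preconditions.checkNotNull(filePattern, "...");
    Objects.requireNonNull(filePattern, "filePattern must not be null");

    // With Guava: Preconditions.checkArgument(!filePattern.isEmpty(), "...");
    if (filePattern.isEmpty()) {
      throw new IllegalArgumentException("filePattern must not be empty");
    }
    if (minBundleSizeBytes < 0) {
      throw new IllegalArgumentException(
          "minBundleSizeBytes must be non-negative, got " + minBundleSizeBytes);
    }
  }

  public static void main(String[] args) {
    new FilePatternSourceSketch("input-*.txt", 0L).validate(); // passes silently
    try {
      new FilePatternSourceSketch("input-*.txt", -1L).validate();
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

Checking configuration here, rather than when reading starts, surfaces mistakes in the main program instead of on a remote worker.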
public abstract Coder<T> getDefaultOutputCoder()
Returns the default Coder to use for the data read from this source.
public void populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect
display data via DisplayData.from(HasDisplayData). Implementations may call
super.populateDisplayData(builder) in order to register display data in the current
namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use
the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Parameters: builder - The builder to populate with display data.
See Also: HasDisplayData