Actor
is the basic execution unit in the streaming framework.
Shared by all operators of an actor.
AppendOnlyDedupExecutor
drops any message whose pk columns duplicate those of a previous
message. It only accepts append-only input, and its output is append-only as well.
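The deduplication rule above can be sketched as follows. This is a minimal in-memory illustration, not the executor's actual implementation: the `Row` alias, the `AppendOnlyDedup` struct, and the `HashSet` that stands in for persistent state are all assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical row type: every column is an integer.
type Row = Vec<i64>;

/// Illustrative stand-in for the dedup logic; the set of seen keys
/// lives in memory here, which is an assumption of this sketch.
struct AppendOnlyDedup {
    pk_indices: Vec<usize>,
    seen: HashSet<Vec<i64>>,
}

impl AppendOnlyDedup {
    fn new(pk_indices: Vec<usize>) -> Self {
        Self { pk_indices, seen: HashSet::new() }
    }

    /// Takes a chunk of inserted rows and keeps only rows whose
    /// pk columns have not appeared in any earlier row.
    fn apply(&mut self, chunk: Vec<Row>) -> Vec<Row> {
        let mut out = Vec::new();
        for row in chunk {
            let key: Vec<i64> = self.pk_indices.iter().map(|&i| row[i]).collect();
            // `HashSet::insert` returns true only when the key is new.
            if self.seen.insert(key) {
                out.push(row);
            }
        }
        out
    }
}
```

Because only first occurrences pass through and nothing is ever retracted, the output stays append-only whenever the input is.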
Schema: | vnode | pk … | backfill_finished | row_count |
We can decode that into BackfillState on recovery.
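As a rough sketch of what decoding such a row could look like: the `Datum` enum and the fields of the `BackfillState` struct below are simplified, hypothetical stand-ins chosen to mirror the schema line, and the real type may be shaped differently.

```rust
/// Hypothetical cell type for the state row.
#[derive(Debug, PartialEq)]
enum Datum {
    Int(i64),
    Bool(bool),
}

/// Simplified stand-in mirroring the schema columns (an assumption).
#[derive(Debug, PartialEq)]
struct BackfillState {
    vnode: i64,
    pk: Vec<i64>,
    backfill_finished: bool,
    row_count: i64,
}

/// Decodes `| vnode | pk ... | backfill_finished | row_count |`.
/// Returns `None` if the row is too short or a column has the wrong type.
fn decode_state(row: &[Datum]) -> Option<BackfillState> {
    let n = row.len();
    if n < 3 {
        return None;
    }
    let vnode = match &row[0] {
        Datum::Int(v) => *v,
        _ => return None,
    };
    // Everything between the vnode and the last two columns is the pk.
    let mut pk = Vec::new();
    for d in &row[1..n - 2] {
        match d {
            Datum::Int(v) => pk.push(*v),
            _ => return None,
        }
    }
    let backfill_finished = match &row[n - 2] {
        Datum::Bool(b) => *b,
        _ => return None,
    };
    let row_count = match &row[n - 1] {
        Datum::Int(v) => *v,
        _ => return None,
    };
    Some(BackfillState { vnode, pk, backfill_finished, row_count })
}
```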
The generic type M
is the mutation type of the barrier.
An executor used only for receiving barriers from the meta service. It always resides in the leaves
of the streaming graph.
ChainExecutor
is an executor that enables synchronization between the existing stream and
newly appended executors. Currently, ChainExecutor is mainly used to implement the MV on MV
feature. It pipes new data of existing MVs to the newly created MV only after all of the old data
in the existing MVs has been dispatched.
DispatchExecutor
consumes messages and sends them to downstream actors. Usually,
data chunks are dispatched according to a specified policy, while control messages
such as barriers are distributed to all receivers.
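The split between policy-driven data dispatch and broadcast control messages can be sketched like this. The `Message` enum, the `dispatch` function, and the modulo-based policy are illustrative assumptions; the real dispatcher supports other policies and works on richer chunk types.

```rust
/// Hypothetical message type: a chunk is a list of integer keys.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    Chunk(Vec<i64>),
    Barrier(u64),
}

/// Routes one message to the downstream outputs.
/// Chunks are partitioned by `key mod n` (an assumed policy for this sketch);
/// barriers are copied to every output.
fn dispatch(msg: Message, outputs: &mut [Vec<Message>]) {
    let n = outputs.len();
    if n == 0 {
        return;
    }
    match msg {
        Message::Chunk(rows) => {
            let mut parts: Vec<Vec<i64>> = vec![Vec::new(); n];
            for r in rows {
                parts[r.rem_euclid(n as i64) as usize].push(r);
            }
            for (out, part) in outputs.iter_mut().zip(parts) {
                // Skip outputs that received no rows from this chunk.
                if !part.is_empty() {
                    out.push(Message::Chunk(part));
                }
            }
        }
        Message::Barrier(epoch) => {
            for out in outputs.iter_mut() {
                out.push(Message::Barrier(epoch));
            }
        }
    }
}
```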
EowcOverWindowExecutor
consumes ordered input (ordered by the order key column, with watermarks in
ascending order) and outputs window function results. One EowcOverWindowExecutor can handle
one combination of partition key and order key.
Static information of an executor.
Represents an external table to be read during backfill.
FilterExecutor
filters data with the expr. The expr takes a chunk of data
and returns a boolean array indicating whether each item should be retained.
FilterExecutor then forwards the insert, delete or update of each element to the next executor
according to the result of the expression.
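One way to picture how the boolean result turns into inserts, deletes and updates is sketched below. The `Op` enum, the integer rows, and the rule for rewriting half-passing updates are assumptions of this example, not a transcription of the executor's code.

```rust
/// Hypothetical change operations attached to each row.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Insert,
    Delete,
    UpdateDelete,
    UpdateInsert,
}

/// Applies `pred` to each row and returns the surviving (op, row) pairs.
/// An update whose old and new rows disagree on the predicate is rewritten
/// as a plain delete or insert (an assumed rule for this sketch).
fn filter_chunk(ops: &[Op], rows: &[i64], pred: impl Fn(i64) -> bool) -> Vec<(Op, i64)> {
    assert_eq!(ops.len(), rows.len());
    let mut out = Vec::new();
    let mut i = 0;
    while i < rows.len() {
        match ops[i] {
            // A well-formed update pair: old row followed by new row.
            Op::UpdateDelete if i + 1 < rows.len() && ops[i + 1] == Op::UpdateInsert => {
                let (old, new) = (rows[i], rows[i + 1]);
                match (pred(old), pred(new)) {
                    (true, true) => {
                        out.push((Op::UpdateDelete, old));
                        out.push((Op::UpdateInsert, new));
                    }
                    (true, false) => out.push((Op::Delete, old)),
                    (false, true) => out.push((Op::Insert, new)),
                    (false, false) => {}
                }
                i += 2;
            }
            // Plain deletes, and unpaired update halves treated as deletes.
            Op::Delete | Op::UpdateDelete => {
                if pred(rows[i]) {
                    out.push((Op::Delete, rows[i]));
                }
                i += 1;
            }
            // Plain inserts, and unpaired update halves treated as inserts.
            Op::Insert | Op::UpdateInsert => {
                if pred(rows[i]) {
                    out.push((Op::Insert, rows[i]));
                }
                i += 1;
            }
        }
    }
    out
}
```

Rewriting a half-passing update keeps the downstream view consistent: a row that leaves the filter's range looks like a delete, and a row that enters it looks like an insert.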
HashAggExecutor
can process large amounts of data using a state backend. It works as
follows:
LookupExecutor
takes one input stream and one arrangement, and joins the input stream with the
arrangement. Currently, it only supports inner join. See LookupExecutorParams for more
information.
Merges data from multiple inputs in a given order. If order = [2, 1, 0], then
it will first pipe data from the third input; after the third input gets a barrier, it will then
pipe the second, and finally the first. In the future we could have a more efficient
implementation.
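The ordering rule can be illustrated with a small sketch over in-memory inputs. The `Msg` enum and `merge_in_order` function are hypothetical, and emitting a single barrier at the end is an assumption of this example rather than documented behavior.

```rust
/// Hypothetical message type for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Msg {
    Chunk(i64),
    Barrier,
}

/// Drains the inputs one at a time, in the sequence given by `order`.
/// Each input is read until its first barrier, then the next input starts.
fn merge_in_order(inputs: &[Vec<Msg>], order: &[usize]) -> Vec<Msg> {
    let mut out = Vec::new();
    for &idx in order {
        for m in &inputs[idx] {
            match m {
                // A barrier ends this input's turn.
                Msg::Barrier => break,
                Msg::Chunk(_) => out.push(m.clone()),
            }
        }
    }
    // Assumed: one barrier is forwarded after all inputs reached theirs.
    out.push(Msg::Barrier);
    out
}
```

With `order = [2, 1, 0]`, the chunks of the third input appear first in the output, followed by those of the second and then the first.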
MaterializeBuffer
is a buffer for converting a chunk into KeyOp.
A cache for materialize executors.
MaterializeExecutor
materializes changes in the stream into a materialized view on storage.
MergeExecutor
merges data from multiple channels. Dataflow from a channel
is stopped once that channel delivers a barrier.
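A simplified picture of this barrier alignment, using in-memory queues in place of real channels, might look like the following. The `Msg` enum, the `merge_one_epoch` function, and the round-robin polling are assumptions made for illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical message type for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Msg {
    Chunk(i64),
    Barrier(u64),
}

/// Merges messages from all channels for one epoch.
/// A channel is no longer read after it yields a barrier; once every
/// channel has yielded one, a single barrier is forwarded.
fn merge_one_epoch(channels: &mut [VecDeque<Msg>]) -> Vec<Msg> {
    let mut out = Vec::new();
    let mut blocked = vec![false; channels.len()];
    let mut barrier = None;
    loop {
        let mut progressed = false;
        for (i, ch) in channels.iter_mut().enumerate() {
            if blocked[i] {
                continue;
            }
            match ch.pop_front() {
                Some(Msg::Barrier(epoch)) => {
                    // Stop reading this channel until the epoch completes.
                    blocked[i] = true;
                    barrier = Some(epoch);
                    progressed = true;
                }
                Some(chunk) => {
                    out.push(chunk);
                    progressed = true;
                }
                None => {}
            }
        }
        if !progressed {
            break;
        }
    }
    if blocked.iter().all(|&b| b) {
        if let Some(epoch) = barrier {
            out.push(Msg::Barrier(epoch));
        }
    }
    out
}
```

Messages queued behind a barrier stay in their channel, so data from the next epoch never overtakes the barrier.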
A no-op executor that directly forwards the input stream. Currently used to break up multiple
edges in the fragment graph.
ProjectExecutor
projects data with the expr. The expr takes a chunk of data
and returns a new data chunk. ProjectExecutor then forwards the insert, delete
or update of each element to the next operator according to the result of the expression.
ProjectSetExecutor
projects data with the expr. The expr takes a chunk of data
and returns a new data chunk. ProjectSetExecutor then forwards the insert, delete
or update of each element to the next operator according to the result of the expression.
ChainExecutor
is an executor that enables synchronization between the existing stream and
newly appended executors. Currently, ChainExecutor is mainly used to implement the MV on MV
feature. It pipes new data of existing MVs to the newly created MV only after all of the old data
in the existing MVs has been dispatched.
ReceiverExecutor
is used along with a channel. After creating an mpsc channel,
there should be a ReceiverExecutor running in the background, so as to push
messages down to the executors.
SimpleAggExecutor
is the aggregation operator for the streaming system.
To create an aggregation operator, states and expressions should be passed to the
constructor.
TroublemakerExecutor
is used to make some trouble in the stream graph. Specifically,
it is attached to StreamScan and Source executors in insane mode. It randomly
corrupts the stream chunks it receives and sends them downstream, making the stream
inconsistent. This should ONLY BE USED IN INSANE MODE FOR TESTING PURPOSES.
UnionExecutor
merges data from multiple inputs.
Executes values in the stream. As it is a leaf, the current workaround holds a barrier_executor.
May refactor with BarrierRecvExecutor in the near future.
The executor will generate a Watermark after each chunk.
This also guarantees that all later rows with an event time less than the watermark are
filtered out.
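The interplay between watermark generation and late-row filtering can be sketched as below. The `WatermarkFilter` struct, its fixed `delay`, and the rule "watermark = max event time seen minus delay" are assumptions chosen for the example; the source does not say how the watermark value is derived.

```rust
/// Illustrative watermark filter over integer event times.
struct WatermarkFilter {
    /// Assumed allowed lateness; hypothetical parameter.
    delay: i64,
    /// Current watermark; starts at the minimum so nothing is filtered.
    watermark: i64,
}

impl WatermarkFilter {
    fn new(delay: i64) -> Self {
        Self { delay, watermark: i64::MIN }
    }

    /// Drops rows older than the current watermark, then advances the
    /// watermark based on the rows that were kept. Returns the kept rows
    /// and the watermark emitted after this chunk.
    fn process(&mut self, chunk: &[i64]) -> (Vec<i64>, i64) {
        let current = self.watermark;
        let kept: Vec<i64> = chunk.iter().copied().filter(|&t| t >= current).collect();
        if let Some(&max) = kept.iter().max() {
            // The watermark never moves backwards.
            self.watermark = self.watermark.max(max.saturating_sub(self.delay));
        }
        (kept, self.watermark)
    }
}
```

Since the watermark only increases, any row that arrives later with a smaller event time is filtered, which is the guarantee described above.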
WrapperExecutor
will do some sanity checks and logging for the wrapped executor.