This contains the synced kv log store implementation.
Itβs meant to buffer a large number of records emitted from upstream,
to avoid overwhelming the downstream executor.
AppendOnlyDedupExecutor drops any message that has duplicate pk columns with previous
messages. It only accepts append-only input, and its output will be append-only as well.
ChainExecutor is an executor that enables synchronization between the existing stream and
newly appended executors. Currently, ChainExecutor is mainly used to implement MV on MV
feature. It pipes new data of existing MVs to newly created MV only all of the old data in the
existing MVs are dispatched.
DispatchExecutor consumes messages and send them into downstream actors. Usually,
data chunks will be dispatched with some specified policy, while control message
such as barriers will be distributed to all receivers.
EowcOverWindowExecutor consumes ordered input (on order key column with watermark in
ascending order) and outputs window function results. One EowcOverWindowExecutor can handle
one combination of partition key and order key.
FilterExecutor filters data with the expr. The expr takes a chunk of data,
and returns a boolean array on whether each item should be retained. And then,
FilterExecutor will insert, delete or update element into next executor according
to the result of the expression.
LookupExecutor takes one input stream and one arrangement. It joins the input stream with the
arrangement. Currently, it only supports inner join. See LookupExecutorParams for more
information.
Merges data from multiple inputs with order. If order = [2, 1, 0], then
it will first pipe data from the third input; after the third input gets a barrier, it will then
pipe the second, and finally the first. In the future we could have more efficient
implementation.
OverWindowExecutor consumes retractable input stream and produces window function outputs.
One OverWindowExecutor can handle one combination of partition key and order key.
ChainExecutor is an executor that enables synchronization between the existing stream and
newly appended executors. Currently, ChainExecutor is mainly used to implement MV on MV
feature. It pipes new data of existing MVs to newly created MV only all of the old data in the
existing MVs are dispatched.
ReceiverExecutor is used along with a channel. After creating a mpsc channel,
there should be a ReceiverExecutor running in the background, so as to push
messages down to the executors.
TroublemakerExecutor is used to make some trouble in the stream graph. Specifically,
it is attached to StreamScan and Source executors in insane mode. It randomly
corrupts the stream chunks it receives and sends them downstream, making the stream
inconsistent. This should ONLY BE USED IN INSANE MODE FOR TESTING PURPOSES.
The executor will generate a Watermark after each chunk.
This will also guarantee all later rows with event time less than the watermark will be
filtered.
MessageBatchInner is used exclusively by Dispatcher and the Merger/Receiver for exchanging messages between them.
It shares the same message type as the fundamental MessageInner, but batches multiple barriers into a single message.
If the input is append-only, AppendOnlyGroupTopNExecutor does not need
to keep all the rows seen. As long as a record
is no longer in the result set, it can be deleted.
If the input is append-only, AppendOnlyGroupTopNExecutor does not need
to keep all the rows seen. As long as a record
is no longer in the result set, it can be deleted.