AggSpillManager manages how spill data files are written and read back.
The spill data first needs to be partitioned. Each partition contains 2 files: agg_state_file and input_chunks_file.
A spill file consumes a data chunk and serializes the chunk into protobuf bytes.
Finally, the spill file content will look like the below.
The file write pattern is append-only and the read pattern is a sequential scan, which helps maximize disk I/O performance.
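As an illustration of the append-only write and sequential-scan read pattern, here is a minimal sketch. It is not the AggSpillManager implementation: the function names `append_record` and `read_records` are hypothetical, the payload is treated as opaque bytes (standing in for a protobuf-encoded chunk), and the 4-byte little-endian length prefix is an assumed framing chosen only for this example.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::path::Path;

/// Appends one serialized chunk to the file as a length-prefixed record.
/// The 4-byte little-endian length prefix is an assumption of this sketch.
fn append_record(path: &Path, payload: &[u8]) -> io::Result<()> {
    // `append(true)` guarantees every write lands at the end of the file.
    let file = OpenOptions::new().create(true).append(true).open(path)?;
    let mut writer = BufWriter::new(file);
    let len = u32::try_from(payload.len())
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
    writer.write_all(&len.to_le_bytes())?;
    writer.write_all(payload)?;
    writer.flush()
}

/// Reads all records back in write order with a single front-to-back scan.
fn read_records(path: &Path) -> io::Result<Vec<Vec<u8>>> {
    let mut reader = BufReader::new(File::open(path)?);
    let mut records = Vec::new();
    let mut len_buf = [0u8; 4];
    loop {
        // Running out of bytes while reading a length prefix ends the scan.
        match reader.read_exact(&mut len_buf) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let mut payload = vec![0u8; u32::from_le_bytes(len_buf) as usize];
        reader.read_exact(&mut payload)?;
        records.push(payload);
    }
    Ok(records)
}
```

Because the reader never seeks, records come back in exactly the order they were appended, which is what makes the pattern friendly to sequential disk I/O.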
DeleteExecutor implements table deletion with values from its child executor.
Distributed Lookup Join Executor.
High-level execution flow:
Repeat 1-3:
Group Top-N Executor
HashAggExecutor implements the hash aggregate algorithm.
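The general idea of hash aggregation can be sketched as follows. This is a simplified illustration, not the HashAggExecutor code: the function name `hash_aggregate`, the `(String, i64)` row shape, and the choice of SUM and COUNT as aggregates are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Groups `(key, value)` rows by key and returns `(key, sum, count)` per group.
fn hash_aggregate(rows: &[(String, i64)]) -> Vec<(String, i64, u64)> {
    // One aggregation state per distinct group key, found via hash lookup.
    let mut groups: HashMap<&str, (i64, u64)> = HashMap::new();
    for (key, value) in rows {
        let state = groups.entry(key.as_str()).or_insert((0, 0));
        state.0 += *value; // running SUM
        state.1 += 1; // running COUNT
    }
    let mut out: Vec<(String, i64, u64)> = groups
        .into_iter()
        .map(|(k, (sum, count))| (k.to_string(), sum, count))
        .collect();
    // Hash maps have no stable order; sort only to make the output deterministic.
    out.sort();
    out
}
```

The input needs no particular order, at the cost of keeping one state entry in memory for every distinct group.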
Hash Join Executor
InsertExecutor implements table insertion with values from its child executor.
Limit executor.
Local Lookup Join Executor.
High-level execution flow:
Repeat 1-3:
Lookup Join Base.
Used by LocalLookupJoinExecutor and DistributedLookupJoinExecutor.
ManagedExecutor builds on top of the underlying executor. For now, it does two things:
MergeSortExchangeExecutor2 takes inputs from multiple sources and merges them.
The outputs of all the sources have been sorted in the same way.
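When every source is already sorted in the same way, the inputs can be combined with a k-way merge. The sketch below shows one common way to do that with a min-heap; it is an assumption about the general technique, not a description of how MergeSortExchangeExecutor2 is written, and the name `merge_sorted` and the `i64` element type are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merges several ascending-sorted sources into one ascending-sorted output.
fn merge_sorted(sources: Vec<Vec<i64>>) -> Vec<i64> {
    let mut iters: Vec<_> = sources.into_iter().map(|s| s.into_iter()).collect();
    // `BinaryHeap` is a max-heap, so wrap entries in `Reverse` to pop the minimum.
    let mut heap = BinaryHeap::new();
    for (idx, it) in iters.iter_mut().enumerate() {
        if let Some(v) = it.next() {
            heap.push(Reverse((v, idx)));
        }
    }
    let mut out = Vec::new();
    while let Some(Reverse((v, idx))) = heap.pop() {
        out.push(v);
        // Refill the heap from the source that just produced the smallest value.
        if let Some(next) = iters[idx].next() {
            heap.push(Reverse((next, idx)));
        }
    }
    out
}
```

The heap holds at most one pending value per source, so memory stays proportional to the number of sources rather than to the total number of rows.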
MySqlQuery executor. Runs a query against a MySQL database.
Nested loop join executor.
PostgresQuery executor. Runs a query against a Postgres database.
Id of one row in chunked data.
Executor that scans data from a row table.
Range for batch scan.
SortAggExecutor implements the sort aggregate algorithm, which assumes that the input chunks have already been sorted by group columns.
The aggregation is applied to tuples within the same group, and the output schema is [group columns, agg result].
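Because rows of the same group arrive next to each other in sorted input, sort aggregation only needs to track the current group. The following is a minimal sketch under assumed types: a single `i32` group column, a single SUM aggregate, and the hypothetical function name `sort_aggregate`. It is not the SortAggExecutor implementation.

```rust
/// Streaming aggregation over rows already sorted by group key.
/// Emits `(group key, sum)`, i.e. [group columns, agg result].
fn sort_aggregate(sorted_rows: &[(i32, i64)]) -> Vec<(i32, i64)> {
    let mut out = Vec::new();
    // State of the group currently being accumulated, if any.
    let mut current: Option<(i32, i64)> = None;
    for &(key, value) in sorted_rows {
        if let Some((k, sum)) = current.as_mut() {
            if *k == key {
                *sum += value;
                continue;
            }
        }
        // The key changed (or this is the first row): emit the finished group.
        if let Some(done) = current.take() {
            out.push(done);
        }
        current = Some((key, value));
    }
    // Emit the last group once the input is exhausted.
    if let Some(done) = current {
        out.push(done);
    }
    out
}
```

Unlike the hash-based approach, only one group's state is held at a time, but the result is only correct if the input really is sorted by the group columns.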
Sort Executor
SortOverWindowExecutor accepts input chunks sorted by partition key and order key, and outputs chunks with window function result columns.
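To make the input and output shape concrete, here is a small sketch that appends two window function result columns to rows sorted by partition key and order key. The row layout `(partition_key, order_key, value)`, the function name `over_window`, and the choice of a row number and a running sum as the window functions are all assumptions for illustration, not the SortOverWindowExecutor implementation.

```rust
/// Input rows are `(partition_key, order_key, value)`, sorted by
/// `(partition_key, order_key)`. Each output row carries the input columns
/// followed by a per-partition row number and a per-partition running sum.
fn over_window(sorted: &[(u32, u32, i64)]) -> Vec<(u32, u32, i64, u64, i64)> {
    let mut out = Vec::with_capacity(sorted.len());
    let mut current_partition: Option<u32> = None;
    let mut row_number = 0u64;
    let mut running_sum = 0i64;
    for &(part, order, value) in sorted {
        // A new partition key resets all window state.
        if current_partition != Some(part) {
            current_partition = Some(part);
            row_number = 0;
            running_sum = 0;
        }
        row_number += 1;
        running_sum += value;
        out.push((part, order, value, row_number, running_sum));
    }
    out
}
```

Sorted input means each partition is contiguous, so the window state can be reset at partition boundaries without buffering other partitions.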
Top-N Executor
UpdateExecutor implements table update with values from its child executor and given expressions.