We want to avoid allocating a new row for every vnode. Instead, we modify a single reusable row and dispatch it to the state table to write.
This builds the row's segments in order: vnode, pk, `backfill_finished`, `row_count`.
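As a rough illustration, here is a minimal plain-Rust sketch of reusing one row buffer across all vnodes; the layout follows the schema documented below, datums are simplified to `Option<i64>`, and all names here are hypothetical:

```rust
/// Hypothetical sketch: overwrite the segments of one reusable row buffer
/// laid out as | vnode | pk... | backfill_finished | row_count |.
fn build_temporary_state(
    row_state: &mut [Option<i64>], // single buffer reused across vnodes
    vnode: i64,
    current_pos: &[Option<i64>], // the pk segment
    is_finished: bool,
    row_count: u64,
) {
    row_state[0] = Some(vnode);
    row_state[1..=current_pos.len()].clone_from_slice(current_pos);
    row_state[current_pos.len() + 1] = Some(is_finished as i64);
    row_state[current_pos.len() + 2] = Some(row_count as i64);
}
```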
Creates a data chunk builder for snapshot read.
If the `rate_limit` is smaller than `chunk_size`, the `rate_limit` takes precedence as the builder's capacity. This lets us partition the snapshot read into chunks smaller than `chunk_size`.
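For instance, the effective capacity could be computed like this (a plain-Rust sketch; the function name is hypothetical):

```rust
/// Hypothetical sketch: the snapshot chunk builder's capacity is the
/// smaller of `chunk_size` and the optional `rate_limit`.
fn snapshot_chunk_capacity(chunk_size: usize, rate_limit: Option<usize>) -> usize {
    match rate_limit {
        Some(limit) if limit < chunk_size => limit, // rate limit takes precedence
        _ => chunk_size,
    }
}
```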
Flushes the data.
Gets the new backfill position from the chunk. Since the chunk's rows are ordered by pk, we can just take the pk of the last row.
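In plain-Rust terms (rows simplified to `Vec<i64>`; the helper name is hypothetical):

```rust
/// Hypothetical sketch: rows are ordered by pk, so the new backfill
/// position is the pk projection of the chunk's last row.
fn get_new_pos(chunk_rows: &[Vec<i64>], pk_in_output_indices: &[usize]) -> Vec<i64> {
    let last = chunk_rows.last().expect("chunk must be non-empty");
    pk_in_output_indices.iter().map(|&i| last[i]).collect()
}
```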
Recovers progress per vnode, so we know which vnodes still need to be backfilled. See the inline comments for how the state is decoded.
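A simplified sketch of the decoding, assuming each state row carries (pk, backfill_finished, row_count) with datums reduced to plain types (all names hypothetical):

```rust
/// Hypothetical per-vnode progress, recovered from the state table.
#[derive(Debug)]
enum Progress {
    NotStarted,                                  // no state row: backfill from scratch
    InProgress { pk: Vec<i64>, row_count: u64 }, // resume from this pk
    Completed { row_count: u64 },                // nothing left to backfill
}

fn decode(state_row: Option<(Vec<i64>, bool, u64)>) -> Progress {
    match state_row {
        None => Progress::NotStarted,
        Some((_, true, row_count)) => Progress::Completed { row_count },
        Some((pk, false, row_count)) => Progress::InProgress { pk, row_count },
    }
}
```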
Builds a new stream chunk with `output_indices`.
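Conceptually the projection looks like this (generic plain-Rust sketch; the function name is hypothetical):

```rust
/// Hypothetical sketch: keep only the columns selected by `output_indices`,
/// in that order.
fn map_columns<C: Clone>(columns: &[C], output_indices: &[usize]) -> Vec<C> {
    output_indices.iter().map(|&i| columns[i].clone()).collect()
}
```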
Mark chunk: for each row of the chunk, forward it downstream if its pk <= `current_pos`; otherwise ignore it. We implement this by changing the visibility bitmap.
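A plain-Rust sketch of the visibility computation, assuming all pk columns compare in ascending order (names hypothetical):

```rust
/// Hypothetical sketch: a row stays visible only if its pk has already been
/// covered by the snapshot scan, i.e. pk <= current_pos.
fn mark_visibility(row_pks: &[Vec<i64>], current_pos: &[i64]) -> Vec<bool> {
    row_pks.iter().map(|pk| pk.as_slice() <= current_pos).collect()
}
```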
Mark chunk: for each row of the chunk, forward it downstream if its pk <= `current_pos` for the row's corresponding vnode; otherwise ignore it. We implement this by changing the visibility bitmap.
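And the per-vnode variant, where each vnode carries its own position (plain-Rust sketch; names hypothetical):

```rust
use std::collections::HashMap;

/// Hypothetical sketch: look up the backfill position of each row's vnode;
/// rows in vnodes we have not started yet are hidden, since the snapshot
/// read will emit them later.
fn mark_visibility_by_vnode(
    rows: &[(u16, Vec<i64>)], // (vnode, pk) per row
    current_pos: &HashMap<u16, Vec<i64>>,
) -> Vec<bool> {
    rows.iter()
        .map(|(vnode, pk)| match current_pos.get(vnode) {
            Some(pos) => pk <= pos, // already backfilled up to `pos`
            None => false,          // vnode not started yet
        })
        .collect()
}
```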
Schema:
| vnode | pk | backfill_finished | row_count |
Persists the state per vnode based on `BackfillState`. We track the current committed state via `committed_progress`, so we know whether we need to persist the state or not.
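A simplified sketch of that dirty check (positions reduced to `Vec<i64>`; the helper is hypothetical):

```rust
use std::collections::HashMap;

/// Hypothetical sketch: a vnode's state is persisted only when its current
/// progress differs from the last committed progress.
fn vnodes_to_persist(
    current: &HashMap<u16, Vec<i64>>,
    committed: &HashMap<u16, Vec<i64>>,
) -> Vec<u16> {
    current
        .iter()
        .filter(|(vnode, pos)| committed.get(*vnode) != Some(*pos))
        .map(|(vnode, _)| *vnode)
        .collect()
}
```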
Updates the backfill position for the given vnode.
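For example (plain-Rust sketch continuing the hypothetical names above):

```rust
use std::collections::HashMap;

/// Hypothetical sketch: after a snapshot chunk is processed, the pk of its
/// last row becomes that vnode's new backfill position.
fn update_pos_by_vnode(
    backfill_pos: &mut HashMap<u16, Vec<i64>>,
    vnode: u16,
    new_pos: Vec<i64>,
) {
    backfill_pos.insert(vnode, new_pos);
}
```

In the executor this would be fed the position returned by a `get_new_pos`-style helper after each chunk is yielded.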