Trait BatchSourceSplit

pub trait BatchSourceSplit: SplitMetaData {
    // Required methods
    fn finished(&self) -> bool;
    fn finish(&mut self);
    fn refresh(&mut self);
}

§Refreshable Batch Source/Table

A refreshable batch source can be refreshed, that is, all of its data can be reloaded from the source (for example, by re-running a SELECT * query against it). The reloaded data is handled by RefreshableMaterialize, which calculates a diff to send downstream.

  • Batch means the source loads all data at once, instead of continuously streaming data.
  • The refreshable part is handled by the materialize executor. When a table is created with a refreshable batch source, it can be refreshed by running the REFRESH TABLE t SQL command.

See https://github.com/risingwavelabs/risingwave/issues/22690 for the whole picture of the user journey.

§Failover

A batch source is considered stateless, i.e., its consumption progress is not recorded and cannot be resumed. The split metadata only represents “how to load the data”.

  • On startup, SourceExecutor loads the data.
  • On a RefreshStart barrier (triggered by the REFRESH TABLE t SQL command), it reloads the data.
  • On recovery, it does nothing, regardless of whether it was in the middle of loading data before the crash.
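
The three cases above can be summarized as a small decision function. This is only an illustrative sketch: `SourceEvent` and `should_load` are hypothetical names introduced here and are not part of the actual SourceExecutor implementation.

```rust
/// Hypothetical events a batch source executor might observe.
/// These names are assumptions made for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SourceEvent {
    /// The executor starts for the first time.
    Startup,
    /// A RefreshStart barrier arrives (from `REFRESH TABLE t`).
    RefreshStart,
    /// The executor is restarted after a failure.
    Recovery,
}

/// Returns `true` when the event should trigger a full (re)load.
/// Recovery never triggers a load because no progress is recorded.
fn should_load(event: SourceEvent) -> bool {
    match event {
        SourceEvent::Startup | SourceEvent::RefreshStart => true,
        SourceEvent::Recovery => false,
    }
}
```

Because a load that was interrupted by a crash is not resumed under this model, the data would presumably be loaded again only on the next REFRESH TABLE t.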

Required Methods§

fn finished(&self) -> bool

fn finish(&mut self)

Mark the source as finished. Called after the source is exhausted. SourceExecutor then reports to meta so that a LoadFinish barrier is sent, and RefreshableMaterialize begins to calculate the diff.

fn refresh(&mut self)

Refresh the source to make it ready for a re-run. Called when a RefreshStart barrier is received.
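
As a rough illustration of how the three required methods might fit together, the sketch below uses a local stand-in trait, `BatchSplitLike`, which mirrors only the method signatures shown above and omits the `SplitMetaData` supertrait. `DemoSplit` and `load_all` are hypothetical, and resetting a boolean flag in `refresh` is an assumption about what "ready for a re-run" could mean for a simple split.

```rust
/// Local stand-in mirroring the three required methods.
/// The real trait also requires `SplitMetaData`, omitted here.
trait BatchSplitLike {
    fn finished(&self) -> bool;
    fn finish(&mut self);
    fn refresh(&mut self);
}

/// Hypothetical split that only tracks whether the current load is done.
#[derive(Debug, Default)]
struct DemoSplit {
    finished: bool,
}

impl BatchSplitLike for DemoSplit {
    fn finished(&self) -> bool {
        self.finished
    }

    /// Called once the source is exhausted.
    fn finish(&mut self) {
        self.finished = true;
    }

    /// Assumed behavior: clear the flag so the next load starts over.
    fn refresh(&mut self) {
        self.finished = false;
    }
}

/// Hypothetical driver: reads all rows once, then marks the split finished.
/// Returns the number of rows read, or 0 if the split was already finished.
fn load_all<S: BatchSplitLike>(split: &mut S, rows: &[&str]) -> usize {
    if split.finished() {
        return 0;
    }
    let count = rows.len();
    split.finish();
    count
}
```

In this sketch, calling `load_all` a second time without `refresh` reads nothing, which matches the idea that a finished batch source stays idle until a refresh is requested.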

Dyn Compatibility§

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.

Implementors§