risingwave_connector::parser

Trait Access

pub trait Access {
    // Required method
    fn access<'a>(
        &'a self,
        path: &[&str],
        type_expected: &DataType,
    ) -> Result<DatumCow<'a>, AccessError>;
}

Access to a field in the data structure. Created by AccessBuilder.

It’s the ENCODE ... part in FORMAT ... ENCODE ...

Required Methods


fn access<'a>(&'a self, path: &[&str], type_expected: &DataType) -> Result<DatumCow<'a>, AccessError>

Accesses path in the data structure (parsed Avro/JSON/Protobuf data), and then converts the result to a RisingWave Datum.

type_expected might or might not be used during the conversion depending on the implementation.
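As a rough illustration of the lookup-then-convert flow, the sketch below walks a path through a nested record and checks the leaf against an expected type. Everything in it (Value, ExpectedType, LookupError, RecordAccess, sample_row) is a hypothetical stand-in invented for this example; it does not use the real Access, DataType, DatumCow, or AccessError types, which are richer.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for parsed Avro/JSON/Protobuf data.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
    Record(HashMap<String, Value>),
}

// Hypothetical stand-in for the expected column type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ExpectedType {
    Int64,
    Varchar,
}

// Hypothetical stand-in for an access error.
#[derive(Debug, PartialEq)]
enum LookupError {
    Undefined(String),
    TypeMismatch(String),
}

struct RecordAccess {
    root: Value,
}

impl RecordAccess {
    /// Walks `path` through nested records, then checks the leaf against `expected`.
    fn access(&self, path: &[&str], expected: ExpectedType) -> Result<&Value, LookupError> {
        let mut current = &self.root;
        for key in path {
            current = match current {
                Value::Record(fields) => fields
                    .get(*key)
                    .ok_or_else(|| LookupError::Undefined(key.to_string()))?,
                // A non-record value has no fields to descend into.
                _ => return Err(LookupError::Undefined(key.to_string())),
            };
        }
        match (current, expected) {
            (Value::Int(_), ExpectedType::Int64) | (Value::Text(_), ExpectedType::Varchar) => {
                Ok(current)
            }
            _ => Err(LookupError::TypeMismatch(path.join("."))),
        }
    }
}

/// Builds a small Debezium-like record: {"before": {"id": 7, "name": "alice"}, "op": "u"}.
fn sample_row() -> RecordAccess {
    let mut before = HashMap::new();
    before.insert("id".to_string(), Value::Int(7));
    before.insert("name".to_string(), Value::Text("alice".to_string()));
    let mut root = HashMap::new();
    root.insert("before".to_string(), Value::Record(before));
    root.insert("op".to_string(), Value::Text("u".to_string()));
    RecordAccess { root: Value::Record(root) }
}
```

In this sketch the expected type is only used for a final check; a real implementation may instead use it to drive the conversion, or ignore it entirely.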

Path

We usually expect the data (the Access instance) to be a record (struct), with path representing a field path. The data (or part of it) represents the whole row (Vec<Datum>), and we use a different path for each column, accessing one column at a time.

TODO: the meaning of path is a little confusing and perhaps over-abstracted. access does not need to serve arbitrarily deep paths, only “top-level” access. The API creates the illusion that arbitrary access is supported, but it is not. Perhaps we should split out another trait such as ToDatum, which only does type mapping without caring about the path, and make path an enum instead of &[&str].

Which path to access is decided by the CDC layer, i.e., the FORMAT ... part (ChangeEvent). For example:

  • DebeziumChangeEvent accesses ["before", "col_name"] for value, ["source", "db"], ["source", "table"] etc. for additional columns’ values, ["op"] for op type.
  • MaxwellChangeEvent accesses ["data", "col_name"] for value, ["type"] for op type.
  • In the simplest case, FORMAT PLAIN/UPSERT (KvEvent) just accesses ["col_name"] for value, and the op type is derived.
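
The path choices listed above can be sketched as two small functions. This is only an illustration: the Format enum and the value_path and op_path helpers are hypothetical names made up for this example, not part of the crate, and the sketch covers only the paths named in the list.

```rust
/// Hypothetical enum naming the three cases from the list above.
#[derive(Debug, Clone, Copy)]
enum Format {
    Debezium,
    Maxwell,
    Plain,
}

/// Path that the format layer would pass to `access` for the value of column `col`.
fn value_path(format: Format, col: &str) -> Vec<&str> {
    match format {
        Format::Debezium => vec!["before", col],
        Format::Maxwell => vec!["data", col],
        Format::Plain => vec![col],
    }
}

/// Path for the op type, or `None` when the op type is derived rather than read.
fn op_path(format: Format) -> Option<Vec<&'static str>> {
    match format {
        Format::Debezium => Some(vec!["op"]),
        Format::Maxwell => Some(vec!["type"]),
        Format::Plain => None,
    }
}
```

Note that every path here has at most two segments, which matches the TODO's observation that access only needs to serve shallow lookups.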
Returns

The implementation should prefer to return a borrowed DatumRef through DatumCow::Borrowed to avoid unnecessary allocation where possible, especially for fields with string or bytes data. Otherwise, it may return an owned Datum through DatumCow::Owned.
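
The borrowed-versus-owned choice follows the same idea as the standard library's Cow. The sketch below uses std::borrow::Cow<str> as a stand-in for DatumCow, with a hypothetical Raw enum and as_varchar helper invented for the example: string data that already exists in the parsed input is borrowed, while a value that has to be converted is allocated and returned as owned.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for a parsed field value.
enum Raw {
    Text(String),
    Int(i64),
}

/// Returns the field as text, borrowing when the text already exists in `raw`.
fn as_varchar(raw: &Raw) -> Cow<'_, str> {
    match raw {
        // The string lives in the parsed data already: no allocation needed.
        Raw::Text(s) => Cow::Borrowed(s.as_str()),
        // The integer must be rendered into a new String: return owned data.
        Raw::Int(n) => Cow::Owned(n.to_string()),
    }
}
```

The lifetime on the return value ties the borrowed case to the input, mirroring how the 'a in DatumCow<'a> is tied to &'a self in the access signature.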

Implementors