risingwave_connector::source::iceberg::parquet_file_handler

Function get_project_mask

pub fn get_project_mask(
    columns: Option<Vec<Column>>,
    metadata: &FileMetaData,
) -> ConnectorResult<ProjectionMask>

Extracts a suitable ProjectionMask from a Parquet file schema based on the user’s requested schema.

This function performs column pruning for Parquet files. It compares the user’s requested schema with the schema of the Parquet file being read. If columns is None, or if the Parquet file contains nested data types, it returns ProjectionMask::all(). Otherwise, it returns a mask that selects only the columns whose name and data type both match the requested schema, so that only those columns are read into each RecordBatch.

§Parameters

  • columns: An optional vector of Column representing the user’s requested schema.
  • metadata: A reference to FileMetaData containing the schema and metadata of the Parquet file.

§Returns

  • A ConnectorResult<ProjectionMask> selecting the columns of the Parquet file schema that match the requested schema, or an error if processing fails.
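
§Example

The selection rule described above can be illustrated with a minimal, standard-library-only sketch. It does not use the real Column, FileMetaData, or ProjectionMask types: SimpleType, RequestedColumn, FileField, Mask, and project_mask_sketch are hypothetical stand-ins introduced only for illustration, and the actual implementation may differ in how it detects nested types and compares data types.

```rust
/// Hypothetical, simplified data type; the real function compares actual
/// Parquet/Arrow types against the requested schema.
#[derive(Debug, Clone, PartialEq)]
enum SimpleType {
    Int64,
    Utf8,
    Float64,
    /// Stands for any nested type (struct, list, map).
    Nested,
}

/// Hypothetical stand-in for a column in the user's requested schema.
#[derive(Debug, Clone)]
struct RequestedColumn {
    name: String,
    data_type: SimpleType,
}

/// Hypothetical stand-in for a top-level field of the Parquet file schema.
#[derive(Debug, Clone)]
struct FileField {
    name: String,
    data_type: SimpleType,
}

/// Hypothetical stand-in for `ProjectionMask`: read everything, or only the
/// fields at the given positions in the file schema.
#[derive(Debug, PartialEq)]
enum Mask {
    All,
    Indices(Vec<usize>),
}

/// Convenience constructor for a requested column.
fn col(name: &str, data_type: SimpleType) -> RequestedColumn {
    RequestedColumn { name: name.to_string(), data_type }
}

/// Convenience constructor for a file schema field.
fn field(name: &str, data_type: SimpleType) -> FileField {
    FileField { name: name.to_string(), data_type }
}

/// Sketch of the documented rule:
/// 1. no requested columns      -> read all columns;
/// 2. file has a nested type    -> read all columns;
/// 3. otherwise                 -> keep fields whose name AND type both match.
fn project_mask_sketch(columns: Option<&[RequestedColumn]>, file_schema: &[FileField]) -> Mask {
    let columns = match columns {
        Some(columns) => columns,
        None => return Mask::All,
    };
    if file_schema.iter().any(|f| f.data_type == SimpleType::Nested) {
        return Mask::All;
    }
    let indices = file_schema
        .iter()
        .enumerate()
        .filter(|(_, f)| {
            columns
                .iter()
                .any(|c| c.name == f.name && c.data_type == f.data_type)
        })
        .map(|(i, _)| i)
        .collect();
    Mask::Indices(indices)
}

fn main() {
    let file_schema = vec![
        field("id", SimpleType::Int64),
        field("name", SimpleType::Utf8),
        field("score", SimpleType::Float64),
    ];
    // "score" is requested with a different type, so it is not selected.
    let requested = vec![col("id", SimpleType::Int64), col("score", SimpleType::Utf8)];
    println!("{:?}", project_mask_sketch(Some(&requested), &file_schema));
    println!("{:?}", project_mask_sketch(None, &file_schema));
}
```

In this sketch a column requested with a mismatched type is simply left out of the mask rather than reported as an error; whether the real function treats any such case as an error is not stated in this description.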