pub fn extract_valid_column_indices(
columns: Option<Vec<Column>>,
metadata: &FileMetaData,
) -> ConnectorResult<Vec<usize>>
Extracts valid column indices from a Parquet file schema based on the user’s requested schema.
This function is used for column pruning of Parquet files. It calculates the intersection
between the columns in the currently read Parquet file and the schema provided by the user.
This is useful for reading a RecordBatch with the appropriate ProjectionMask, ensuring that only the necessary columns are read.
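The intersection described above can be sketched as follows. This is a minimal, standard-library-only illustration and not the crate's actual implementation: `RequestedColumn` and `valid_column_indices_sketch` are hypothetical names, the file schema is reduced to a slice of column names, matching is done by name only, and treating `None` as "keep every column" is an assumption, since the behavior for a missing column list is not documented here.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the crate's `Column` type; only the name is modelled.
#[derive(Debug, Clone, PartialEq)]
pub struct RequestedColumn {
    pub name: String,
}

/// Returns the positions in `file_columns` whose names also appear in `requested`.
///
/// Assumption: `None` means "no pruning requested", so every index is returned.
/// Indices are reported in file-schema order, not in request order.
pub fn valid_column_indices_sketch(
    requested: Option<&[RequestedColumn]>,
    file_columns: &[&str],
) -> Vec<usize> {
    match requested {
        // No requested schema: keep all columns of the file (assumed behavior).
        None => (0..file_columns.len()).collect(),
        Some(cols) => {
            // Collect the requested names once so each lookup is O(1).
            let wanted: HashSet<&str> = cols.iter().map(|c| c.name.as_str()).collect();
            file_columns
                .iter()
                .enumerate()
                // Keep only file columns that the user asked for.
                .filter(|(_, name)| wanted.contains(**name))
                .map(|(idx, _)| idx)
                .collect()
        }
    }
}
```

Requested columns that do not exist in the file are simply skipped in this sketch. With the real `parquet` crate, indices of this kind are typically what a ProjectionMask is built from.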
§Parameters
- columns: An optional vector of Column representing the user’s requested schema.
- metadata: A reference to FileMetaData containing the schema and metadata of the Parquet file.
§Returns
- A ConnectorResult<Vec<usize>>, which contains the indices of the valid columns in the Parquet file schema that match the requested schema. If an error occurs during processing, it returns an appropriate error.