Module iceberg_intermediate_scan_rule

This rule materializes a LogicalIcebergIntermediateScan into the final LogicalIcebergScan, wrapped in anti-joins against the delete files.

This is the final step in the Iceberg scan optimization pipeline:

  1. LogicalSource -> LogicalIcebergIntermediateScan
  2. Predicate pushdown and column pruning on LogicalIcebergIntermediateScan
  3. LogicalIcebergIntermediateScan -> LogicalIcebergScan (this rule)

At this point, the intermediate scan has accumulated:

  • The predicate to be pushed down to Iceberg
  • The output column indices for projection

This rule:

  1. Reads file scan tasks from Iceberg (data files and delete files)
  2. Creates the LogicalIcebergScan for data files with pre-computed splits
  3. Creates anti-joins for equality delete and position delete files
  4. Adds a project if output columns differ from scan columns
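
The effect of the anti-joins in step 3 can be illustrated with a minimal, self-contained sketch. It is not the rule's implementation: the `Row` type and both function names are hypothetical, rows are filtered in memory instead of being planned as joins, and Iceberg's sequence-number rules for which delete files apply to which data files are ignored. It only shows the semantics: a row survives if it has no match on the delete side.

```rust
use std::collections::HashSet;

/// Hypothetical row model: the data file a row came from, its position
/// within that file, and a single key column used by equality deletes.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    file_path: String,
    pos: u64,
    key: i64,
}

/// Position-delete anti-join: keep rows whose (file_path, pos) pair
/// does NOT appear on the delete side.
fn apply_position_deletes(rows: Vec<Row>, deletes: &HashSet<(String, u64)>) -> Vec<Row> {
    rows.into_iter()
        .filter(|r| !deletes.contains(&(r.file_path.clone(), r.pos)))
        .collect()
}

/// Equality-delete anti-join: keep rows whose key does NOT match any
/// deleted key. Real equality deletes may compare several columns.
fn apply_equality_deletes(rows: Vec<Row>, deleted_keys: &HashSet<i64>) -> Vec<Row> {
    rows.into_iter()
        .filter(|r| !deleted_keys.contains(&r.key))
        .collect()
}
```

Chaining the two filters mirrors stacking two anti-joins on top of the data-file scan. In the actual plan the delete files are scanned as join inputs, presumably by the two `build_*_delete_hashjoin_scan` functions listed below, instead of being collected into in-memory sets.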

Structs

IcebergIntermediateScanRule

Functions

build_column_catalogs 🔒
Builds a mapping of column names to their catalogs by looking them up from a catalog map.
build_equal_conditions 🔒
Builds equality conditions between two sets of input references.
build_equality_delete_hashjoin_scan
build_position_delete_hashjoin_scan
empty_table_plan 🔒
Returns an empty table plan with the same schema as the scan.
set_project_field_ids 🔒
Sets the project field IDs for a list of files based on column names.
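
As a rough illustration of what "equality conditions between two sets of input references" could look like, here is a hedged sketch. The `Expr` enum and the `equal_conditions` function are hypothetical stand-ins, not the signature of `build_equal_conditions` or the crate's expression types. The sketch pairs the i-th left reference with the i-th right reference and folds the resulting equalities into a single conjunction.

```rust
/// Hypothetical, simplified expression tree (not the crate's own type).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// Reference to an input column by index.
    InputRef(usize),
    Equal(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Pairs left[i] with right[i] and ANDs the equalities together.
/// Returns None when there are no columns to compare.
fn equal_conditions(left: &[usize], right: &[usize]) -> Option<Expr> {
    assert_eq!(left.len(), right.len(), "both sides need the same number of columns");
    left.iter()
        .zip(right)
        .map(|(&l, &r)| {
            Expr::Equal(Box::new(Expr::InputRef(l)), Box::new(Expr::InputRef(r)))
        })
        // Left-fold: ((eq0 AND eq1) AND eq2) ...
        .reduce(|acc, eq| Expr::And(Box::new(acc), Box::new(eq)))
}
```

For example, `equal_conditions(&[0, 1], &[3, 4])` yields `(#0 = #3) AND (#1 = #4)`. Whether right-side indices need to be offset by the left input's column count depends on how the join condition is indexed, which this sketch does not model.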