pub struct DataChunk {
columns: Arc<[ArrayRef]>,
visibility: Bitmap,
}
Fields§
§columns: Arc<[ArrayRef]>
§visibility: Bitmap
Implementations§
impl DataChunk
pub(crate) const PRETTY_TABLE_PRESET: &'static str = "||--+-++| ++++++"
pub fn new(columns: Vec<ArrayRef>, visibility: impl Into<Bitmap>) -> Self
Create a DataChunk with columns and visibility.
The visibility can either be a Bitmap or a simple cardinality number.
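The "Bitmap or cardinality" choice comes from the `impl Into<Bitmap>` parameter. The following is a minimal sketch of that pattern, not the crate's code: `SketchBitmap` and `sketch_visibility` are hypothetical stand-ins, and treating a bare cardinality as "that many rows, all visible" is an assumption.

```rust
/// Hypothetical stand-in for the crate's `Bitmap`: one flag per row.
#[derive(Debug, Clone, PartialEq)]
struct SketchBitmap(Vec<bool>);

impl From<usize> for SketchBitmap {
    /// Assumption: a bare cardinality means "this many rows, all visible".
    fn from(cardinality: usize) -> Self {
        SketchBitmap(vec![true; cardinality])
    }
}

impl From<Vec<bool>> for SketchBitmap {
    /// An explicit mask is taken as given.
    fn from(bits: Vec<bool>) -> Self {
        SketchBitmap(bits)
    }
}

/// Accepts either form, mirroring the shape of the `visibility` parameter.
fn sketch_visibility(visibility: impl Into<SketchBitmap>) -> SketchBitmap {
    visibility.into()
}
```

With this shape, callers that have no hidden rows can pass a row count instead of building a mask by hand.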
pub fn new_dummy(cardinality: usize) -> Self
new_dummy creates a data chunk that has no columns, only a cardinality.
pub fn from_rows(rows: &[impl Row], data_types: &[DataType]) -> Self
Build a DataChunk from rows.
Panics if rows is empty.
Prefer DataChunkBuilder where possible, to avoid the unnecessary allocation of rows.
pub fn next_visible_row_idx(&self, row_idx: usize) -> Option<usize>
Return the next visible row index on or after row_idx.
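As a rough illustration of "on or after", here is a sketch over a plain `&[bool]` mask. The function name and the slice-based mask are assumptions made for the example; the real method works on the chunk's Bitmap.

```rust
/// Hypothetical helper: scan forward from `row_idx` (inclusive) for a visible row.
fn next_visible_row_idx_sketch(visibility: &[bool], row_idx: usize) -> Option<usize> {
    // An empty range (row_idx past the end) simply yields `None`.
    (row_idx..visibility.len()).find(|&i| visibility[i])
}
```

Because the search is inclusive, passing an index that is already visible returns that same index.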
pub fn into_parts(self) -> (Vec<ArrayRef>, Bitmap)
pub fn into_parts_v2(self) -> (Arc<[ArrayRef]>, Bitmap)
pub fn from_parts(columns: Arc<[ArrayRef]>, visibilities: Bitmap) -> Self
pub fn dimension(&self) -> usize
pub fn cardinality(&self) -> usize
cardinality returns the number of visible tuples.
pub fn selectivity(&self) -> f64
pub fn with_visibility(&self, visibility: impl Into<Bitmap>) -> Self
pub fn visibility(&self) -> &Bitmap
pub fn set_visibility(&mut self, visibility: Bitmap)
pub fn is_compacted(&self) -> bool
pub fn column_at(&self, idx: usize) -> &ArrayRef
pub fn columns(&self) -> &[ArrayRef]
pub fn data_types(&self) -> Vec<DataType>
Returns the data types of all columns.
pub fn split_column_at(&self, idx: usize) -> (Self, Self)
pub fn to_protobuf(&self) -> PbDataChunk
pub fn compact(self) -> Self
compact converts the chunk to compact format. Compacting removes the hidden rows and returns a new visibility mask that reflects this.
compact has trade-offs:
Cost: It has to rebuild each column, which incurs the cost of copying bytes from the original column array to the new one.
Benefit: The data chunk is smaller and takes up less memory. It also saves the cost of iterating over many hidden rows.
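The copy cost and the size benefit can be seen in a small model. This is a sketch only: columns are plain `Vec<i64>`, the mask is `&[bool]`, and `compact_sketch` is a hypothetical name. It also assumes the new mask marks every remaining row as visible.

```rust
/// Hypothetical model of compaction: rebuild each column without hidden rows.
fn compact_sketch(columns: &[Vec<i64>], visibility: &[bool]) -> (Vec<Vec<i64>>, Vec<bool>) {
    // Cost: every column is copied into a new, shorter vector.
    let compacted: Vec<Vec<i64>> = columns
        .iter()
        .map(|col| {
            col.iter()
                .zip(visibility)
                .filter(|(_, vis)| **vis)
                .map(|(&value, _)| value)
                .collect()
        })
        .collect();
    // Benefit: the result has no hidden rows left, so the mask is all-true.
    let cardinality = visibility.iter().filter(|&&vis| vis).count();
    (compacted, vec![true; cardinality])
}
```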
pub fn uncompact(self, vis: Bitmap) -> Self
Scatter a compacted chunk to a new chunk with the given visibility.
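Scattering can be pictured as the inverse of compaction. The sketch below assumes a single `i64` column and fills hidden slots with `None`; the function name and the choice of placeholder are assumptions, not the crate's behavior as documented here.

```rust
/// Hypothetical model: place compacted values at the visible positions of `vis`.
fn uncompact_sketch(compacted: &[i64], vis: &[bool]) -> Vec<Option<i64>> {
    let visible_rows = vis.iter().filter(|&&v| v).count();
    assert_eq!(compacted.len(), visible_rows, "one value per visible row");
    let mut values = compacted.iter().copied();
    // Visible slots consume the next compacted value; hidden slots get a placeholder.
    vis.iter()
        .map(|&visible| if visible { values.next() } else { None })
        .collect()
}
```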
pub fn compact_cow(&self) -> Cow<'_, Self>
Convert the chunk to compact format.
If the chunk is not compacted, return a new compacted chunk, otherwise return a reference to self.
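The borrow-or-copy behavior follows the usual `Cow` pattern. A minimal sketch, assuming a hypothetical single-column `SketchChunk` and treating "compacted" as "every row visible":

```rust
use std::borrow::Cow;

/// Hypothetical single-column chunk used only for this illustration.
#[derive(Debug, Clone, PartialEq)]
struct SketchChunk {
    column: Vec<i64>,
    visibility: Vec<bool>,
}

impl SketchChunk {
    /// Assumption: a chunk counts as compacted when no row is hidden.
    fn is_compacted(&self) -> bool {
        self.visibility.iter().all(|&vis| vis)
    }

    /// Borrow when nothing needs to change; allocate only when rows are hidden.
    fn compact_cow(&self) -> Cow<'_, Self> {
        if self.is_compacted() {
            return Cow::Borrowed(self);
        }
        let column: Vec<i64> = self
            .column
            .iter()
            .zip(&self.visibility)
            .filter(|(_, vis)| **vis)
            .map(|(&value, _)| value)
            .collect();
        let visibility = vec![true; column.len()];
        Cow::Owned(SketchChunk { column, visibility })
    }
}
```

The caller can read through the `Cow` either way, so already-compact chunks pay no copying cost.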
pub fn from_protobuf(proto: &PbDataChunk) -> ArrayResult<Self>
pub fn rechunk(chunks: &[DataChunk], each_size_limit: usize) -> ArrayResult<Vec<DataChunk>>
rechunk creates a new vector of data chunks, each of size each_size_limit.
When the total cardinality of all the chunks is not evenly divisible by each_size_limit, the last new chunk holds the remainder.
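The remainder rule is easy to check on a toy model. This sketch assumes each input chunk is a single, already compacted `Vec<i64>` column and that `each_size_limit` is non-zero; `rechunk_sketch` is a hypothetical name.

```rust
/// Hypothetical model of rechunking: regroup all rows into pieces of a fixed size.
fn rechunk_sketch(chunks: &[Vec<i64>], each_size_limit: usize) -> Vec<Vec<i64>> {
    assert!(each_size_limit > 0, "size limit must be positive");
    // Concatenate the rows of every input chunk in order.
    let rows: Vec<i64> = chunks.iter().flatten().copied().collect();
    // Split into full-size pieces; the final piece holds whatever is left over.
    rows.chunks(each_size_limit).map(|piece| piece.to_vec()).collect()
}
```

For seven rows and a limit of three, this yields two full chunks and a last chunk with a single row.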
pub fn get_hash_values<H: BuildHasher>(&self, column_idxes: &[usize], hasher_builder: H) -> Vec<HashCode<H>>
Compute hash values for each row. The number of returned HashCodes is self.capacity().
When skip_invisible_row is true, the HashCode for the invisible rows is arbitrary.
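Per-row hashing over selected columns might look roughly like the sketch below. It uses only the standard library's `BuildHasher`, plain `i64` columns, and `u64` results in place of `HashCode<H>`; it hashes every row and does not model visibility. All names are hypothetical.

```rust
use std::hash::{BuildHasher, Hash, Hasher};

/// Hypothetical model: one hash per row, fed by the values of the chosen columns.
fn row_hashes_sketch<H: BuildHasher>(
    columns: &[Vec<i64>],
    column_idxes: &[usize],
    hasher_builder: &H,
) -> Vec<u64> {
    // One output per row slot, so the result length equals the row count.
    let capacity = columns.first().map_or(0, |col| col.len());
    (0..capacity)
        .map(|row| {
            let mut hasher = hasher_builder.build_hasher();
            for &idx in column_idxes {
                columns[idx][row].hash(&mut hasher);
            }
            hasher.finish()
        })
        .collect()
}
```

Rows that agree on all selected columns get the same hash, regardless of what the unselected columns hold.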
pub fn row_at(&self, pos: usize) -> (RowRef<'_>, bool)
Random access a tuple in a data chunk. Return in a row format.
§Arguments
pos - Index of the tuple to look up
RowRef - Reference to the data tuple
bool - Whether this tuple is visible
pub fn row_at_unchecked_vis(&self, pos: usize) -> RowRef<'_>
Random access a tuple in a data chunk. Return in a row format. Note that this function does not return whether the row is visible.
§Arguments
pos - Index of the tuple to look up
pub fn to_pretty(&self) -> impl Display
Returns a table-like text representation of the DataChunk.
pub fn keep_columns(&self, column_indices: &[usize]) -> Self
Keep the specified columns and set the remaining elements to null.
§Example
i i i                            i i i
1 2 3  --> keep_columns([1]) --> . 2 .
4 5 6                            . 5 .
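The example above can be reproduced with a small model in which each column is a `Vec<Option<i64>>` and null is `None`. This is a sketch under those assumptions; `keep_columns_sketch` is a hypothetical name.

```rust
/// Hypothetical model: columns not listed in `column_indices` become all-null.
fn keep_columns_sketch(
    columns: &[Vec<Option<i64>>],
    column_indices: &[usize],
) -> Vec<Vec<Option<i64>>> {
    columns
        .iter()
        .enumerate()
        .map(|(idx, col)| {
            if column_indices.contains(&idx) {
                col.clone()
            } else {
                // Same length as the original column, but every value is null.
                vec![None; col.len()]
            }
        })
        .collect()
}
```

Note that the number of columns does not change; only their contents do.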
pub fn project(&self, indices: &[usize]) -> Self
Reorder (and possibly remove) columns.
e.g. if indices is [2, 1, 0], and the chunk contains column [a, b, c], then the output will be [c, b, a]. If indices is [2, 0], then the output will be [c, a].
If the input mapping is identity mapping, no reorder will be performed.
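The index mapping described above amounts to picking columns by position. A minimal sketch, with `project_sketch` as a hypothetical generic helper over any cloneable column type:

```rust
/// Hypothetical model: output column k is input column `indices[k]`.
fn project_sketch<T: Clone>(columns: &[T], indices: &[usize]) -> Vec<T> {
    indices.iter().map(|&i| columns[i].clone()).collect()
}
```

Indices that are left out drop the corresponding column, which is how projection can remove columns as well as reorder them.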
pub fn project_with_vis(&self, indices: &[usize], visibility: Bitmap) -> Self
Reorder columns and set visibility.
pub fn reorder_rows(&self, indexes: &[usize]) -> Self
Reorder rows by indexes.
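One plausible reading, sketched below, is that output row k is taken from input row `indexes[k]` in every column. That interpretation is an assumption, as are the plain `Vec<i64>` columns and the name `reorder_rows_sketch`.

```rust
/// Hypothetical model: apply the same row permutation to every column.
fn reorder_rows_sketch(columns: &[Vec<i64>], indexes: &[usize]) -> Vec<Vec<i64>> {
    columns
        .iter()
        .map(|col| indexes.iter().map(|&i| col[i]).collect())
        .collect()
}
```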
fn partition_sizes(&self) -> (usize, Vec<&ArrayRef>)
§Partition fixed size datums and variable length ones.
In some cases the size is fixed for the entire column, namely when the data type is fixed-size or the datums are constants. The size can then be computed just once for the column.
Otherwise, for variable-sized data types such as varchar, the sizes have to be computed individually per row.
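The split can be sketched over a list of column types. Everything here is illustrative: `SketchType`, the byte widths, and returning column indices instead of array references are assumptions, not the crate's value-encoding sizes.

```rust
/// Hypothetical column types for the illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchType {
    Int32,
    Int64,
    Varchar,
}

/// Illustrative widths only; `None` marks a variable-length type.
fn fixed_width(ty: SketchType) -> Option<usize> {
    match ty {
        SketchType::Int32 => Some(4),
        SketchType::Int64 => Some(8),
        SketchType::Varchar => None,
    }
}

/// Sum the fixed widths once, and collect the columns that need per-row sizing.
fn partition_sizes_sketch(types: &[SketchType]) -> (usize, Vec<usize>) {
    let mut fixed_total = 0;
    let mut variable_cols = Vec::new();
    for (idx, &ty) in types.iter().enumerate() {
        match fixed_width(ty) {
            Some(width) => fixed_total += width,
            None => variable_cols.push(idx),
        }
    }
    (fixed_total, variable_cols)
}
```

A row's total size is then the fixed total plus the per-row sizes of only the variable columns.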
unsafe fn compute_size_of_variable_cols_in_row( variable_cols: &[&ArrayRef], row_idx: usize, ) -> usize
unsafe fn init_buffer(row_len_fixed: usize, variable_cols: &[&ArrayRef], row_idx: usize) -> Vec<u8>
pub fn serialize(&self) -> Vec<Bytes>
Serialize each row into value encoding bytes.
The returned vector's size is self.capacity(), and each invisible row yields an empty Bytes.
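The shape of the result (one entry per row slot, empty for hidden rows) can be shown with a toy encoder. This sketch uses a single `i64` column and little-endian bytes, which is an illustrative encoding and not the crate's value encoding; `serialize_sketch` is a hypothetical name.

```rust
/// Hypothetical model: one byte buffer per row slot, empty for invisible rows.
fn serialize_sketch(column: &[i64], visibility: &[bool]) -> Vec<Vec<u8>> {
    column
        .iter()
        .zip(visibility)
        .map(|(value, &visible)| {
            if visible {
                // Illustrative encoding only.
                value.to_le_bytes().to_vec()
            } else {
                Vec::new()
            }
        })
        .collect()
}
```

Keeping an empty placeholder for hidden rows means output positions line up with row indices in the chunk.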
pub fn serialize_with(&self, serializer: &impl ValueRowSerializer) -> Vec<Bytes>
Serialize each row into bytes with given serializer.
This is similar to serialize but it uses a custom serializer. Prefer serialize if possible since it might be more efficient due to columnar operations.
pub fn estimate_value_encoding_size(&self, column_indices: &[usize]) -> usize
Estimate size of hash keys. Their indices in a row are indicated by column_indices.
Size here refers to the number of u8s required to store the serialized datum.
impl DataChunk
pub fn rows(&self) -> DataChunkRefIter<'_>
Get an iterator for visible rows.
pub fn rows_in(&self, range: Range<usize>) -> DataChunkRefIter<'_>
Get an iterator for visible rows in range.
pub fn rows_with_holes(&self) -> DataChunkRefIterWithHoles<'_>
Get an iterator for all rows in the chunk, and a None represents an invisible row.
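The difference between the two iteration styles can be sketched on a single `i64` column with a `&[bool]` mask. Both function names are hypothetical, and plain values stand in for row references.

```rust
/// Hypothetical model of `rows`: hidden rows are skipped entirely.
fn rows_sketch<'a>(
    column: &'a [i64],
    visibility: &'a [bool],
) -> impl Iterator<Item = i64> + 'a {
    column
        .iter()
        .zip(visibility)
        .filter(|(_, vis)| **vis)
        .map(|(&value, _)| value)
}

/// Hypothetical model of `rows_with_holes`: every slot is yielded, hidden ones as `None`.
fn rows_with_holes_sketch<'a>(
    column: &'a [i64],
    visibility: &'a [bool],
) -> impl Iterator<Item = Option<i64>> + 'a {
    column
        .iter()
        .zip(visibility)
        .map(|(&value, &vis)| if vis { Some(value) } else { None })
}
```

The second form preserves row positions, which matters when the output must stay aligned with another per-row sequence.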
Trait Implementations§
impl DataChunkTestExt for DataChunk
fn from_pretty(s: &str) -> Self
fn with_invisible_holes(self) -> Self where Self: Sized
fn assert_valid(&self)
fn gen_data_chunk(chunk_offset: usize, chunk_size: usize, data_types: &[DataType], varchar_properties: &VarcharProperty, visibility_percent: f64) -> Self
chunk_size and column data types.
fn gen_data_chunks(num_of_chunks: usize, chunk_size: usize, data_types: &[DataType], varchar_properties: &VarcharProperty, visibility_percent: f64) -> Vec<Self>
chunk_size and column data types.
impl EstimateSize for DataChunk
fn estimated_heap_size(&self) -> usize
fn estimated_size(&self) -> usize where Self: Sized
estimated_heap_size and the size of Self.
impl<'a> From<&'a StructArray> for DataChunk
fn from(array: &'a StructArray) -> Self
impl From<DataChunk> for StreamChunk
StreamChunk can be created from DataChunk with all operations set to Insert.
impl From<DataChunk> for StructArray
impl StructuralPartialEq for DataChunk
Auto Trait Implementations§
impl Freeze for DataChunk
impl RefUnwindSafe for DataChunk
impl Send for DataChunk
impl Sync for DataChunk
impl Unpin for DataChunk
impl UnwindSafe for DataChunk
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
unsafe fn clone_to_uninit(&self, dst: *mut T)
This is a nightly-only experimental API. (clone_to_uninit)
impl<T> FutureExt for T
fn with_context(self, otel_cx: Context) -> WithContext<Self>
fn with_current_context(self) -> WithContext<Self>
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> IntoRequest<T> for T
fn into_request(self) -> Request<T>
Wrap the input message T in a tonic::Request
impl<T> IntoResult<T> for T
type Err = Infallible
fn into_result(self) -> Result<T, <T as IntoResult<T>>::Err>
impl<M> MetricVecRelabelExt for M
fn relabel(self, metric_level: MetricLevel, relabel_threshold: MetricLevel) -> RelabeledMetricVec<M>
RelabeledMetricVec::with_metric_level.
fn relabel_n(self, metric_level: MetricLevel, relabel_threshold: MetricLevel, relabel_num: usize) -> RelabeledMetricVec<M>
RelabeledMetricVec::with_metric_level_relabel_n.
fn relabel_debug_1(self, relabel_threshold: MetricLevel) -> RelabeledMetricVec<M>
RelabeledMetricVec::with_metric_level_relabel_n with metric_level set to MetricLevel::Debug and relabel_num set to 1.