risingwave_stream::cache

Function cache_may_stale

source
pub(crate) fn cache_may_stale(
    previous_vnode_bitmap: &Bitmap,
    current_vnode_bitmap: &Bitmap,
) -> bool
Expand description

Returns whether we’re unsure about the fressness of the cache after the scaling from the previous partition to the current one, denoted by vnode bitmaps. If the value is true, we must evict the cache entries that does not belong to the previous partition before further processing.

TODO: currently most executors simply clear all the cache entries if true, we can make the cache aware of the consistent hashing to avoid unnecessary eviction in the future. Check this issue.

TODO: may encapsulate the logic into ExecutorCache when ready.

§Explanation

We use a lazy manner to manipulate the cache. When scaling out, the partition of the existing executors will likely shrink and becomes a subset of the previous one (to ensure the best locality). In this case, this function will return false and the cache entries that’re not in the current partition anymore are still kept. This achieves the best performance as we won’t touch and valiate the cache at all when scaling-out, which is the common case and the critical path.

This brings a problem when scaling in after a while. Some partitions may be reassigned back to the current executor, while the cache entries of these partitions are still unevicted. So it’s possible that these entries have been updated by other executors on other workers, and the content is now stale! The executor must evict these entries which are not in the previous partition before further processing.