Memory (Heap) Profiling Guide
Note that the content below applies to Linux only.
What is a Heap Profile?
A heap profiler records the stack trace at the point where each live object was allocated. So if function A allocates an object and then hands it over to struct B, the allocation is still attributed to A.
Internals
On Linux, RisingWave uses tikv-jemallocator, a Rust wrapper of jemalloc, as its memory allocator. On other platforms, RisingWave uses the default allocator.
Luckily, jemalloc provides built-in profiling support (see the official wiki), and jemallocator exposes it via the cargo feature 'profiling'. Here is a simple guide to profiling with jemallocator.
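For a standalone Rust binary that depends on tikv-jemallocator, the setup looks roughly like this (a sketch only; the binary name is a placeholder, and the exact environment variable depends on how jemalloc is built):
# Build with the `profiling` feature of tikv-jemallocator enabled.
cargo build --release --features tikv-jemallocator/profiling
# Activate the profiler at runtime via jemalloc's config variable.
# Depending on the build, this may be MALLOC_CONF or the prefixed _RJEM_MALLOC_CONF.
MALLOC_CONF=prof:true,lg_prof_interval:32 ./target/release/your-binary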
For RisingWave, Pull Request #4871 (feat: support heap profiling from risedev, by fuyufjh) added everything needed. Just follow the steps below.
Step 1 - Collect Memory Profiling Dump
Depending on the deployment, click the corresponding section to read the instructions.
1.1. Profile RisingWave (locally) with risedev
Run a local cluster in an EC2 instance with an additional environment variable RISEDEV_ENABLE_HEAP_PROFILE:
RISEDEV_ENABLE_HEAP_PROFILE=1 ./risedev d full
Here we use full instead of compose-3node-deploy, because compose-3node-deploy runs RisingWave processes in Docker containers, which makes profiling and analysis more difficult.
Under the hood, risedev sets the environment variable MALLOC_CONF for the RisingWave processes. Here is the implementation.
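Conceptually, the value it sets looks something like this (values here are illustrative; the authoritative ones are in the linked implementation):
# prof:true enables the profiler; lg_prof_interval:32 dumps a profile every 2^32 bytes (~4 GiB)
# of allocation; prof_prefix controls the dump file name prefix.
MALLOC_CONF=prof:true,lg_prof_interval:32,prof_prefix:compute-node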
By default, the profiler outputs a profile result for every 4 GB of memory allocation. After running a query and waiting for a while, lots of .heap files will be generated in the current folder:
...
compactor.266308.15.i15.heap
compactor.266308.16.i16.heap
compactor.266308.17.i17.heap
compactor.266308.18.i18.heap
...
compute-node.266187.116.i116.heap
compute-node.266187.117.i117.heap
compute-node.266187.118.i118.heap
compute-node.266187.119.i119.heap
...
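The file names follow jemalloc's convention prof_prefix.<pid>.<seq>.i<seq>.heap, where the i<seq> suffix marks interval-triggered dumps. To grab the most recent one, for example:
# Pick the most recent compute-node dump (each file is a full snapshot; see Step 2).
ls -t compute-node.*.heap | head -n 1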
1.2. Profile RisingWave in testing pipelines
Currently, some testing pipelines, such as the longevity tests, have memory profiling enabled by default, while others, such as the performance benchmarks, do not.
To enable heap profiling of compute nodes in benchmark pipelines, set the following environment variable when starting a job:
ENABLE_MEMORY_PROFILING=true
Under the hood, the pipeline script passes the value to kube-bench's parameter benchmark.risingwave.compute.memory_profiling.enable (code here), and then kube-bench sets the environment variable on the RisingWave Pods (code here).
Note that this only applies to compute nodes. If you need to run profiling on other nodes, or need to tune the profiling parameters, you can modify the parameters in risingwave-test's env.override.toml manually and run the job with that branch. (Example)
1.3. Profile RisingWave in Kubernetes/EKS
If you run into an OOM issue in Kubernetes, you will need to enable memory profiling first and then reproduce the problem.
To enable memory profiling, set the environment variable MALLOC_CONF on the Pods.
# Example: `statefulsets` for CN and Meta
kubectl edit statefulsets/benchmark-risingwave-compute-c
# Example: `deployments` for other nodes
kubectl edit deployments/benchmark-risingwave-connector-c
Add the MALLOC_CONF env var. Note that prof_prefix specifies the path and file name prefix of the dumps. By default, /risingwave/cache/ is mounted to a HostPath and persists across Pod restarts, so we use it as the dump path here.
env:
- name: MALLOC_CONF
value: prof:true,lg_prof_interval:38,lg_prof_sample:19,prof_prefix:/risingwave/cache/cn
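For reference, the lg_* options are base-2 logarithms of byte counts, so the example above roughly means the following:
# prof:true             enable jemalloc's heap profiler
# lg_prof_sample:19     sample allocations about once every 2^19 bytes (~512 KiB) allocated
# lg_prof_interval:38   dump a profile every 2^38 bytes (~256 GiB) of allocation activity
# prof_prefix:...       directory and file name prefix for the dump files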
The suggested values of lg_prof_interval are different for different nodes. See the risedev code: compactor_service, compute_node_service.rs, meta_node_service.rs.
Afterwards, the memory dumps should be written to the specified folder. Use kubectl cp to download them to your local machine.
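For example (the Pod name and local directory below are illustrative, following the StatefulSet above):
# Copy all dump files from the compute node Pod's cache directory to the local machine.
kubectl cp benchmark-risingwave-compute-c-0:/risingwave/cache/ ./heap-dumps/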
1.4. Dump memory profile with risectl
You can manually dump a heap profile with risectl for a compute node with jemalloc profiling enabled (MALLOC_CONF=prof:true).
./risedev ctl profile heap --dir [dumped_file_dir]
The dumped files will be saved in the directory you specified.
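For example (the directory name here is arbitrary):
# Dump heap profiles of the compute node into ./heap-dumps/.
./risedev ctl profile heap --dir ./heap-dumps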
Note: To profile compute nodes remotely, please make sure all remote nodes have a public IP address accessible from your local machine (where you are running risedev).
Step 2 - Analyze with jeprof
Note that each of the .heap files is a full snapshot instead of an increment. Hence, simply pick the latest file (or any historical snapshot).
jeprof is a utility provided by jemalloc to analyze heap dump files. It reads both the executable binary and the heap dump to produce a full heap profile. Note that a heap dump file must be analyzed along with exactly the same binary that it was generated from. If the memory dump was collected from Kubernetes, please refer to 2.2.
2.1. Use jeprof locally
jeprof is shipped with jemallocator and is built by cargo as part of the build. Use it as follows:
# find jeprof binary
find . -name 'jeprof'
# set execution permission
chmod +x ./target/release/build/tikv-jemalloc-sys-22f0d47d5c562226/out/build/bin/jeprof
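Optionally, copy it to a directory on your PATH so that later steps can simply call jeprof (the destination below is just an example):
# Make jeprof available on PATH and check that it runs.
cp ./target/release/build/tikv-jemalloc-sys-22f0d47d5c562226/out/build/bin/jeprof ~/.local/bin/
jeprof --help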
Faster jeprof (recommended)
On some platforms jeprof runs very slowly. The bottleneck is addr2line; to speed it up from 30 minutes to about 3 seconds, use the gimli-rs implementation:
git clone https://github.com/gimli-rs/addr2line
cd addr2line
cargo b --examples -r
cp ./target/release/examples/addr2line <your-path>
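jeprof resolves addr2line through your PATH, so one way to make it pick up the faster build is to place it ahead of the system binary (paths below are illustrative):
# Shadow the system addr2line with the gimli-rs build.
cp ./target/release/examples/addr2line ~/.local/bin/addr2line
export PATH="$HOME/.local/bin:$PATH"
command -v addr2line   # should now point at the gimli-rs build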
2.2. Use jeprof in Docker images
jeprof is included in RisingWave images v1.0.0 or later. For earlier versions, please copy a jeprof binary into the container manually.
Find a Linux machine and use the docker command to start an environment with the specific RisingWave version. Here, -v $(pwd):/dumps mounts the current directory to the /dumps folder inside the container, so that you don't need to copy the files in and out.
docker run -it --rm --entrypoint /bin/bash -v $(pwd):/dumps ghcr.io/risingwavelabs/risingwave:latest
Generate the collapsed file.
jeprof --collapsed binary_file heap_file > heap_file.collapsed
For example:
jeprof --collapsed /risingwave/bin/risingwave jeprof.198272.123.i123.heap > jeprof.198272.123.i123.heap.collapsed
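If you started the container with -v $(pwd):/dumps as above, the heap files are available under /dumps and the result can be written straight back to the host (file names follow the earlier example):
# Read the dump from the mounted directory and write the collapsed file next to it.
jeprof --collapsed /risingwave/bin/risingwave /dumps/jeprof.198272.123.i123.heap > /dumps/jeprof.198272.123.i123.heap.collapsed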
Step 3 - Visualize Flame Graph
We recommend analyzing the collapsed file with speedscope. Just drop the .collapsed file into it. Click Left Heavy in the top-left corner to merge shared calling stacks.
Alternative: Generate flame graph locally
Download and unarchive the FlameGraph utility.
Run
./flamegraph.pl --color=mem --countname=bytes heap_file.collapsed > flamegraph.svg
Example:
./flamegraph.pl --color=mem --countname=bytes jeprof.198272.4741.i4741.collapsed > flamegraph.svg
By the way, steps 2 and 3 can be combined into a single command with a pipe:
jeprof --collapsed target/release/risingwave compute-node.10404.2466.i2466.heap | ~/FlameGraph/flamegraph.pl --color=mem --countname=bytes > flamegraph.svg