
[Notes] Profiling Devito compilation

George Bisbas edited this page Mar 13, 2025 · 1 revision

This document is a guide to profiling hotspots in the Devito compilation pipeline. The TTI example is used as the reference workload.

To find the right hotspots, the C-land execution time of the operators should ideally be close to zero, so that op.apply() accounts for almost none of the profile. To get there, use only a few time steps and shrink the problem size as much as possible. To stress the compiler more, it is probably helpful to increase the space order. In the command below, -so 16 sets a high space order, -d 10 10 10 a small grid, and --tn 5 a short simulation time.
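One way to check how close op.apply() is to zero is to read its cumulative share from a saved cProfile dump. The sketch below uses Python's pstats module. The helper name cumulative_share is hypothetical, and the idea that the run step shows up as a function named apply in a file whose path ends in operator.py is an assumption about Devito's layout, not something stated in these notes.

```python
import pstats


def cumulative_share(prof_path, func_name, file_suffix=None):
    """Return cumulative time of `func_name` as a fraction of total profiled time.

    `file_suffix` optionally restricts matches to files whose path ends with it.
    If several functions match, their cumulative times are summed, which can
    double count when the matches call each other.
    """
    stats = pstats.Stats(prof_path)
    total = stats.total_tt  # sum of own time over all profiled functions
    if total <= 0:
        return 0.0

    cumulative = 0.0
    # stats.stats maps (file, line, name) ->
    # (primitive calls, total calls, own time, cumulative time, callers)
    for (filename, _lineno, name), (_cc, _nc, _tt, ct, _callers) in stats.stats.items():
        if name != func_name:
            continue
        if file_suffix is not None and not filename.endswith(file_suffix):
            continue
        cumulative += ct
    return cumulative / total
```

For example, cumulative_share("profile_results.prof", "apply", "operator.py") returning a value near 0 would suggest that the profile is dominated by compilation rather than by the generated C code.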

# cProfile ignores -s when -o is given, so the sort order is chosen later, when the .prof file is read.
DEVITO_LOGGING=DEBUG DEVITO_LANGUAGE=openmp python -m cProfile -o profile_results.prof examples/seismic/tti/tti_example.py -so 16 -d 10 10 10 --tn 5
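To get a quick text listing of the hotspots before drawing the graph, the saved profile can be read back with Python's pstats module and sorted by own time (tottime). The following is a minimal sketch; the helper name top_hotspots is hypothetical, and the file name is assumed to match the -o argument used above.

```python
import pstats


def top_hotspots(prof_path, limit=20):
    """Return up to `limit` rows (tottime, cumtime, calls, location), largest tottime first."""
    stats = pstats.Stats(prof_path)
    rows = []
    # stats.stats maps (file, line, name) ->
    # (primitive calls, total calls, own time, cumulative time, callers)
    for (filename, lineno, name), (_cc, nc, tt, ct, _callers) in stats.stats.items():
        rows.append((tt, ct, nc, f"{filename}:{lineno}({name})"))
    rows.sort(reverse=True)  # tuples compare by tottime first
    return rows[:limit]
```

Calling top_hotspots("profile_results.prof") returns the 20 most expensive functions by own time. For a ready-made table, pstats.Stats("profile_results.prof").sort_stats("tottime").print_stats(20) prints a similar listing directly.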

gprof2dot -f pstats profile_results.prof -o profile_results.dot

dot -Tpdf profile_results.dot -o profile_results.pdf

# -n (--node-thres): eliminates nodes (functions) whose share of the total time is below the given percentage (default: 0.5).
# This reduces the graph size by excluding less significant functions.

gprof2dot -f pstats -n 0.5 profile_results.prof -o profile_results.dot
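# Thresholds between 5% and 10% usually work well enough.
# Each call overwrites profile_results.dot, so re-run dot after the one you want to view.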
gprof2dot -f pstats -n 5 profile_results.prof -o profile_results.dot
gprof2dot -f pstats -n 7.5 profile_results.prof -o profile_results.dot
gprof2dot -f pstats -n 10 profile_results.prof -o profile_results.dot
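Since each of the commands above writes to the same .dot file, it can be convenient to render one PDF per threshold for side-by-side comparison. The sketch below builds the same gprof2dot and dot invocations from Python. The helper names build_commands and render_all and the output naming scheme are hypothetical, and both tools are assumed to be available on PATH.

```python
import subprocess


def build_commands(prof_path, thresholds, stem="profile_results"):
    """Return one (gprof2dot argv, dot argv) pair per node threshold."""
    pairs = []
    for threshold in thresholds:
        tag = str(threshold).replace(".", "_")  # e.g. 7.5 -> "7_5"
        dot_file = f"{stem}_n{tag}.dot"
        pdf_file = f"{stem}_n{tag}.pdf"
        gprof = ["gprof2dot", "-f", "pstats", "-n", str(threshold),
                 prof_path, "-o", dot_file]
        dot = ["dot", "-Tpdf", dot_file, "-o", pdf_file]
        pairs.append((gprof, dot))
    return pairs


def render_all(prof_path, thresholds):
    """Run gprof2dot and dot for every threshold, stopping on the first failure."""
    for gprof, dot in build_commands(prof_path, thresholds):
        subprocess.run(gprof, check=True)
        subprocess.run(dot, check=True)
```

Calling render_all("profile_results.prof", [0.5, 5, 7.5, 10]) would then produce one .dot and one .pdf file per threshold instead of overwriting a single pair.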

Useful links:

https://github.com/jrfonseca/gprof2dot/blob/main/README.md#usage
