Profiling/Benchmarking Tools
C++ Profilers
Intel VTune Profiler
- free of charge
- GUI & command line interface
- sampling-based profiling
- hardware event sampling for Intel chips
- locks & waits analysis
- memory access analysis
- storage analysis
- integrates with MS Visual Studio
Tracy
- integrates with apps on source-code level
- client-server architecture for remote profiling
- real-time profiling with nanosecond resolution
- memory allocations, locks, context switches
- can also profile OpenGL, Vulkan, Direct3D, and OpenCL workloads
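Source-level integration means marking the regions of interest with Tracy's macros. The following is a minimal sketch, assuming C++17 and that the Tracy header is reachable as `tracy/Tracy.hpp` (older releases ship it as `Tracy.hpp`). The functions `simulate_frame` and `run_frames` are hypothetical, and the fallback macros exist only so the snippet compiles without Tracy installed.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: Tracy's header is on the include path under this name.
#if __has_include(<tracy/Tracy.hpp>)
  #include <tracy/Tracy.hpp>
#else
  // Fallback so the sketch builds without Tracy: the macros expand to nothing.
  #define ZoneScoped
  #define ZoneScopedN(name)
  #define FrameMark
#endif

// Hypothetical workload: sums the squares of 0 .. n-1.
std::uint64_t simulate_frame(std::uint32_t n) {
    ZoneScoped;  // marks the whole function body as one zone
    std::uint64_t sum = 0;
    {
        ZoneScopedN("sum of squares");  // named zone for the inner block
        for (std::uint64_t i = 0; i < n; ++i) sum += i * i;
    }
    return sum;
}

// Hypothetical main loop: one FrameMark per iteration delimits frames.
std::uint64_t run_frames(int frames, std::uint32_t n) {
    assert(frames >= 0);
    std::uint64_t total = 0;
    for (int f = 0; f < frames; ++f) {
        total += simulate_frame(n);
        FrameMark;
    }
    return total;
}
```

With real Tracy, the zones are only recorded when the build defines `TRACY_ENABLE` and links the Tracy client; otherwise Tracy's own macros compile to nothing as well.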
Coz – Causal Profiler
- unique approach to profiling
- creates causal profile: "optimizing function X will have effect Y"
- profile is based on performance experiments
- program is partitioned into parts based on progress points that are set in the source code
- apart from these progress points, no additional instrumentation of source code is required
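A progress point is a single macro placed where one unit of useful work completes. The sketch below assumes C++17 and Coz's `coz.h` header with its `COZ_PROGRESS` macro. The functions `count_vowels` and `process_lines` are hypothetical, and the fallback definition only keeps the snippet compilable when Coz is not installed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: coz.h is on the include path when Coz is installed.
#if __has_include(<coz.h>)
  #include <coz.h>
#else
  #define COZ_PROGRESS  // fallback: expands to nothing
#endif

// Hypothetical unit of work: counts lowercase vowels in one line.
std::size_t count_vowels(const std::string& line) {
    const std::string vowels = "aeiou";
    std::size_t n = 0;
    for (char c : line) {
        if (vowels.find(c) != std::string::npos) ++n;
    }
    return n;
}

// Processes all lines and reports progress after each one.
std::size_t process_lines(const std::vector<std::string>& lines) {
    std::size_t total = 0;
    for (const auto& line : lines) {
        total += count_vowels(line);
        COZ_PROGRESS;  // progress point: one line has been fully processed
    }
    return total;
}
```

When profiling for real, the program is typically built with debug information (`-g`), linked with `-ldl`, and started as `coz run --- ./program`; Coz then measures how the rate of visits to the progress point changes during its experiments.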
NVIDIA Nsight Compute
- profiling for NVIDIA GPUs (architectures: Pascal, Volta, Turing)
- supplants NVIDIA Visual Profiler
AMD uProf
- sampling profiler for AMD CPUs and some AMD GPUs
- instruction-based sampling, micro-architecture analysis
- call stack sampling, timer-based profiling
- cache analysis, power profiling
Benchmarking Libraries / Frameworks
Google Benchmark
- license: Apache-2.0
gperftools (originally Google Performance Tools)
- license: BSD-3-Clause
- high-performance multithreaded malloc()
- heap checker
- heap profiler
- CPU profiler
Celero
- license: Apache-2.0
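To give an idea of what a micro-benchmark written against one of these libraries looks like, here is a minimal Google Benchmark sketch. The function `sum_first_n` and the macro `WITH_GOOGLE_BENCHMARK` are hypothetical: the macro only guards the library-dependent part so that the function under test can be compiled and checked without the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical function under test: sums 1 .. n via a temporary vector.
std::int64_t sum_first_n(std::int64_t n) {
    assert(n >= 0);
    std::vector<std::int64_t> v(static_cast<std::size_t>(n));
    std::iota(v.begin(), v.end(), std::int64_t{1});
    return std::accumulate(v.begin(), v.end(), std::int64_t{0});
}

#ifdef WITH_GOOGLE_BENCHMARK  // hypothetical macro, set by the build
#include <benchmark/benchmark.h>

static void BM_SumFirstN(benchmark::State& state) {
    // The library decides how many iterations of this loop to run.
    for (auto _ : state) {
        std::int64_t result = sum_first_n(state.range(0));
        benchmark::DoNotOptimize(result);  // keep the result observable
    }
}
// Run the benchmark for two input sizes.
BENCHMARK(BM_SumFirstN)->Arg(1 << 10)->Arg(1 << 16);

BENCHMARK_MAIN();  // generates main() for the benchmark executable
#endif
```

A build along the lines of `-DWITH_GOOGLE_BENCHMARK -lbenchmark -lpthread` would enable the benchmark part; the exact flags depend on how the library was installed.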
Timing Tools
Hyperfine
Benchmarking of executables similar to the classic 'time' command but much more sophisticated.
- statistical analysis across multiple runs
- support for arbitrary shell commands
- constant feedback about the benchmark progress and current estimates
- warmup runs can be executed before the actual benchmark
- cache-clearing commands can be set up before each timing run
- statistical outlier detection to detect interference from other programs and caching effects
- export results to various formats: CSV, JSON, Markdown, AsciiDoc
- parameterized benchmarks (e.g. vary the number of threads)
- cross-platform
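The idea behind warmup runs and statistics across multiple runs can be sketched in a few lines of C++17. This is not how Hyperfine is implemented: Hyperfine times whole commands, whereas the sketch times an in-process callable to stay self-contained, and it leaves out outlier detection. All names (`Summary`, `time_runs`, `summarize`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Summary statistics over a set of timing samples (in seconds).
struct Summary {
    double mean;
    double stddev;   // sample standard deviation (n - 1 in the denominator)
    double fastest;
    double slowest;
};

// Calls `f` `warmup` times untimed, then `runs` times timed.
// Returns the wall-clock duration of each timed run in seconds.
template <class F>
std::vector<double> time_runs(F&& f, std::size_t warmup, std::size_t runs) {
    for (std::size_t i = 0; i < warmup; ++i) f();  // fill caches, no timing
    std::vector<double> seconds;
    seconds.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        f();
        const auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}

// Computes mean, sample standard deviation, minimum, and maximum.
Summary summarize(const std::vector<double>& samples) {
    assert(!samples.empty());
    const double n = static_cast<double>(samples.size());
    double sum = 0.0;
    for (double s : samples) sum += s;
    const double mean = sum / n;
    double squared_dev = 0.0;
    for (double s : samples) squared_dev += (s - mean) * (s - mean);
    const double stddev =
        samples.size() > 1 ? std::sqrt(squared_dev / (n - 1.0)) : 0.0;
    const auto [lo, hi] = std::minmax_element(samples.begin(), samples.end());
    return {mean, stddev, *lo, *hi};
}
```

Reporting the spread alongside the mean is what makes it possible to notice interference from other programs, which a single `time` measurement cannot reveal.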