Performance Analysis¶
| Performance Analysis Tools | Version |
|---|---|
| GNU gprof | 2.43, 2.35 |
| Scalasca | 2.6 |
| mpiP | 3.5 |
| nvprof | 11.4.120, 11.6.124, 11.8.87, 12.3.101, 12.5.82, 12.8.90 |
GPROF¶
GNU profiler gprof
module load gnu
GNU gprof is a widely used profiling tool for Unix systems that produces an execution profile of C and Fortran programs. It can show the application call graph, which represents the calling relationships between functions in the program, and the percentage of total execution time spent in each function.
Compile and link your code with the -pg flag:
gcc [flags] -g [source_file] -o [output_file] -pg
Run the program once so that it writes the profile data file gmon.out to the working directory, then invoke gprof to analyse and display the profiling results:
gprof options [executable-file [profile-data-files...]] [> outfile]
Output Options
--flat-profile: prints the total amount of time spent in each function and the number of calls to it
--graph: prints the call-graph analysis from the application execution
--annotated-source: prints profiling information next to the original source code
For more information visit: GPROF Documentation
SCALASCA¶
Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks – in particular those concerning communication and synchronization – and offers guidance in exploring their causes.
module load scalasca
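A typical Scalasca 2.x summary measurement might look like the sketch below. It assumes that the application has been instrumented with Score-P's scorep compiler wrapper and that this wrapper is available in your environment; the job script name, the #SBATCH values, and mycode.x are placeholders chosen for this example.

```shell
# Write a minimal Slurm job script; the #SBATCH values are placeholders.
cat > scalasca_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=scalasca-summary
#SBATCH --ntasks=4
#SBATCH --time=00:10:00

module load scalasca

# Run the instrumented binary under Scalasca's measurement collection.
# This creates an experiment directory whose name starts with scorep_.
scalasca -analyze srun ./mycode.x

# Generate a text score report inside the experiment directory (no GUI needed).
scalasca -examine -s scorep_*_sum
EOF

# Build step, run before submitting (assumes the scorep wrapper is on PATH):
#   scorep mpif90 -g mycode.f -o mycode.x
# Submit with: sbatch scalasca_job.sh
```

Running scalasca -examine without -s on a login node with X forwarding opens the interactive report browser instead.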
For more information visit: Scalasca Documentation
mpiP¶
mpiP is a lightweight profiling library for MPI applications. Because it only collects statistical information about MPI functions, mpiP generates considerably less overhead and much less data than tracing tools. All the information captured by mpiP is task-local. It only uses communication during report generation, typically at the end of the experiment, to merge results from all of the tasks into one output file.
Compile and Link with the mpiP library¶
module load mpiP
mpif90 mycode.f -g -L$MPIPROOT/lib -lmpiP -lbfd -lunwind -o mycode.x
In your Slurm script, run the executable as usual:
srun mycode.x
After the job completes, check the report file mycode.x.NPROCS.PID.mpiP.
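Put together, a job script for an mpiP run could look like the sketch below. The script name mpip_job.sh and the #SBATCH values are placeholders for this example, and loading the mpiP module inside the job is an assumption that applies when the library is linked dynamically.

```shell
# Write a minimal Slurm job script; the #SBATCH values are placeholders.
cat > mpip_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=mpip-profile
#SBATCH --ntasks=4
#SBATCH --time=00:10:00

module load mpiP

# mycode.x was linked against libmpiP, so no extra launcher options are needed.
srun ./mycode.x

# List the report that mpiP writes at the end of the run.
ls -l mycode.x.*.mpiP
EOF

# Submit with: sbatch mpip_job.sh
```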
For more information visit: mpiP Documentation
nvprof¶
You can use nvprof to collect and view profiling data from the command line. To use it, load a CUDA version with: module load cuda.
For more information visit: http://docs.nvidia.com/cuda/profiler-users-guide/index.html#nvprof-overview
nvprof <GPU_EXECUTABLE>
nvprof --export-profile timeline.nvprof <GPU_EXECUTABLE>
To view the collected timeline data, import the timeline.nvprof file into nvvp as described in Import Single-Process nvprof Session: http://docs.nvidia.com/cuda/profiler-users-guide/index.html#import-session
MPI Profiling¶
The nvprof profiler can be used to profile individual MPI processes. For more information visit: http://docs.nvidia.com/cuda/profiler-users-guide/index.html#mpi-profiling
srun nvprof -o output.%h.%p.%q{SLURM_PROCID} <GPU_EXECUTABLE>