
Performance Analysis

Performance Analysis Tool   Version
GNU gprof                   2.43, 2.35
Scalasca                    2.6
mpiP                        3.5
nvprof                      11.4.120, 11.6.124, 11.8.87, 12.3.101, 12.5.82, 12.8.90

GPROF

GNU profiler gprof

module load gnu

GNU gprof is a widely used profiling tool for Unix systems which produces an execution profile of C and Fortran programs. It can show the application call graph, which represents the calling relationships between functions in the program, and the percentage of total execution time spent in each function.

Compile and link your code with the -pg flag:

gcc [flags] -g [source_file] -o [output_file] -pg

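Run the instrumented program to completion; on exit it writes the profile data to gmon.out in the working directory. Then invoke gprof to analyse and display the profiling results. If no profile data file is given, gprof reads gmon.out.

gprof [options] [executable-file [profile-data-files...]] [> outfile]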

Output Options

  • --flat-profile : prints the total time spent in each function and the number of calls to it
  • --graph : prints the call-graph analysis of the application execution
  • --annotated-source : prints profiling information next to the original source code
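
The compile, run and report steps above can be combined as in the following sketch. It is only an illustration: the function name profile_with_gprof and the file names myprog.c and myprog are hypothetical, and it assumes the gnu module is already loaded.

```shell
# Sketch only: profile_with_gprof, myprog.c and myprog are hypothetical names.
# profile_with_gprof SRC EXE
#   Builds SRC with gprof instrumentation, runs it once, and writes
#   the flat profile and call graph to EXE.gprof.txt.
profile_with_gprof() {
    src=$1
    exe=$2
    # -pg is needed when compiling and when linking; one gcc call does both here.
    gcc -g -pg "$src" -o "$exe" || return 1
    # A normal run to completion writes gmon.out into the current directory.
    "./$exe" || return 1
    # Combine the executable and gmon.out into a readable report.
    gprof --flat-profile --graph "./$exe" gmon.out > "$exe.gprof.txt"
}
```

A call such as profile_with_gprof myprog.c myprog would leave the report in myprog.gprof.txt. Writing the report to a file rather than the terminal is a design choice, since the call graph of a real application is usually long.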

For more information visit: GPROF Documentation

SCALASCA

Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks – in particular those concerning communication and synchronization – and offers guidance in exploring their causes.

module load scalasca

For more information visit: Scalasca Documentation

mpiP

mpiP is a lightweight profiling library for MPI applications. Because it only collects statistical information about MPI functions, mpiP generates considerably less overhead and much less data than tracing tools. All the information captured by mpiP is task-local. It only uses communication during report generation, typically at the end of the experiment, to merge results from all of the tasks into one output file.

module load mpiP
mpif90 mycode.f -g -L$MPIPROOT/lib -lmpiP -lbfd -lunwind -o mycode.x

In your Slurm script, simply run the executable:

srun mycode.x
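
A minimal job script for such a run might look like the sketch below. The file name mpip_job.sh, the job name and all resource values are placeholders chosen for illustration; most sites also require account and partition directives, which are left out here.

```shell
# Sketch only: mpip_job.sh and every resource value below are placeholders.
# Write a small Slurm job script to the current directory.
cat > mpip_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=mpip_test
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

# Load the module used at link time, in case mpiP was linked as a shared library.
module load mpiP

# No profiler wrapper is needed: mpiP is already linked into mycode.x.
srun ./mycode.x
EOF
```

The script would then be submitted with sbatch mpip_job.sh.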

After completion, check the report file mycode.x.NPROCS.PID.mpiP, where NPROCS is the number of MPI tasks and PID is a process ID.
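
Since the report name contains a process ID, repeated runs leave several reports in the same directory. The helper below is a small sketch for picking the most recent one; the function name latest_mpip_report is hypothetical, and it assumes the reports are in the current directory.

```shell
# Sketch only: latest_mpip_report is a hypothetical helper name.
# latest_mpip_report EXE
#   Prints the name of the most recently modified mpiP report for EXE
#   in the current directory (prints nothing if there is none).
latest_mpip_report() {
    # ls -t lists newest first; keep only the first entry.
    ls -t "$1".*.mpiP 2>/dev/null | head -n 1
}
```

For example, less "$(latest_mpip_report mycode.x)" opens the newest report for mycode.x.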

For more information visit: mpiP Documentation

nvprof

You can use nvprof to collect and view profiling data from the command line. To use it, load a CUDA version with: module load cuda.

For more information visit: http://docs.nvidia.com/cuda/profiler-users-guide/index.html#nvprof-overview

nvprof <GPU_EXECUTABLE>
nvprof --export-profile timeline.nvprof <GPU_EXECUTABLE>

To view the collected timeline data, import the timeline.nvprof file into nvvp as described in "Import Single-Process nvprof Session": http://docs.nvidia.com/cuda/profiler-users-guide/index.html#import-session

MPI Profiling

The nvprof profiler can be used to profile individual MPI processes. For more information visit: http://docs.nvidia.com/cuda/profiler-users-guide/index.html#mpi-profiling

srun nvprof -o output.%h.%p.%q{SLURM_PROCID} <GPU_EXECUTABLE>
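
In the output name above, nvprof replaces %h with the host name, %p with the process ID and %q{SLURM_PROCID} with the value of that environment variable, so each MPI task writes its own file named output.<host>.<pid>.<rank>. The sketch below lists those files ordered by rank after the run; the function name list_rank_profiles is hypothetical, and it assumes the files are in the current directory.

```shell
# Sketch only: list_rank_profiles is a hypothetical helper name.
# list_rank_profiles
#   Prints "RANK FILE" for every output.<host>.<pid>.<rank> file in the
#   current directory, sorted by rank.
list_rank_profiles() {
    for f in output.*.*.*; do
        [ -e "$f" ] || continue          # the glob matched nothing
        # ${f##*.} keeps the text after the last dot, i.e. the rank.
        printf '%s %s\n' "${f##*.}" "$f"
    done | sort -n
}
```

Taking the rank from the last dot-separated field keeps the helper working when the host name itself contains dots.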