perf - check the performance metrics of a program

If you want to investigate the performance metrics of a program under Linux, such as clock cycles or CPU utilization, perf is a great tool for the job.

Installation

$ sudo apt update && sudo apt install linux-tools-$(uname -r) linux-tools-generic

(ref: http://www.tecmint.com/perf-performance-monitoring-and-analysis-tool-for-linux/)

Simple Usage

$ ll test
-rwxr-xr-x 1 root root 8608 Mar 14 10:55 test*
$ perf stat ./test
haha                 <-------- this is the output of test*
 Performance counter stats for './test':

          0.433868      task-clock (msec)         #    0.564 CPUs utilized          
                 0      context-switches          #    0.000 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
                49      page-faults               #    0.113 M/sec                  
   <not supported>      cycles                   
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
   <not supported>      instructions             
   <not supported>      branches                 
   <not supported>      branch-misses            

       0.000769191 seconds time elapsed

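Some items, such as cycles and instructions, show <not supported>. This means the Linux kernel cannot provide those counters on this system, for example when running inside a virtual machine that does not expose the hardware performance counters.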
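To see at a glance which events are unavailable, you can filter the perf stat report for the <not supported> marker. The following is a minimal sketch; list_unsupported is a hypothetical helper name, not part of perf, and it assumes the report layout shown above, where the event name is the third whitespace-separated field on such lines.

```shell
# list_unsupported: hypothetical helper (not part of perf).
# Reads a `perf stat` report on stdin and prints the names of the
# events that are marked <not supported>.
list_unsupported() {
    # On such lines the fields are "<not", "supported>", then the event name.
    awk '/<not supported>/ { print $3 }'
}

# Demo on two lines copied from the report above.
list_unsupported <<'EOF'
                49      page-faults               #    0.113 M/sec
   <not supported>      cycles
   <not supported>      instructions
EOF
```

On a real run, perf stat writes its report to stderr, so something like `perf stat ./test 2>&1 >/dev/null | list_unsupported` should feed the report into the helper while discarding the program's own output.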

$ perf list

The command above lists all the events available on your system.

$ perf stat -e cycles-ct -e cycles-t ./test
 Performance counter stats for './test':

           745,964      cycles-ct                                                   
           747,306      cycles-t                                                    

       0.000888103 seconds time elapsed

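These two events can give you a rough idea of the CPU cycles used. As far as I know, cycles-t and cycles-ct are the transactional-memory variants of the cycles counter, so treat the numbers as a hint only.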

$ perf stat -e task-clock ./test
 Performance counter stats for './test':

          0.423036      task-clock (msec)         #    0.511 CPUs utilized          

       0.000828073 seconds time elapsed

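task-clock is the total CPU time (in msec) that this task spent executing, summed over all CPUs. 0.511 is the CPU utilization of the task, that is, the average number of CPUs it kept busy while it ran.
This number is calculated as task-clock / elapsed time (the elapsed time is shown in the last line). After converting both values to the same unit, 0.511 = 0.423036 msec / 0.828073 msec, where 0.828073 msec = 0.000828073 sec.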
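The arithmetic can be checked from the shell. This is only a sketch: cpus_utilized is a hypothetical helper name, and it assumes task-clock is given in msec and the elapsed time in seconds, as in the reports above.

```shell
# cpus_utilized TASK_CLOCK_MSEC ELAPSED_SEC
# Hypothetical helper (not part of perf): divides task-clock by the
# elapsed time after converting the elapsed time from sec to msec.
cpus_utilized() {
    awk -v t="$1" -v e="$2" 'BEGIN { printf "%.3f\n", t / (e * 1000) }'
}

# Values from the task-clock run above.
cpus_utilized 0.423036 0.000828073   # prints 0.511

# Values from the first perf stat run.
cpus_utilized 0.433868 0.000769191   # prints 0.564
```

Both results match the "CPUs utilized" column that perf printed for the respective runs.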

Reference