[Summary] The New Linux perf Tools

sunshout 2013. 3. 22. 22:48


[2010] The New Linux perf Tools.pdf


Title : The New Linux 'perf' Tools


Source code : <Linux Kernel source directory>/tools/perf/

Reference : https://perf.wiki.kernel.org/index.php/Tutorial


Motivation: 

Modern CPUs have hardware dedicated to counting events associated with performance.


Example: Intel CPUs have provided performance monitoring since the Pentium processor, exposed through MSRs (model-specific registers).


Functions:

- record

- report

- annotate

- diff

- probe

- test

- tracing


List 

List shows the events that can be monitored.


Command: perf list

List of pre-defined events (to be used in -e):
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  cache-references                                   [Hardware event]
  cache-misses                                       [Hardware event]
  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]

  cpu-clock                                          [Software event]
  task-clock                                         [Software event]
  page-faults OR faults                              [Software event]
  context-switches OR cs                             [Software event]
  cpu-migrations OR migrations                       [Software event]
  minor-faults                                       [Software event]
  major-faults                                       [Software event]
  alignment-faults                                   [Software event]
  emulation-faults                                   [Software event]

  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]

...
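
The names in this listing are what -e accepts, and an "OR" entry gives two aliases for the same event. A minimal Python sketch of splitting such lines into aliases and event type, assuming the two-column layout shown above (the function name parse_perf_list is hypothetical, and the layout may differ between perf versions):

```python
import re

# Matches lines such as "  cpu-cycles OR cycles      [Hardware event]".
_EVENT_LINE = re.compile(r"^\s*(?P<names>\S.*?)\s+\[(?P<kind>[^\]]+)\]\s*$")


def parse_perf_list(text):
    """Return a list of (aliases, kind) tuples from `perf list`-style text."""
    events = []
    for line in text.splitlines():
        match = _EVENT_LINE.match(line)
        if match is None:
            continue  # header line, blank line, or "..."
        aliases = [name.strip() for name in match.group("names").split(" OR ")]
        events.append((aliases, match.group("kind")))
    return events


sample = """\
List of pre-defined events (to be used in -e):
  cpu-cycles OR cycles                               [Hardware event]
  page-faults OR faults                              [Software event]
  L1-dcache-loads                                    [Hardware cache event]
...
"""

for aliases, kind in parse_perf_list(sample):
    print(kind, aliases)
```

Any alias from the first column can then be passed to -e, for example "cycles" instead of "cpu-cycles".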



Top

Top shows a live snapshot of system activity, sampled on a specific event (cycles by default).

Command: perf top



   PerfTop:    3750 irqs/sec  kernel:26.6%  exact:  0.0% [1000Hz cycles],  (all, 16 CPUs)
----------------------------------------------------------------------------------------

             samples  pcnt function                        DSO
             _______ _____ _______________________________ __________________

             1827.00 41.9% processQueuedPacket             libnprobe-4.9.4.so
              306.00  7.0% intel_idle                      [kernel.kallsyms]
              202.00  4.6% ixgbe_clean_rx_ring             [ixgbe]
               74.00  1.7% PyEval_EvalFrameEx              /usr/bin/python2.7

...

Stat
Stat counts the events that occur while a workload runs.
Command: perf stat <command>

Example: perf stat sleep 1
root@cnode01-m:/etc/init.d# perf stat sleep 1

 Performance counter stats for 'sleep 1':

          0.641173 task-clock                #    0.001 CPUs utilized
                 2 context-switches          #    0.003 M/sec
                 1 CPU-migrations            #    0.002 M/sec
               171 page-faults               #    0.267 M/sec
         1,771,360 cycles                    #    2.763 GHz
         1,397,665 stalled-cycles-frontend   #   78.90% frontend cycles idle
         1,054,003 stalled-cycles-backend    #   59.50% backend  cycles idle
           664,300 instructions              #    0.38  insns per cycle
                                             #    2.10  stalled cycles per insn
           124,352 branches                  #  193.945 M/sec
             6,867 branch-misses             #    5.52% of all branches

       1.042632385 seconds time elapsed
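
The values after "#" in this output are ratios of the raw counters on the left. The sketch below recomputes them in Python as a sanity check. It assumes that task-clock is reported in milliseconds and that "stalled cycles per insn" uses the larger of the two stall counters; both assumptions reproduce the printed figures but are not taken from the perf source. The key names and the function name derived_metrics are hypothetical.

```python
# Raw counts copied from the `perf stat sleep 1` output above.
# Assumption: task-clock is in milliseconds.
counts = {
    "task_clock_ms": 0.641173,
    "cycles": 1_771_360,
    "stalled_cycles_frontend": 1_397_665,
    "stalled_cycles_backend": 1_054_003,
    "instructions": 664_300,
    "branches": 124_352,
    "branch_misses": 6_867,
}


def derived_metrics(c):
    """Recompute the right-hand '#' column from the raw counters."""
    seconds = c["task_clock_ms"] / 1000.0
    # Assumption: the larger of the two stall counters is the numerator.
    stalled = max(c["stalled_cycles_frontend"], c["stalled_cycles_backend"])
    return {
        "ghz": c["cycles"] / seconds / 1e9,
        "insns_per_cycle": c["instructions"] / c["cycles"],
        "frontend_idle_pct": 100.0 * c["stalled_cycles_frontend"] / c["cycles"],
        "backend_idle_pct": 100.0 * c["stalled_cycles_backend"] / c["cycles"],
        "stalled_per_insn": stalled / c["instructions"],
        "branch_miss_pct": 100.0 * c["branch_misses"] / c["branches"],
        "branches_m_per_sec": c["branches"] / seconds / 1e6,
    }


for name, value in derived_metrics(counts).items():
    print(f"{name:20s} {value:10.3f}")
```

Note that the rates are computed against task-clock (CPU time actually used, about 0.64 ms), not against the 1.04 s of elapsed time, which is why a process that mostly sleeps still shows 2.763 GHz.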

Issues:
- Under full virtualization, perf stat may report wrong values (the instruction count is far too large).
  . The register value may be wrong because the CPU scheduler moves the vCPU between physical CPUs.
  . This needs to be re-checked with CPU pinning.

root@ubuntu-hvm:~# perf stat sleep 1

 Performance counter stats for 'sleep 1':

          1.354863 task-clock                #    0.001 CPUs utilized
                 1 context-switches          #    0.738 K/sec
                 0 CPU-migrations            #    0.000 K/sec
               149 page-faults               #    0.110 M/sec
                 0 cycles                    #    0.000 GHz
                 0 stalled-cycles-frontend   #    0.00% frontend cycles idle
                 0 stalled-cycles-backend    #    0.00% backend  cycles idle
     4,294,967,294 instructions              #    0.00  insns per cycle
                 0 branches                  #    0.000 K/sec
     <not counted> branch-misses

       1.002176528 seconds time elapsed
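
One detail about the instruction count in this output: 4,294,967,294 equals 2^32 - 2, so it looks more like a small negative number that wrapped around in a 32-bit field than a real count. This is only an observation about the number, not a confirmed diagnosis of what the hypervisor does. A minimal Python sketch of the arithmetic (the helper name as_signed_32 is hypothetical):

```python
def as_signed_32(value):
    """Reinterpret the low 32 bits of `value` as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        return value - (1 << 32)
    return value


# Instruction count reported inside the fully virtualized guest.
instructions = 4_294_967_294

print(instructions == 2**32 - 2)   # True
print(as_signed_32(instructions))  # -2
```

A counter value this close to a power of two, next to cycles and branches that are all 0, is a hint that the guest is not reading real PMU values.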

- Under paravirtualization, perf stat does not report statistics at all.

root@ubuntu11:/usr/src# perf stat sleep 1

  Error: cache-misses event is not supported.

  Fatal: Not all events could be opened.



o Perf with KVM

$ perf kvm [--host] [--guest] [--guestmount=<path> [--guestkallsyms=<path> --guestmodules=<path> | --guestvmlinux=<path>]] {top|record|report|diff|buildid-list}


- http://infoscience.epfl.ch/record/162329/files/VEE11_performance_profiling_of_virtual_machines.pdf

- http://www.linux-kvm.org/page/Perf_events


Package installation

(CentOS) $ yum install perf



record

To profile a specific program:

perf record program [program_options]


To monitor a process that is already running:

perf record -p pid sleep <recording time in seconds>
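
The two forms above can be scripted. The sketch below only builds the argument lists in Python and prints them; it does not run perf. The function names record_program and record_pid, the program ./my_app and the PID 1234 are hypothetical placeholders.

```python
import shlex


def record_program(program, *program_options):
    """Command line for profiling a program from start to exit."""
    return ["perf", "record", program, *program_options]


def record_pid(pid, seconds):
    """Command line for sampling an already running process for `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # `sleep` only bounds the recording time; the samples come from `pid`.
    return ["perf", "record", "-p", str(pid), "sleep", str(seconds)]


print(shlex.join(record_program("./my_app", "--input", "data.bin")))
print(shlex.join(record_pid(1234, 10)))
```

Either list can be passed to subprocess.run(...) on a machine where perf is installed; the samples are then read back with perf report.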



Build

With a custom kernel, perf has to be built from the kernel source and used from there.

cd $KERNEL_SRC/tools/perf

make


If perf does not show kernel symbol names, rename the vmlinux file found in that kernel's modules directory to vmlinux.OFF:

root@cnode14-m:/lib/modules/3.13.11+/build# mv vmlinux vmlinux.OFF

Reference:

http://events.linuxfoundation.org/sites/events/files/lcjp13_takata.pdf