Digger: the tool for tracing of unix processes
Contents:
1. Introduction
2. New Features
3. How to use
4. Oracle kernel layers and functions
5. Important notes
6. Examples
7. Post Scriptum
Introduction
I would like to introduce new tool – Digger.
This tool allows to see tree of process’ calls such as application calls, library, system calls (and even kernel functions and OS scheduler actions) with additional information as function arguments, result, cpu & elapsed time.
This is not something Oracle specific, the tool can be used for tracing of any unix process (DTrace is required).
Sourcecode: digger.sh
Examples:
CPU# timestamp relative elapsed cpu module call(args) = return ---- --------- -------- ------- ------ --------- --------------------------------- 4 10650 296149 . . oracle -> kdsgrp(0xFFFFFD7FFDC28CC8, 0x0, 0xFFFFFD7FFDC28CC8) 4 10662 296150 . . oracle -> kcbispnd(0xFFFFFD7FFDC28CE0, 0x0, 0xFFFFFD7FFDC28CC8) 4 10670 296151 8 0 oracle <- kcbispnd = 0x0 4 10686 296153 . . oracle -> ktrget2(0xFFFFFD7FFDC28CD0, 0xFFFFFD7FFDC28BF8, 0x360) 4 10698 296156 . . oracle -> ktsmg_max_query(0x0, 0x0, 0xFFFFFD7FFDC11B28) 4 10707 296157 8 0 oracle <- ktsmg_max_query = 0x0 4 10724 296160 . . oracle -> kcbgtcr(0xFFFFFD7FFDC28CE0, 0x0, 0x360) 4 10735 296162 . . oracle -> ktrexf(0xFFFFFD7FFFDFADD0, 0xE771F80, 0x0) 4 10743 296163 7 0 oracle <- ktrexf = 0x9 4 10758 296166 . . oracle -> kcbzgs(0x1, 0xE771F80, 0x1) 4 10768 296168 . . oracle -> kssadf_numa_intl(0x26, 0x91F2BB48, 0x8DCA3948) 4 10777 296169 8 1 oracle <- kssadf_numa_intl = 0x91A75F98 4 10790 296170 32 3 oracle <- kcbzgs = 0x91A75F98 4 10805 296171 . . oracle -> kcbz_fp_buf(0x80FF38A8, 0x91A76018, 0x1) 4 10813 296172 8 1 oracle <- kcbz_fp_buf = 0x1 4 10826 296173 102 13 oracle <- kcbgtcr = 0x80EE8014 4 10840 296176 . . oracle -> kcbcge(0xFFFFFD7FFDC28CE0, 0x0, 0x0) 4 10848 296176 7 0 oracle <- kcbcge = 0x0 4 10862 296178 . . oracle -> ktcckv(0xFFFFFD7FFDC28CE0, 0xFFFFFD7FFDC281F4, 0x0) 4 10870 296179 7 0 oracle <- ktcckv = 0x0 4 10881 296180 195 26 oracle <- ktrget2 = 0x80EE807C 4 10895 296181 244 32 oracle <- kdsgrp = 0x80EE9FAD
CPU# timestamp relative elapsed cpu module call(args) = return ---- --------- -------- ------- ------ --------- --------------------------------- 1 28814 204036 . . oracle -> ksfd_skgfqio(0xFFFFFD7FFFDF8B70, 0x4, 0x0) 1 28834 204043 . . oracle -> skgfqio(0xFFFFFD7FFDD79580, 0xFFFFFD7FFFDF8B70, 0x4) 1 28851 204049 . . oracle -> skgfrvldtrq(0xFFFFFD7FFFDF8B70, 0x4, 0xFFFFFD7FFDD79580) 1 28880 204051 29 1 oracle <- skgfrvldtrq = 0x0 1 28904 204054 . . syscall => pread(0x101, 0x7F4E2000, 0x2000) 1 28923 204064 19 9 syscall <= pread = 0x2000 1 28941 204066 107 23 oracle <- skgfqio = 0x0 1 28958 204068 143 31 oracle <- ksfd_skgfqio = 0x0
The Digger can be considered as improved and extended version of open source tool
dapptrace, the tool from DTraceToolkit, included in several operating systems, developed by Brendan Gregg.
I thought should it be new version of dapptrace or dedicated tool and decided that scope of the Digger is wider then dapptrace, thus it should be considered as dedicated branch with dedicated name. Perhaps some features of the Digger will be propagated into dapptrace. I am working on it.
New features of the Digger
1) The capability to trace calls only inside some exact function.
If the option is set then all tracing will be enabled only inside this function only.
Original version of dapptrace is not convinient to be used on Oracle processes.
Oracle process has hundreds of thousands of functions (~166K functions in 11.2.0.2) and if you are going to trace some the simplest “select * from dual” you will get huge amount of info where you most likely need only very small part.
Usually it is required to trace only some operation, for example it can be function performing some operation as Full Table Scan, or parse call, I/O calls, etc.
2) The capability to filter traced calls.
It means to show not all calls but only some functions. For example to trace only kcb* (Kernel Cache Buffer laers’s calls) calls.
This especially convinient in Oracle because in Oracle all function have prefixes denoting affiliation to kernel subsystem.
For example: k*: Kernel, kc*: Kernel Cache, kt*: Kernel Transaction layer, etc.
3) The capability to trace system calls
These features implemented as following options:
-f function(s) # to trace calls inside some function (patterns * and ? are allowed) -t function(s) # calls filter -s function(s) # trace syscalls with filter -S # trace all syscalls
Parameters “function(s)” can be
– exact function name
– pattern using symbols * or ?, example: ktr*
– list separated by commas, example: kcbget,kds*
– function can include module name, example: libc.so.1:gethrtime
If the module is not set then the module is determined by option -U (if set then all modules, otherwise only application calls are traced).
Valid usage:
digger -p PID -FS -f kdsgrp
means to trace all application calls and system calls inside function kdsgrp
digger -p PID -F -f qertbFetch -t kcb*,ktr* -s *read*,*aio
means to trace kcb* and ktr* functions and system calls *read* and *aio inside function qertbFetch
4) The capability to trace kernel calls (only for hardcore geeks).
Kernel is the SYS component of CPU utilization. This is a chance to look inside a system call.
5) The capability to trace scheduler actions
CPU# call(args) = return ---- --------------------------------- 0 -> kslfre(0x9191A5D0, 0x29, 0x0) 0 => yield(0x9191A5D0, 0x2000000000000029, 0xFFFFFFFD) 0 - sched:TS:ts_yield:change-pri [pr_clname=TS, pr_sname=O, pr_pri=59, new_pri=39] 0 - sched:unix:setbackdq:enqueue [pr_clname=TS, pr_sname=O, cpu_id=0] 0 - sched:unix:disp:dequeue [pr_clname=TS, pr_sname=R, cpu_id=0] 0 - sched:unix:swtch:remain-cpu 0 <= yield = 0x0 0 <- kslfre = 0x0
6) The capability to trace actions related with I/O
syscall => pread(0x101, 0x694BC000, 0x2000) genunix - io:genunix:bdev_strategy:start [d=sd3 f=/u02/oradata/orcl/example01.dbf b=8192] genunix - io:genunix:biowait:wait-start [d=sd3 f=/u02/oradata/orcl/example01.dbf b=8192] genunix - io:genunix:biowait:wait-done [d=sd3 f=/u02/oradata/orcl/example01.dbf b=8192] syscall <= pread = 0x2000
7) Mechanics of indents differs from dapptrace.
There are functions where return probes do not exist due to compiler optimizations.
See also:
Tail-call Optimization
http://dtrace.org/blogs/brendan/2011/02/14/dtrace-pid-provider-return
It leads to moving of function tree on right side. In the original version of dapptrace it looks like following piece
oracle -> kpofdr(0xFFFFFD7FFFDF2780, 0x64, 0x0) oracle -> ttcrbur(0xE77ADF0, 0xE768C18, 0xFFFFFD7FFFDF26C0) libc.so.1 -> memcpy(0xE9A2751, 0xFFFFFD7FFFDF26C0, 0x1) libc.so.1 -> memcpy(0xE9A2752, 0xFFFFFD7FFFDF26C2, 0x2) libc.so.1 -> memcpy(0xE9A2754, 0xFFFFFD7FFFDF26C4, 0x4) libc.so.1 -> memcpy(0xE9A2758, 0xFFFFFD7FFFDF26C8, 0x4) libc.so.1 -> memcpy(0xE9A275C, 0xFFFFFD7FFFDF26CC, 0x2) libc.so.1 -> memcpy(0xE9A275E, 0xFFFFFD7FFFDF21B4, 0x4) libc.so.1 -> memcpy(0xE9A2762, 0xFFFFFD7FFFDF21B4, 0x4) oracle <- ttcrbur = 0x0 oracle <- kpofdr = 0x0 oracle -> ...
In the Digger return point always has the same indent as entry point.
It does not completely prevent of the moving on right side, but makes an output more convinient for reading.
oracle -> kpofdr(0xFFFFFD7FFFDF2780, 0x64, 0x0) oracle -> ttcrbur(0xE77ADF0, 0xE768C18, 0xFFFFFD7FFFDF26C0) libc.so.1 -> memcpy(0xE9A2751, 0xFFFFFD7FFFDF26C0, 0x1) libc.so.1 -> memcpy(0xE9A2752, 0xFFFFFD7FFFDF26C2, 0x2) libc.so.1 -> memcpy(0xE9A2754, 0xFFFFFD7FFFDF26C4, 0x4) libc.so.1 -> memcpy(0xE9A2758, 0xFFFFFD7FFFDF26C8, 0x4) libc.so.1 -> memcpy(0xE9A275C, 0xFFFFFD7FFFDF26CC, 0x2) libc.so.1 -> memcpy(0xE9A275E, 0xFFFFFD7FFFDF21B4, 0x4) libc.so.1 -> memcpy(0xE9A2762, 0xFFFFFD7FFFDF21B4, 0x4) oracle <- ttcrbur = 0x0 oracle <- kpofdr = 0x0 oracle -> ...
USAGE: digger [-acdeFHIgKlmnoSUz] [-u lib] [-t functions] [-f functions] [-s functions] [-k functions] { -p PID | command } -p PID # examine this PID -a # print all details -c # print cpu ID -d # print relative timestamps (us) -e # print elapsed times (us) -F # print flow indentation -l # print pid/lwpid per line -m # print module name -o # print on cpu times (us) -g # print timestamps (us) -u lib # trace this library instead -U # trace all libraries + user funcs -b bufsize # dynamic variable buf size -S # trace all syscalls -s function(s) # trace syscalls with filter -f function(s) # to trace calls inside some function (patterns * and ? are allowed) -K # to trace kernel calls (fbt provider) -k function(s) # to trace kernel calls (fbt provider) using filter -z # print call stack -I # print fired probes of io provider -H # print fired probes of sched provider
If you use wide filter (for example if you do not set filter at all, option -t) too much time can be required to set all probes.
It is necessary to wait for the prompt “The Digger is ready…”. Time of waiting can be from a few seconds to several minutes.
Oracle kernel layers and functions
In order to use the tool more efficiently it is helpful to know more about Oracle kernel layers and function names
Useful info:
- Steve Adams: Oracle Kernel Layers (must read)
- There was a MOS note: 175982.1 ORA-600 Lookup Error Categories. It does not exist there now. The copy.
- ORA-600 INDEX FOR 8.0.5
- oradebug doc
Important notes
1) Obviously some important actions happen inside functions without dedicated function calls. Thus these actions are invisible.
As an example, which I would like to emphasize is the functions related with latches.
There are dedicated functions related with latches in KSL (Kernel Service layer Latching & Wait) layer as kslgetl (KSL Get Exclusive Latch), kslgetsl (KSL Get Shared Latch), etc.
But also latches can be captured inside functions like kcbgtcr (consistent get) or kcbgcur (current get) without dedicated calls of ksl* functions.
Looks like duplicated code. In fact historically there are macroses KSLBEGIN / KSLEND in many functions, so code is not duplicated but as a result we do not see actions with latches as dedicated calls.
Following query can be helpful
SQL> SELECT t1.ksllasnam "parent_name", 2 t2.ksllwnam "location" 3 FROM x$ksllw t2, x$kslwsc t1 4 WHERE t2.indx = t1.indx 5 and ksllwnam like 'kcbrls%'; parent_name location ---------------------- ----------------- cache buffers chains kcbrls: kslbegin
2) not enough memory for probes
If you will try to trace oracle processes without calls filter (option -f) or with wide filter, for example k* (to trace all kernel functions), you very likely will be faced with an error “failed to create probe … Not enough space”, solution is here
3) Use the tool only in testing environments. Some options (for example option -K – to trace kernel calls) can reduce response time of a system.
4) DTrace has lazy load
Some probes from shared libraries are not fired if patterns are used.
If you will do something like (pay attention the first and the third command are the same)
dtrace -n "pid<PID>:lib*:gethrtime:entry" dtrace -n "pid<PID>:libc.so.1:gethrtime:entry" dtrace -n "pid<PID>:lib*:gethrtime:entry"
then the first run you will get an error
dtrace: invalid probe specifier pid7144:lib*:gethrtime:entry: probe description pid7144:lib*:gethrtime:entry does not match any probes
but the second and the third run will be succesfully.
The reason is in lazy load.
See also: http://www.mail-archive.com/dtrace-discuss@opensolaris.org/msg03853.html
Examples
– Buffer is pinned count
– Timing: rowsource statistics. Part 1: How it works
– Timing: rowsource statistics. Part 2: Overhead and inconsistent time
Post Scriptum
Any ideas how to improve the tool or bugs are welcome.
Could you please provide source code of dagger tool. The link which you have provided in the post is not taking me to the site.