28 October, 2015

Trace Files -- 5.1 : Reading an SQL Trace

Here's a short introduction to reading an SQL Trace.

First I execute these in my sqlplus session :

SQL> connect hr/oracle
Connected.
SQL> exec DBMS_SESSION.SESSION_TRACE_ENABLE(waits=>TRUE,binds=>FALSE);

PL/SQL procedure successfully completed.

SQL> select count(*) from employees;

  COUNT(*)
----------
       107

SQL> exec DBMS_SESSION.SESSION_TRACE_DISABLE;

PL/SQL procedure successfully completed.

SQL> select value from v$diag_info where name = 'Default Trace File';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_3039.trc

SQL> 



Now, I extract selective portions of the trace file.

The header of the trace file gives me inormation about the platform and session/module :

Trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_3039.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/orcl
System name:    Linux
Node name:      ora11204
Release:        2.6.39-400.17.1.el6uek.x86_64
Version:        #1 SMP Fri Feb 22 18:16:18 PST 2013
Machine:        x86_64
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 27
Unix process pid: 3039, image: oracle@ora11204 (TNS V1-V3)


*** 2015-10-28 22:19:42.291
*** SESSION ID:(141.7) 2015-10-28 22:19:42.291
*** CLIENT ID:() 2015-10-28 22:19:42.291
*** SERVICE NAME:(SYS$USERS) 2015-10-28 22:19:42.291
*** MODULE NAME:(SQL*Plus) 2015-10-28 22:19:42.291
*** ACTION NAME:() 2015-10-28 22:19:42.291


The trace file tells me that I am running 64bit Oracle 11.2.0.4 on a speciic Linux (UEK) kernel on a host called "ora11204".  This is useful information if I have to provide the trace file to Oracle Support.  (True, it doesn't list all the Oracle patches that may have been applied).
It then identifies my Service Name and Module (and Action if I had set it with DBMS_APPLICATION_INFO).

Next, the trace file tells me when and how I have enabled tracing.

PARSING IN CURSOR #140174891232936 len=73 dep=0 uid=43 oct=47 lid=43 tim=1446041982290677 hv=3228613492 ad='99446cc8' sqlid='dc03x7r071fvn'
BEGIN DBMS_SESSION.SESSION_TRACE_ENABLE(waits=>TRUE,binds=>FALSE); END;
END OF STMT
EXEC #140174891232936:c=1000,e=31608,p=0,cr=0,cu=0,mis=1,r=1,dep=0,og=1,plh=0,tim=1446041982290670
WAIT #140174891232936: nam='SQL*Net message to client' ela= 4 driver id=1650815232 #bytes=1 p3=0 obj#=13246 tim=1446041982291265


It tells me that I have used DBMS_SESSION to enable tracing.

That cursor is then closed.

*** 2015-10-28 22:19:53.745
WAIT #140174891232936: nam='SQL*Net message from client' ela= 11453710 driver id=1650815232 #bytes=1 p3=0 obj#=13246 tim=1446041993745004
CLOSE #140174891232936:c=0,e=38,dep=0,type=0,tim=1446041993745175


We can identify the CURSOR Number (140174891232936-- this numbering format was introduced in 11g, if I am correct) that was closed.  This number is NOT the SQL_ID (which is dc03x7r071fvn -- yes even PLSQL blocks and procedures have SQL_IDs).  This number is NOT the SQL Hash Value either (the SQL Hash Value is present as hv=3228613492).

After that we start seeing the recursive (SYS) calls executed during the parsing of the actual user query (the query on EMPLOYEES).

PARSING IN CURSOR #140174891988920 len=202 dep=1 uid=0 oct=3 lid=0 tim=1446041993745737 hv=3819099649 ad='997c03e0' sqlid='3nkd3g3ju5ph1'
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
END OF STMT
EXEC #140174891988920:c=0,e=50,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2853959010,tim=1446041993745733
FETCH #140174891988920:c=999,e=88,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=4,plh=2853959010,tim=1446041993745957
CLOSE #140174891988920:c=0,e=11,dep=1,type=3,tim=1446041993746025
=====================
PARSING IN CURSOR #140174891955728 len=493 dep=1 uid=0 oct=3 lid=0 tim=1446041993746138 hv=2584065658 ad='997a6590' sqlid='1gu8t96d0bdmu'
select t.ts#,t.file#,t.block#,nvl(t.bobj#,0),nvl(t.tab#,0),t.intcols,nvl(t.clucols,0),t.audit$,t.flags,t.pctfree$,t.pctused$,t.initrans,t.maxtrans,t.rowcnt,t.blkcnt,t.empcnt,t.avgspc,t.chncnt,t.avgrln,t.analyzetime,t.samplesize,t.cols,t.property,nvl(t.degree,1),nvl(t.instances,1),t.avgspc_flb,t.flbcnt,t.kernelcols,nvl(t.trigflag, 0),nvl(t.spare1,0),nvl(t.spare2,0),t.spare4,t.spare6,ts.cachedblk,ts.cachehit,ts.logicalread from tab$ t, tab_stats$ ts where t.obj#= :1 and t.obj# = ts.obj# (+)
END OF STMT
EXEC #140174891955728:c=0,e=30,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=3526770254,tim=1446041993746134
FETCH #140174891955728:c=0,e=53,p=0,cr=4,cu=0,mis=0,r=1,dep=1,og=4,plh=3526770254,tim=1446041993746225
CLOSE #140174891955728:c=0,e=8,dep=1,type=3,tim=1446041993746254


I have shown only the first two recursive calls.  These are identified by "dep=1"  (the parent call by the user we shall see is at depth level 0) and also by uid=0, lid=0  (schema id and privilege userid being 0 for SYS).  len=493 is the length of the SQL statement.

The elapsed time in microseconds is the "e=" value while the CPU time is "c=" in microseconds.  The "p=" value is Physical Reads, the "cr=" value is Consistent Gets, "cu=" is Current Gets, "r=" is Number of Rows.  "og=" shows the Optimizer Goal (og=4 is "Choose").  "tim=" is an ever-increasing timestamp (i.e. to calculate elapsed time between two SQL calls you can deduct the first "tim" value from the second.

Following these two are a number of other recursive SQL calls that I shall skip over until I come to my user SQL.

PARSING IN CURSOR #140174891232936 len=30 dep=0 uid=43 oct=3 lid=43 tim=1446041993782992 hv=282764354 ad='99446f28' sqlid='7c1rnh08dp922'
select count(*) from employees
END OF STMT
PARSE #140174891232936:c=8998,e=37717,p=3,cr=165,cu=0,mis=1,r=0,dep=0,og=1,plh=3580537945,tim=1446041993782979
EXEC #140174891232936:c=0,e=69,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3580537945,tim=1446041993783188
WAIT #140174891232936: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=13246 tim=1446041993783286
WAIT #140174891232936: nam='Disk file operations I/O' ela= 76 FileOperation=2 fileno=4 filetype=2 obj#=16404 tim=1446041993790690
WAIT #140174891232936: nam='db file sequential read' ela= 6784 file#=4 block#=211 blocks=1 obj#=16404 tim=1446041993797581
FETCH #140174891232936:c=7000,e=14409,p=1,cr=1,cu=0,mis=0,r=1,dep=0,og=1,plh=3580537945,tim=1446041993797727
STAT #140174891232936 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=1 pr=1 pw=0 time=14407 us)'
STAT #140174891232936 id=2 cnt=107 pid=1 pos=1 obj=16404 op='INDEX FULL SCAN EMP_EMAIL_UK (cr=1 pr=1 pw=0 time=14577 us cost=1 size=0 card=107)'
WAIT #140174891232936: nam='SQL*Net message from client' ela= 486 driver id=1650815232 #bytes=1 p3=0 obj#=16404 tim=1446041993819989
FETCH #140174891232936:c=0,e=4,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=3580537945,tim=1446041993820032
WAIT #140174891232936: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=16404 tim=1446041993820057


The "dep=0" and "uid" and "lid" not being 0 indicate that this is a User SQL.  "oct" is the Oracle Command Type.
Here in addition to information on the PARSE call, the EXEC call, the individual WAITs and the FETCH call, we can also see Row Source Statistics indicated by the STAT lines.
The Parse was 37,717 microseconds and the FETCH time was 14,409 microseconds.  (This is a very quick running SQL but since it had never been parsed in ths instance, the Parse time exceeded the FETCH time),  The "mis=1" for the Parse indicates a Miss in the Library Cache -- so Oracle had to do a Hard Parse.
I would look at EXEC for INSERT/UPDATE/DELETE statements but here for a simple SELECT, I look at the FETCH time.
For the 'db file sequential read' wait of 6,784microseconds, we can also see the File Number ("file#"), the Block ID ("block#"), number of Blocks (1 for this wait event) and Object ID ("obj#")


The STAT lines have additional information about the position ("pos")  and parent id ("pid") in the Execution Plan.  The Object ID is indicated by "obj" and the operation by "op". STAT lines show the Consistent Gets ("cr"), Physical Reads ("pr") , the Time ("time") in microseconds, the cost ("cost") and expected cardinality ("card") for each step of the Execution Plan. Note that the expected cardinality is "card" but the actual count of rows is "cnt".

Next, the cursor is closed, and tracing disabled.

*** 2015-10-28 22:20:07.096
WAIT #140174891232936: nam='SQL*Net message from client' ela= 13276436 driver id=1650815232 #bytes=1 p3=0 obj#=16404 tim=1446042007096511
CLOSE #140174891232936:c=0,e=19,dep=0,type=0,tim=1446042007096668
=====================
PARSING IN CURSOR #140174891232936 len=48 dep=0 uid=43 oct=47 lid=43 tim=1446042007097875 hv=2592126054 ad='96fff810' sqlid='9wuhwhad81d36'
BEGIN DBMS_SESSION.SESSION_TRACE_DISABLE; END;
END OF STMT
PARSE #140174891232936:c=999,e=1129,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=1446042007097870
EXEC #140174891232936:c=0,e=313,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=0,tim=1446042007098285


Note how the CURSOR ID get's reused by the next SQL if the previous SQL cursor was closed.  Thus "140174891232936" was used by the PLSQL call  that enabled Tracing, then closed, then by the user SQL query on EMPLOYEES, then Closed, then by the  PLSQL call that disabled Tracing.
(The recursive SQLs in between had different CURSOR IDs).

(As an observation : Notice how the "obj#" doesn't get cleared even when the next wait is an "SQL*Net message to client" or "SQL*Net message from client")
.
.
.

1 comment:

Foued said...

Wonderfully explained , thanks Hemant for sharing this post.
Foued