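When an Oracle database hangs, taking a system dump or running hang analyze against it is an effective way to investigate and resolve the problem. If you can still connect to the database and run commands, oradebug is the simplest and quickest way to do this.
Sometimes, however, the database is hung and sqlplus cannot connect (in 10g you can try connecting with sqlplus -prelim). In that case you can use an operating-system debugger to dump the Oracle system state. My environment is Linux, so I will start with gdb.
① First, get the ID of the process to dump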
ps -ef | grep LOCAL
oracle 9015 1 0 12:28 ? 00:00:00 oracleretest (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 9110 8981 0 14:09 pts/4 00:00:00 grep LOCAL
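The PID lookup above can also be done without reading the ps output by eye. The following is only a sketch: the function name local_oracle_pids is my own, and it assumes the `ps -ef` column layout shown above, where the PID is the second field.

```shell
# Print the PIDs of local (bequeath) Oracle server processes.
# Reads `ps -ef` output on stdin, so it also works on saved output.
local_oracle_pids() {
    # Match the literal text "(LOCAL=YES)"; the "grep LOCAL" line in the
    # sample output does not contain it, so it is skipped.
    # Field 2 of `ps -ef` is the PID.
    awk '/\(LOCAL=YES\)/ { print $2 }'
}

# Usage: ps -ef | local_oracle_pids
```

If several local server processes exist, this prints one PID per line and you still have to pick the one you want to attach to.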
② Attach gdb and take the dump
gdb $ORACLE_HOME/bin/oracle 9015
GNU gdb Red Hat Linux (6.1post-1.20040607.62rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
Attaching to program: /u01/app/oracle/product/10.1.0/db_1/bin/oracle, process 9015
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libskgxp10.so...(no debugging symbols found)...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libskgxp10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libhasgen10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libhasgen10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libskgxn2.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libskgxn2.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libocr10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libocr10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libocrb10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libocrb10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libocrutl10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libocrutl10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libjox10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libjox10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libclsra10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libclsra10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libdbcfg10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libdbcfg10.so
Reading symbols from /u01/app/oracle/product/10.1.0/db_1/lib/libnnz10.so...done.
Loaded symbols for /u01/app/oracle/product/10.1.0/db_1/lib/libnnz10.so
Reading symbols from /usr/lib/libaio.so.1...done.
Loaded symbols for /usr/lib/libaio.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread -1219938624 (LWP 3765)]
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
0x006967a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) print ksudss(10)
[Switching to Thread -1219938624 (LWP 9015)]
$1 = 213658428
(gdb) detach
Detaching from program: /u01/app/oracle/product/10.1.0/db_1/bin/oracle, process 9015
(gdb) quit
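The same three gdb commands (print, detach, quit) can be put in a command file and run without an interactive session, which helps when you want the process stopped for as short a time as possible. This is a sketch, not something I have run against a hung instance: the function name write_gdb_dump_script and the /tmp path are my own choices, and it relies only on gdb's standard -batch and -x options.

```shell
# Write a gdb command file that repeats the interactive steps:
# call ksudss(10), detach from the process, then quit.
write_gdb_dump_script() {
    # $1: path of the command file to create (name chosen by the caller)
    cat > "$1" <<'EOF'
print ksudss(10)
detach
quit
EOF
}

# Usage (9015 is the PID from the example session):
#   write_gdb_dump_script /tmp/ksudss.gdb
#   gdb -batch -x /tmp/ksudss.gdb "$ORACLE_HOME/bin/oracle" 9015
```

The here-document is quoted ('EOF') so the shell does not expand anything inside the gdb commands.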
③ You can then find the trace file containing the dump.
ls -lrt | grep 9015
-rw-r----- 1 oracle oinstall 1461 Feb 23 12:37 retest_ora_9015.trc
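A slightly more targeted lookup than `ls -lrt | grep` is sketched below. It assumes the trace file follows the `<sid>_ora_<pid>.trc` naming seen in the listing above and that you already know the trace directory; the function name trace_files_for_pid is hypothetical.

```shell
# List trace files for one server process, oldest first.
trace_files_for_pid() {
    # $1: trace directory, $2: PID of the dumped process
    # Assumes names of the form <sid>_ora_<pid>.trc, as in the listing above.
    # Prints nothing if no file matches.
    ls -rt "$1"/*_ora_"$2".trc 2>/dev/null
}

# Usage: trace_files_for_pid . 9015
```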
At this point you can use the ass.awk tool to do a simple analysis of the trace file.
awk -f ass.awk retest_ora_9015.trc
This gives you basic wait information.
Use gdb on Linux; on AIX, use dbx.
# dbx -a 446910
Waiting to attach to process 446910 ...
Successfully attached to oracle.
Type 'help' for help.
reading symbolic information ...
stopped in iosl.select at 0x9000000000c94d8 ($t2)
0x9000000000c94d8 (select+0xfffffffffff06318) e8410028 ld r2,0x28(r1)
(dbx) print ksudss(10)
Segmentation fault in slrac at 0x100083aa0 ($t2)
0x100083aa0 (slrac+0xe4) 88030000 lbz r0,0x0(r3)
(dbx) detach
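On HP-UX you can use HP's wdb (see the HP WDB page for details and to download the latest version). On Solaris, dbx or gdb is available as well (each platform has several debuggers; others include adb, mdb, and so on).
Besides taking a systemstate dump with print ksudss(10), you can also take the following dumps:
print ksdhng(3,1,0) is equivalent to oradebug hanganalyze 3
print ksudps(10) is equivalent to oradebug dump processstate 10
print curdmp() is equivalent to oradebug call curdmp (that is, oradebug dump cursordump)
print ksdtrc(4) is equivalent to oradebug dump events 4 (the argument is the level: 1 = session, 2 = process, 4 = system)
print ksdsel(10046,12) is equivalent to setting event 10046 at level 12 for the attached process
print skdxipc() is equivalent to oradebug ipc
print skdxprst() is equivalent to oradebug procstat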
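To avoid mistyping these calls inside a debugger session, the list above can be kept as a small lookup function. This is only a convenience sketch: the function name debugger_call_for and the short keys (systemstate, hanganalyze, and so on) are my own labels, and the levels are fixed to the ones shown in the list.

```shell
# Map a dump name to the debugger expression from the list above.
debugger_call_for() {
    # $1: dump name (labels chosen for this sketch, not Oracle keywords)
    case "$1" in
        systemstate)  echo 'print ksudss(10)' ;;
        hanganalyze)  echo 'print ksdhng(3,1,0)' ;;
        processstate) echo 'print ksudps(10)' ;;
        cursordump)   echo 'print curdmp()' ;;
        events)       echo 'print ksdtrc(4)' ;;
        10046)        echo 'print ksdsel(10046,12)' ;;
        ipc)          echo 'print skdxipc()' ;;
        procstat)     echo 'print skdxprst()' ;;
        *)            echo "unknown dump type: $1" >&2; return 1 ;;
    esac
}

# Usage: debugger_call_for hanganalyze
```

The printed line can be typed at the (gdb) or (dbx) prompt, or written into a debugger command file.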
Of course, if oradebug is usable, you should use oradebug; it is far more convenient, and safer too.