技术开发 频道

ORA-01555错误浅析(3)

  实例分析

  实际上,你在了解了1555错误为什么会发生的前提,遇到了1555错误就不应该再手足无措了。但是,根据我个人的经验,大多数的1555错误的发生,其根本原因还是语句写得太烂,导致了大量的consistent gets和超长的执行时间,最后引发了1555错误。下面就是一个典型例子:

  错误的发生

  近来生产系统反馈,时常有作业被异常中止,导致应用程序被hung住。经过检查日志,是某个作业在运行时发生了1555错误,导致程序无法返回结果:

  *

  ERROR at line 1:

  ORA-01555: snapshot too old: rollback segment number 9 with name "RBS08" too small

  ORA-06512: at "SQLUSER.EXT_PKG", line 4917

  ORA-06512: at line 1

  相关程序记录下的日志:

  STAT-S.20060313185536.lg = 2 processed =20060313185536 end date

  Error = ORA-01555: snapshot too old: rollback segment number 9 with name "RBS08" too small

  BEGIN ext_pkg.main('extfiles','50','EAT'); END;

  错误分析解决

  这是一个典型的1555错误。检查引发该错误的PACKAGE,发现它只有一个入口函数main(及程序日志中记录的函数),但这个函数还调用了其他N个PACKAGE里面的函数。这是一个大作业,执行时,设置它使用了一个大的回滚段:RBS_BATCH1。

  先看看相关配置:rollback tablespace空间为8G,undo_retention为1800。

  看看回滚段的统计数据:

 SQL> Select rn.Name "Rollback Segment", rs.RSSize/1024 "Size (KB)", rs.Gets "Gets",

  
2 rs.waits "Waits", (rs.Waits/rs.Gets)*100 "% Waits",

  
3 rs.Shrinks "# Shrinks", rs.Extends "# Extends"

  
4 from sys.v_$RollName rn, sys.v_$RollStat rs

  
5 where rn.usn = rs.usn;

  
Rollback Segment Size (KB) Gets Waits % Waits # Shrinks # Extends

  
------------------------------ ---------- ---------- ---------- ---------- ---------- ----------

  SYSTEM
952 1189 0 0 0 0

  RBS_BATCH1
409592 681422 667 .097883544 0 0

  ... ...

  RBS10
204792 478502 10 .002089855 0 0

  RBS11
204792 477366 8 .001675863 0 0

  RBS12
204792 491070 6 .001221822 0 0

  RBS_BATCH2
409592 650088 644 .099063511 0 0

  RBS_BATCH3
409592 243849 3 .00123027 0 0

  
16 rows selected.

  注意到RBS_BATCHT1的wait%是0.098%,这个值应该是比较好的一个值。

  回过头再来看依法错误的语句:调用ext_pkg.main函数。在程序日志中已经记录下了输入参数,这就比较好办了:作一个trace,看看到底哪条语句的性能最差:

SQL> alter system set events ‘10046 trace name context forever, level 1’;

  System altered.

  SQL
> exec ext_pkg.main('extfiles','50','EAT');

  PL
/SQL procedure successfully completed.

  SQL
> alter system set events ‘10046 trace name context off’;

  System altered.

  用tkprof处理trace文件后检查trace文件:

  TKPROF: Release 9.2.0.5.0 - Production on Tue Mar 14 09:21:58 2006

  Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

  Trace file: prod_ora_29225.trc

  Sort options: default

  ********************************************************************************

  count = number of times OCI procedure was executed

  cpu = cpu time in seconds executing

  elapsed = elapsed time in seconds executing

  disk = number of physical reads of buffers from disk

  query = number of buffers gotten for consistent read

  current = number of buffers gotten in current mode (usually for update)

  rows = number of rows processed by the fetch or execute call

  ********************************************************************************

  alter session set events='10046 trace name context forever, level 1'

  ... ...

  BEGIN

  ext_pkg.main('extfiles','50','EAT');

  END;

  call count cpu elapsed disk query current rows

  ------- ------ -------- ---------- ---------- ---------- ---------- ----------

  Parse 1 0.01 0.04 4 39 0 0

  Execute 1 0.20 0.21 24560 37808 0 1

  Fetch 0 0.00 0.00 0 0 0 0

  ------- ------ -------- ---------- ---------- ---------- ---------- ----------

  total 2 0.21 0.26 24564 37847 0 1

  Misses in library cache during parse: 1

  Optimizer goal: CHOOSE

  Parsing user id: 30

  ********************************************************************************

  ... ...

  SELECT QID

  FROM

  CSQUE QUE, ASTP STP WHERE

  QUE.CDE IN (:B3, :B2) AND QUE.TID =

  STP.TID AND STP.ACDE = :B1 ORDER BY

  QUE.CDE, QUE.DT

  call count cpu elapsed disk query current rows

  ------- ------ -------- ---------- ---------- ---------- ---------- ----------

  Parse 1 0.00 0.00 0 0 0 0

  Execute 2 0.00 0.00 0 0 0 0

  Fetch 2682 0.41 0.54 7557 10968 0 2680

  ------- ------ -------- ---------- ---------- ---------- ---------- ----------

  total 2685 0.41 0.54 7557 10968 0 2680

  Misses in library cache during parse: 1

  Optimizer goal: CHOOSE

  Parsing user id: 31 (recursive depth: 1)

  ********************************************************************************

  ... ...

  7 user SQL statements in session.

  41 internal SQL statements in session.

  48 SQL statements in session.

  ********************************************************************************

  Trace file: prod_ora_29225.trc

  Trace file compatibility: 9.02.00

  Sort options: default

  1 session in tracefile.

  7 user SQL statements in trace file.

  41 internal SQL statements in trace file.

  48 SQL statements in trace file.

  20 unique SQL statements in trace file.

  8626 lines in trace file.

  从trace文件中,发现有一条语句性能相当差,通过对这条语句做SQL Trace,发现它的consistent gets达到80万!

  于是对该语句进行优化,调整了它的写法,并建立了缺少的索引(优化过程略)。最终将consistent gets数量降低到了5000。

  重新安排上线,经过一周的观察,1555错误没再发生。

  其实这个案例的解决是比较简单的,最终的处理就是将一条语句进行优化。

0
相关文章