技术开发 频道

Oracle IO问题解析

  3.2.2 db file scattered read

  这是另外一个常见的引起数据库IO性能问题的等待事件。它通常发生在Oracle将“多数据块”读取到Buffer Cache中的非连续(分散的 Scattered)区域。多数据块读就是我们上述所说的一次读取“DB_FILE_MULTIBLOCK_READ_COUNT”块数据块,前面提到,它通常发生在全表扫描(Full Table Scan)和快速全索引扫描(Fast Full Index Scan)时。当发现db file scattered read等待事件是系统引起IO性能的主要原因时,我们可以采取以下措施对系统进行优化。

  3.2.2.1 优化存在Full Table Scan和Fast Full Index Scan的SQL语句

  我们可以首先从statspack或者awr报告中的“SQL ordered by Reads”部分中找出存在Full Table Scan和Fast Full Index Scan的Top SQL。因为这些Top SQL往往是整个系统的瓶颈。

  从9i开始,我们还可以通过视图V$SQL_PLAN来查找系统中存在Full Table Scan和Fast Full Index Scan的SQL语句。查找Full Table Scan的语句:

  select sql_text from v$sqlarea t, v$sql_plan p

  where t.hash_value=p.hash_value and p.operation='TABLE ACCESS'

  and p.options='FULL';

  查找Fast Full Index Scan的语句

  select sql_text from v$sqlarea t, v$sql_plan p

  where t.hash_value=p.hash_value and p.operation='INDEX'

  and p.options='FULL SCAN';

  Full Table Scan通常是由于以下几个原因引起的:

  条件字段上没有索引;

  在这种情况下,如果表的数据量比较大,我们就需要在相应字段上建立起索引。

  CBO中,对象的统计数据不正确

  CBO中,如果对象的统计数据或者其柱状图(Histogram)信息不正确,会导致优化器计算出错误的查询计划,从而选择全表扫描。这种情况下,我们要做的就重新分析(Analyze)表、索引及字段。

  CBO中,SQL语句中引用到了无法估算统计数据的对象

  在PLSQL中,可以建立一些高级的数据类型,如“TABLE OF”、ARRAY等,通过TABLE、CAST函数可以在SQL语句中将这些对象当成表来处理。而这些对象的数据只存在于调用PLSQL的会话中,因此他们没有相应的统计数据,Oracle会为他们生产一些假的统计数据以完成查询计划代价估算。但是基于这些假的数据计算出的查询计划一般是错误的。我们可以考虑通过提示来强制SQL使用索引或者强制SQL采用RBO优化器。

  此外,如果SQL中引用到了临时表(Temporary Table)也会产生同样的问题。其原因和解决方法和上面相同。

  优化器认为索引扫描代价过高;

  在Oracle中存在一个参数optimizer_index_cost_adj,该参数的值代表一个百分数,如果对索引扫描的代价达到或超过全表扫描的代价的这个百分比值时,优化器就采用全表扫描。

  optimizer_index_cost_adj是一个全局性的参数,它的合理值是通过长期调整出来的。一般来说是一个介于1到100之间的数字。我们可以按照以下方法来选取optimizer_index_cost_adj的合理值。

  先由以下语句得出optimizer_index_cost_adj的一个初始值:

  SQL> select

  2 a.average_wait "Average Waits FTS"

  3 ,b.average_wait "Average Waits Index Read"

  4 ,a.total_waits /(a.total_waits + b.total_waits) "Percent of FTS"

  5 ,b.total_waits /(a.total_waits + b.total_waits) "Percent of Index Scans"

  6 ,(b.average_wait / a.average_wait)*100 "optimizer_index_cost_adj"

  7 from

  8 v$system_event a,

  9 v$system_event b

  10 where a.EVENT = 'db file sequential read'

  11 and b.EVENT = 'db file scattered read';

  Average Waits FTS Average Waits Index Read Percent of FTS Percent of Index Scans

  ----------------- ------------------------ -------------- ----------------------

  optimizer_index_cost_adj

  ------------------------

  1.25 1.06 .041867874 .958132126

  84.8

  这里,84.8是我们系统的初始值。在系统经过一段时间运行后,再次运行上面的语句,重新调整optimizer_index_cost_adj的值。经过多次如此反复的调整之后,最终上面语句得出值趋于稳定,这时这个值就是符合我们系统性能需求的最合理的值。

  当然这个数值也可以通过statspack的历史数据来调整,在9i中:

  select to_char(c.end_interval_time, 'MM/DD/YYYY') "Date",

  sum(a.time_waited_micro)/sum(a.total_waits)/10000 "Average Waits FTS",

  sum(b.time_waited_micro)/sum(b.total_waits)/10000 "Average Waits Index Read",

  (sum(a.total_waits) / sum(a.total_waits + b.total_waits)) * 100 "Percent of FTS",

  (sum(b.total_waits) / sum(a.total_waits + b.total_waits)) * 100 "Percent of Index Scans",

  (sum(b.time_waited_micro)/sum(b.total_waits)) /

  (sum(a.time_waited_micro)/sum(a.total_waits)) * 100 "optimizer_index_cost_adj"

  from dba_hist_system_event a, dba_hist_system_event b, dba_hist_snapshot c

  where a.event_name = 'db file scattered read'

  and b.event_name = 'db file sequential read'

  and a.snap_id = c.snap_id

  and b.snap_id = c.snap_id

  group by c.end_interval_time

  order by 1;

  10g中:

  select to_char(c.snap_time, 'MM/DD/YYYY') "Date",

  sum(a.time_waited_micro)/sum(a.total_waits)/10000 "Average Waits FTS",

  sum(b.time_waited_micro)/sum(b.total_waits)/10000 "Average Waits Index Read",

  (sum(a.total_waits) / sum(a.total_waits + b.total_waits)) * 100 "Percent of FTS",

  (sum(b.total_waits) / sum(a.total_waits + b.total_waits)) * 100 "Percent of Index Scans",

  (sum(b.time_waited_micro)/sum(b.total_waits)) /

  (sum(a.time_waited_micro)/sum(a.total_waits)) * 100 "optimizer_index_cost_adj"

  from stats$system_event a, stats$system_event b, stats$snapshot c

  where a.event = 'db file scattered read'

  and b.event = 'db file sequential read'

  and a.snap_id = c.snap_id

  and b.snap_id = c.snap_id

  group by c.snap_time

  order by 1;

  当optimizer_index_cost_adj的值对于整个系统来说已经是比较合理的值,而某些语句由于该值选择了全表扫描扫描导致了IO性能问题时,我们可以考虑通过提示来强制语句命中索引。

  建立在条件字段上的索引的选择性不高,结合上一条导致全表扫描;

  当索引的选择性不高,且其代价过高,系统则会选择全表扫描来读取数据。这时我们可以考虑通过选择/建立选择性比较高的索引,使查询命中索引从而避免全表扫描。

  SQL> create index t_test1_idx1 on t_test1(owner) compute statistics;

  Index created.

  SQL> set autot trace

  SQL> select object_name

  2 from t_test1

  3 where owner = 'SYS'

  4 and created > sysdate - 30;

  no rows selected

  Execution Plan

  ----------------------------------------------------------

  Plan hash value: 1883417357

  -----------------------------------------------------------------------------

  | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

  -----------------------------------------------------------------------------

  | 0 | SELECT STATEMENT | | 49 | 1715 | 152 (2)| 00:00:02 |

  |* 1 | TABLE ACCESS FULL| T_TEST1 | 49 | 1715 | 152 (2)| 00:00:02 |

  -----------------------------------------------------------------------------

  Predicate Information (identified by operation id):

  ---------------------------------------------------

  1 - filter("OWNER"='SYS' AND "CREATED">SYSDATE@!-30)

  ... ...

  SQL> create index t_test1_idx2 on t_test1(owner, created) compute statistics;

  Index created.

  SQL> select object_name

  2 from t_test1

  3 where owner = 'SYS'

  4 and created > sysdate - 30;

  no rows selected

  Execution Plan

  ----------------------------------------------------------

  Plan hash value: 3417015015

  --------------------------------------------------------------------------------

  | Id | Operation | Name | Rows | Bytes | Cost (%CPU)

  | Time |

  --------------------------------------------------------------------------------

  | 0 | SELECT STATEMENT | | 49 | 1715 | 2 (0)

  | 00:00:01 |

  | 1 | TABLE ACCESS BY INDEX ROWID| T_TEST1 | 49 | 1715 | 2 (0)

  | 00:00:01 |

  |* 2 | INDEX RANGE SCAN | T_TEST1_IDX2 | 49 | | 1 (0)

  | 00:00:01 |

  --------------------------------------------------------------------------------

  Predicate Information (identified by operation id):

  ---------------------------------------------------

  2 - access("OWNER"='SYS' AND "CREATED">SYSDATE@!-30)

  ... ...

  3.2.2.2 调整DB_FILE_MULTIBLOCK_READ_COUNT

  当SQL已经没有优化余地后,问题仍没有解决,我们可以考虑调整DB_FILE_MULTIBLOCK_READ_COUNT大小。其作用我们在3.1.2中有做叙述,这里不再赘述。不过要注意一点就是,DB_FILE_MULTIBLOCK_READ_COUNT * DB_BLOCK_SIZE是一次IO读取的传输量,它不能大于系统的max_io_size大小。

  从Oracle 10gR2开始,如果没有设置DB_FILE_MULTIBLOCK_READ_COUNT的大小,Oracle会自动为其调整一个默认值,这个默认值的大小与平台最大IO大小(max_io_size)相关(对大多数平台来说max_io_size是1M),其大小被设置为(max_io_size / DB_BLOCK_SIZE)。

  3.2.2.3 将频繁访问的全扫描的表CACHE住

  由于通过Full Table Scan和Fast Full Index Scan读取的数据块会被放置到Buffer Cache的LRU链表的LRU端,从而使数据块尽快从Buffer Cache中移出。因此,对于那些会被频繁访问到全扫描的表,且其数据量不大的情况下,我们可以考虑将它们CACHE住。

  SQL> alter table t_test1 cache;

  Table altered.

  对于Fast Full Index Scan的索引对象,则可以考虑把它放置在KEEP池中。

  SQL> alter index t_test1_idx1 storage(buffer_pool keep);

  Index altered.

  利用V$SESSION_EVENT视图,我们同样可以找到当前系统中发生全扫描的对象。

  SQL> select p1 "fileid", p2 "block_id", p3 "block_num"

  2 from v$session_wait

  3 where event = 'db file scattered read';

  fileid block_id block_num

  ---------- ---------- ----------

  359 152972 16

  SQL> select

  2 segment_name "Segment Name",

  3 segment_type "Segment Type",

  4 block_id "First Block of Segment",

  5 block_id+blocks "Last Block of Segment"

  6 from dba_extents

  7 where &fileid = file_id

  8 and &block_id >= block_id

  9 and &block_id <= block_id+blocks;

  Enter value for fileid: 359

  old 7: where &fileid = file_id

  new 7: where 359 = file_id

  Enter value for block_id: 152972

  old 8: and &block_id >= block_id

  new 8: and 152972 >= block_id

  Enter value for block_id: 152972

  old 9: and &block_id <= block_id+blocks

  new 9: and 152972 <= block_id+blocks

  Segment Name

  --------------------------------------------------------------------------------

  Segment Type First Block of Segment Last Block of Segment

  ------------------ ---------------------- ---------------------

  CSS_TP_SHMT_QUEUE

  TABLE 152969 153001

  3.2.2.4 利用分区表减少全扫描操作读取的数据块数量

  前面我们有介绍分区裁剪(Partition Pruning)技术。将表分区,利用分区裁剪技术,在进行全扫描时只会扫描在WHERE条件中出现的分区,从而可以减少全扫描所读取到的数据块数量。

  3.2.2.5 Housekeep历史数据

  同样,housekeep不需要的、历史的数据,减少数据段中的数据块数量,也能减少全扫描的IO请求次数。

0
相关文章