1.2 读
1.2.1 物理读
产生物理读主要有以下几种情况:
第一次读取
当数据块第一次被读取到,Oracle会先将其从磁盘上读入Buffer Cache中,并将他们放在LRU(Last Recently Used)链表的MRU(Most Recently Used)端。再次访问数据块时就可以直接从Buffer Cache中读取、修改了。看以下例子:
SQL> select owner, index_name from t_test3; 2856 rows selected. Execution Plan----------------------------------------------------------Plan hash value: 2878488296 -----------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |-----------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 2856 | 68544 | 22 (0)| 00:00:01 || 1 | TABLE ACCESS FULL| T_TEST3 | 2856 | 68544 | 22 (0)| 00:00:01 |----------------------------------------------------------------------------- Statistics---------------------------------------------------------- 407 recursive calls 32 db block gets 344 consistent gets 89 physical reads 0 redo size 103888 bytes sent via SQL*Net to client 2475 bytes received via SQL*Net from client 192 SQL*Net roundtrips to/from client 9 sorts (memory) 0 sorts (disk) 2856 rows processed SQL> select owner, index_name from t_test3; 2856 rows selected. Elapsed: 00:00:00.03 Execution Plan----------------------------------------------------------Plan hash value: 2878488296 -----------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |-----------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 2856 | 68544 | 22 (0)| 00:00:01 || 1 | TABLE ACCESS FULL| T_TEST3 | 2856 | 68544 | 22 (0)| 00:00:01 |----------------------------------------------------------------------------- Statistics---------------------------------------------------------- 0 recursive calls 0 db block gets 276 consistent gets 0 physical reads 0 redo size 103888 bytes sent via SQL*Net to client 2475 bytes received via SQL*Net from client 192 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2856 rows processed
- 数据块被重新读入Buffer Cache
如果有新的数据需要被读入Buffer Cache中,而Buffer Cache又没有足够的空闲空间,Oracle就根据LRU算法将LRU链表中LRU端的数据置换出去。当这些数据被再次访问到时,需要重新从磁盘读入。SQL> select owner, table_name from t_test2
2 where owner = 'SYS'; 718 rows selected. Execution Plan----------------------------------------------------------Plan hash value: 1900296288--------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |--------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 99 | 2178 | 10 (0)| 00:00:01 || 1 | TABLE ACCESS BY INDEX ROWID| T_TEST2 | 99 | 2178 | 10 (0)| 00:00:01 ||* 2 | INDEX RANGE SCAN | T_TEST2_IDX1 | 99 | | 1 (0)| 00:00:01 |-------------------------------------------------------------------------------- Predicate Information (identified by operation id):--------------------------------------------------- 2 - access("OWNER"='SYS') Statistics---------------------------------------------------------- 0 recursive calls 0 db block gets 145 consistent gets 0 physical reads 0 redo size 21690 bytes sent via SQL*Net to client 902 bytes received via SQL*Net from client 49 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 718 rows processed SQL> select * from t_test1; --占用Buffer Cache 47582 rows selected. Execution Plan----------------------------------------------------------Plan hash value: 1883417357 -----------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |-----------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 47582 | 3996K| 151 (2)| 00:00:02 || 1 | TABLE ACCESS FULL| T_TEST1 | 47582 | 3996K| 151 (2)| 00:00:02 |----------------------------------------------------------------------------- Statistics---------------------------------------------------------- 195 recursive calls 0 db block gets 3835 consistent gets 5 physical reads 0 redo size 5102247 bytes sent via SQL*Net to client 35277 bytes received via SQL*Net from client 3174 SQL*Net roundtrips to/from client 5 sorts (memory) 0 sorts (disk) 47582 rows processed SQL> select owner, table_name from t_test2 2 where owner = 'SYS'; 718 rows selected. Execution Plan----------------------------------------------------------Plan hash value: 1900296288 --------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |--------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 99 | 2178 | 10 (0)| 00:00:01 || 1 | TABLE ACCESS BY INDEX ROWID| T_TEST2 | 99 | 2178 | 10 (0)| 00:00:01 ||* 2 | INDEX RANGE SCAN | T_TEST2_IDX1 | 99 | | 1 (0)| 00:00:01 |-------------------------------------------------------------------------------- Predicate Information (identified by operation id):--------------------------------------------------- 2 - access("OWNER"='SYS') Statistics---------------------------------------------------------- 0 recursive calls 0 db block gets 145 consistent gets 54 physical reads 0 redo size 21690 bytes sent via SQL*Net to client 902 bytes received via SQL*Net from client 49 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 718 rows processed全表扫描
当发生全表扫描(Full Table Scan)时,用户进程读取表的数据块,并将他们放在LRU链表的LRU端(和上面不同,不是放在MRU端)。这样做的目的是为了使全表扫描的数据尽快被移出。因为全表扫描一般发生的频率较低,并且全表扫描的数据块大部分在以后都不会被经常使用到。
而如果你希望全表扫描的数据能被cache住,使之在扫描时放在MRU端,可以通过在创建或修改表(或簇)时,指定CACHE参数。
1.2.2 逻辑读
逻辑读指的就是从(或者视图从)Buffer Cache中读取数据块。按照访问数据块的模式不同,可以分为即时读(Current Read)和一致性读(Consistent Read)。注意:逻辑IO只有逻辑读,没有逻辑写。
即时读
即时读即读取数据块当前的最新数据。任何时候在Buffer Cache中都只有一份当前数据块。即时读通常发生在对数据进行修改、删除操作时。这时,进程会给数据加上行级锁,并且标识数据为“脏”数据。
SQL> select * from t_test1 where owner='SYS' for update; 22858 rows selected. Execution Plan----------------------------------------------------------Plan hash value: 3323170753 ------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 22858 | 1919K| 151 (2)| 00:00:02 || 1 | FOR UPDATE | | | | | ||* 2 | TABLE ACCESS FULL| T_TEST1 | 22858 | 1919K| 151 (2)| 00:00:02 |------------------------------------------------------------------------------ Predicate Information (identified by operation id):--------------------------------------------------- 2 - filter("OWNER"='SYS') Statistics---------------------------------------------------------- 44 recursive calls 23386 db block gets 2833 consistent gets 0 physical reads 5044956 redo size 2029221 bytes sent via SQL*Net to client 17138 bytes received via SQL*Net from client 1525 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 22858 rows processed一致性读
Oracle是一个多用户系统。当一个会话开始读取数据还未结束读取之前,可能会有其他会话修改它将要读取的数据。如果会话读取到修改后的数据,就会造成数据的不一致。一致性读就是为了保证数据的一致性。在Buffer Cache中的数据块上都会有最后一次修改数据块时的SCN。如果一个事务需要修改数据块中数据,会先在回滚段中保存一份修改前数据和SCN的数据块,然后再更新Buffer Cache中的数据块的数据及其SCN,并标识其为“脏”数据。当其他进程读取数据块时,会先比较数据块上的SCN和自己的SCN。如果数据块上的SCN小于等于进程本身的SCN,则直接读取数据块上的数据;如果数据块上的SCN大于进程本身的SCN,则会从回滚段中找出修改前的数据块读取数据。通常,普通查询都是一致性读。
下面这个例子帮助大家理解一下一致性读:
会话1中:
SQL> select object_name from t_test1 where object_id = 66; OBJECT_NAME------------------------------I_SUPEROBJ1 SQL> update t_test1 set object_name = 'TEST' where object_id = 66; 1 row updated.
会话2中:
SQL> select object_name from t_test1 where object_id = 66; OBJECT_NAME------------------------------I_SUPEROBJ1 Execution Plan----------------------------------------------------------Plan hash value: 1883417357 -----------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |-----------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | 27 | 151 (2)| 00:00:02 ||* 1 | TABLE ACCESS FULL| T_TEST1 | 1 | 27 | 151 (2)| 00:00:02 |----------------------------------------------------------------------------- Predicate Information (identified by operation id):--------------------------------------------------- 1 - filter("OBJECT_ID"=66) Statistics---------------------------------------------------------- 0 recursive calls 0 db block gets 661 consistent gets 0 physical reads 108 redo size 423 bytes sent via SQL*Net to client 385 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed1.2.3 查找数据
在一个查询操作中,大量的读操作都产生于数据的查找过程中。减少查找过程是我们优化IO性能问题的重要目标。
下面介绍几种主要的数据查找方式。
Full Table Scan
当查询条件无法命中任何索引、或者扫描索引的代价大于全表扫描代价的某一比例时(由参数optimizer_index_cost_adj设定),Oracle会采用全表扫描的方式查找数据。当发生全表扫描时,Oracle会自下向上一次读取一定数量(由参数db_file_multiblock_read_count设定)的数据块,一直读取到高水位标志(HWM,High Water Mark)下。Full Table Scan会引起db file scattered read事件。
INDEX UNIQUE SCAN
全表扫描查找数据的效率是非常低的。而索引能大幅提高查找效率。普通索引的数据结构是B-Tree,树的叶子节点中包含数据的ROWID,指向数据记录,同时还有指针指向前一个/后一个叶子节点。索引扫描每次读取一个数据块,索引扫描是“连续的”(Sequential)。当索引为UNIQUE索引时,每个叶子节点只会指向一条数据。如果Oracle能预知扫描结果只有0或1条记录时,会采用INDEX UNIQUE SCAN。当对Unique Index中的所有字段进行完全匹配时,会发生INDEX UNIQUE SCAN。
SQL> select object_name from t_test1 2 where object_id = 66; Execution Plan----------------------------------------------------------Plan hash value: 2634232531 ---------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|Time |---------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | 27 | 1 (0)|00:00:01 || 1 | TABLE ACCESS BY INDEX ROWID| T_TEST1 | 1 | 27 | 1 (0)|00:00:01 ||* 2 | INDEX UNIQUE SCAN | T_TEST1_PK | 1 | | 1 (0)|00:00:01 |---------------------------------------------------------------------------------INDEX UNIQUE SCAN的查找过程如下:
- 从数的根节点数据块开始查找;
- 查找根节点块中所有key值中大于或等于要查找的值的最小key值;
- 如果key值大于查找值,则继续查找这个key值之前一个key值所指向的子节点数据块;
- 如果key值等于查找值,则继续查找这个key值所指向的子节点数据块;
- 如果没有key值大于或等于查找值,则继续查找最大key值所指向的子节点数据块;
- 如果继续查找的节点数据块是数一个分支节点,则重复2~4步;
- 如果查找的节点是叶子节点数据块,则在数据块中查找等于查找值的key值;
- 如果找到相等的key值,则返回数据和ROWID;
- 如果没找到相等的key值,则说明没有符合条件的数据,返回NULL。
INDEX RANGE SCAN
如果通过索引查找数据时,Oracle认为会返回数据可能会大于1,会进行INDEX RANGE SCAN,例如Unique Index中字段不完全匹配查找时、非Unique Index查找时。
SQL> select object_name from t_test1 2 where object_id < 66; 64 rows selected. Execution Plan----------------------------------------------------------Plan hash value: 1635545337 ---------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|Time |---------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 57 | 1539 | 2 (0)|00:00:01 || 1 | TABLE ACCESS BY INDEX ROWID| T_TEST1 | 57 | 1539 | 2 (0)|00:00:01 ||* 2 | INDEX RANGE SCAN | T_TEST1_PK | 57 | | 1 (0)|00:00:01 |---------------------------------------------------------------------------------INDEX RANGE SCAN分为闭包(有前后查找边界)和非闭包(只有一边或者没有边界)。返回数据会依据索引增序排序,多个相同值则会按照ROWID的增序排序。以下的查找条件都是闭包的:
WHERE column = 'Value'WHERE column like 'value%'WHERE column between 'value1' and 'value2'WHERE column in ('value1', 'value2')以下查找条件非闭包:
WHERE column < 'value1'WHERE column > 'value2'闭包条件下的INDEX RANGE SCAN的查找过程如下:
- 从数的根节点数据块开始查找;
- 查找根节点块中所有key值中大于或等于要查找的起始值的最小key值;
- 如果key值大于起始值,则继续查找这个key值之前一个key值所指向的子节点数据块;
- 如果key值等于起始值,则继续查找这个key值所指向的子节点数据块;
- 如果没有key值大于或等于起始值,则继续查找最大key值所指向的子节点数据块;
- 如果继续查找的节点数据块是数一个分支节点,则重复2~4步;
- 如果查找的节点是叶子节点数据块,则在数据块中大于或等于要查找的起始值的最小key值;
- 如果Key值小于或等于结束值,则:如果所有Key字段都符合WHERE字句中的查找条件,则返回数据和ROWID;否则继续查找当前叶子节点所指向的右边的叶子节点。
INDEX UNIQUE SCAN和INDEX RANGE SCAN都会引起db file sequential read事件。
TABLE ACCESS BY INDEX ROWID
当发生索引扫描时,如果需要返回的字段都在索引上,则直接返回索引上的数据,而如果还需要返回非索引上的字段的值,Oracle则需要根据从索引上查找的ROWID到对应的数据块上取回数据,这时就是TABLE ACCESS BY INDEX ROWID。
INDEX FAST FULL SCAN & INDEX FULL SCAN
索引快速全扫描和全表扫描类似,一次读取db_file_multiblock_read_count个数据块来描所有索引的叶子节点。INDEX FAST FULL SCAN和其他索引扫描不同,它不会从树的根节点开始读取,而是直接扫描所有叶子节点;也不会一次读取一个数据块,而是一次读取db_file_multiblock_read_count个数据块。INDEX FAST FULL SCAN会引起db file scattered read事件。
SQL> select count(1) from t_test1 where object_id < 21314; Execution Plan----------------------------------------------------------Plan hash value: 1586700957 ---------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes| Cost (%CPU)| Time |---------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1| 4| 24 (5)| 00:00:01|| 1 | SORT AGGREGATE | | 1| 4| | ||* 2 | INDEX FAST FULL SCAN| T_TEST1_PK | 18264| 73056| 24 (5)| 00:00:01|---------------------------------------------------------------------------------
在某些情况下,如db_file_multiblock_read_count值过小、强制使用索引扫描时,会发生INDEX FULL SCAN。INDEX FULL SCAN和INDEX FAST FULL SCAN不同,它是一种索引扫描,按照B-Tree的查找法从树的根节点开始扫描,遍历整棵树,并且一次读取一个数据块。它会引起db file sequential read事件。
SQL> select /*+index(a t_test1_pk)*/count(1) from t_test1 a; Execution Plan----------------------------------------------------------Plan hash value: 138350774 -----------------------------------------------------------------------| Id | Operation | Name | Rows | Cost (%CPU)| Time |-----------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | 61 (2)| 00:00:01 || 1 | SORT AGGREGATE | | 1 | | || 2 | INDEX FULL SCAN| T_TEST1_PK | 47582 | 61 (2)| 00:00:01 |-----------------------------------------------------------------------