四、性能调整和优化
Oracle性能调整和优化是个复杂的命题,涵盖表结构设计、查询设计、参数调整等方面,前文介绍的压缩和并行都是简单的参数调整优化手段,如果是实际的查询,而不是基准测试,我们就需要充分利用Oracle的功能,针对每个查询单独优化。
1、查询的改写
由于Oracle的查询优化器相对比较智能,对SQL语句书写的要求比其他数据库要相对低一些,这给应用开发人员带来了方便。比如第15个查询,下面2种完全迥异的写法,执行效果却是差不多的。
--根据原始的第15个查询语句,将视图改为子查询
SQL> select * from(
2 select
3 s_suppkey,
4 s_name,
5 s_address,
6 s_phone,
7 total_revenue
8 from
9 supplier,
10 (
11 select
12 l_suppkey supplier_no,
13 sum(l_extendedprice * (1 - l_discount))total_revenue
14 from
15 lineitem
16 where
17 l_shipdate >= date '1995-02-01'
18 and l_shipdate < date '1995-02-01' + interval '3' month
19 group by
20 l_suppkey
21 )
22 revenue0
23 where
24 s_suppkey = supplier_no
25 and total_revenue = (
26 select
27 max(total_revenue)
28 from
29 (
30 select
31 l_suppkey supplier_no,
32 sum(l_extendedprice * (1 - l_discount))total_revenue
33 from
34 lineitem
35 where
36 l_shipdate >= date '1995-02-01'
37 and l_shipdate < date '1995-02-01' + interval '3' month
38 group by
39 l_suppkey
40 )
41 revenue0
42 )
43 order by
44 s_suppkey)
45 where rownum <= 10;
S_SUPPKEY S_NAME S_ADDRESS S_PHONE TOTAL_REVENUE
---------- ------------------- ------------------- --------------- -------------
83966 Supplier#000083966 0ITp9HCIUHEHgWCjeTt 24-897-113-5492 2147201.69
已用时间: 00: 00: 20.46
--修改后的第15个查询语句,分析函数写法
SQL> select s_suppkey, s_name, s_address, s_phone, total_revenue
2 from supplier
3 ,(select l_suppkey as supplier_no,
4 sum(l_extendedprice * (1 - l_discount)) as total_revenue
5 ,RANK() OVER(ORDER BY sum(l_extendedprice * (1 - l_discount)) DESC) AS rnk
6 from lineitem
7 where l_shipdate >= date '1995-02-01'
8 and l_shipdate < date '1995-02-01' + interval '3' month
9 group by
10 l_suppkey
11 ) revenue1
12 where s_suppkey = supplier_no
13 AND rnk=1
14 order by s_suppkey;
S_SUPPKEY S_NAME S_ADDRESS S_PHONE TOTAL_REVENUE
---------- ------------------- ------------------- --------------- -------------
83966 Supplier#000083966 0ITp9HCIUHEHgWCjeTt 24-897-113-5492 2147201.69
已用时间: 00: 00: 20.16
SQL> select * from(
2 select
3 s_suppkey,
4 s_name,
5 s_address,
6 s_phone,
7 total_revenue
8 from
9 supplier,
10 (
11 select
12 l_suppkey supplier_no,
13 sum(l_extendedprice * (1 - l_discount))total_revenue
14 from
15 lineitem
16 where
17 l_shipdate >= date '1995-02-01'
18 and l_shipdate < date '1995-02-01' + interval '3' month
19 group by
20 l_suppkey
21 )
22 revenue0
23 where
24 s_suppkey = supplier_no
25 and total_revenue = (
26 select
27 max(total_revenue)
28 from
29 (
30 select
31 l_suppkey supplier_no,
32 sum(l_extendedprice * (1 - l_discount))total_revenue
33 from
34 lineitem
35 where
36 l_shipdate >= date '1995-02-01'
37 and l_shipdate < date '1995-02-01' + interval '3' month
38 group by
39 l_suppkey
40 )
41 revenue0
42 )
43 order by
44 s_suppkey)
45 where rownum <= 10;
S_SUPPKEY S_NAME S_ADDRESS S_PHONE TOTAL_REVENUE
---------- ------------------- ------------------- --------------- -------------
83966 Supplier#000083966 0ITp9HCIUHEHgWCjeTt 24-897-113-5492 2147201.69
已用时间: 00: 00: 20.46
--修改后的第15个查询语句,分析函数写法
SQL> select s_suppkey, s_name, s_address, s_phone, total_revenue
2 from supplier
3 ,(select l_suppkey as supplier_no,
4 sum(l_extendedprice * (1 - l_discount)) as total_revenue
5 ,RANK() OVER(ORDER BY sum(l_extendedprice * (1 - l_discount)) DESC) AS rnk
6 from lineitem
7 where l_shipdate >= date '1995-02-01'
8 and l_shipdate < date '1995-02-01' + interval '3' month
9 group by
10 l_suppkey
11 ) revenue1
12 where s_suppkey = supplier_no
13 AND rnk=1
14 order by s_suppkey;
S_SUPPKEY S_NAME S_ADDRESS S_PHONE TOTAL_REVENUE
---------- ------------------- ------------------- --------------- -------------
83966 Supplier#000083966 0ITp9HCIUHEHgWCjeTt 24-897-113-5492 2147201.69
已用时间: 00: 00: 20.16
2、统计信息收集和管理
正确的统计信息对Oracle得出较好的执行计划有十分重要的影响,在大量插入或更新数据以后,甚至对表进行move后,需要重新收集统计信息。比如:对某个用户下所有的对象收集统计信息,degree表示并行收集的并行度。
SQL> set timi on
SQL> exec dbms_stats.gather_schema_stats('TPCH');
PL/SQL procedure successfully completed.
Elapsed: 00:19:29.98
SQL> exec dbms_stats.gather_schema_stats(ownname => 'TPCH', degree => 32)
PL/SQL procedure successfully completed.
Elapsed: 00:14:55.46
SQL> exec dbms_stats.gather_schema_stats('TPCH');
PL/SQL procedure successfully completed.
Elapsed: 00:19:29.98
SQL> exec dbms_stats.gather_schema_stats(ownname => 'TPCH', degree => 32)
PL/SQL procedure successfully completed.
Elapsed: 00:14:55.46
Oracle也提供了自动统计信息收集任务,一般在晚间执行,该过程首先检测统计信息缺失和陈旧的对象。然后确定优先级,再开始进行统计信息。
这个功能还是很有用的,比如第18个查询,单进程查询,没有收集统计信息前需要2个多小时,自动统计信息收集后只要不到2分钟就完成了。如果不利用统计信息,那么必须要求开发人员非常熟悉Oracle的各种连接和排序方法,人工添加提示来影响执行计划,这没有相当丰富的开发经验是做不到的。
还有其他的优化手段,比如添加必要的索引,由于时间所限,兼之前文所述的“TPC-H检查可用数据的大多数”原因,没有进行测试。但单从添加dss.ri中的主外键约束来看,测试结果和不带主外键约束,差别不大,具体数据就不易已列出了,有兴趣的读者可以自行检验。