6.性能调整和优化
Db2性能调整和优化,涵盖表结构设计、存储方式设计、查询设计、参数调整等方面,前文介绍的压缩和并行都是简单的参数调整手段,如果是实际的查询,而不是基准测试,我们就需要充分利用db2的功能,针对每个查询单独优化。
(1).查询的改写
和Oracle相比,DB2对SQL语句书写的要求比较高一些,这给应用开发人员带来了挑战。比如第20个查询,下面2种逻辑上完全等价的写法,执行效果却是天壤之别,运行时间差不多差了20倍。
--原始的第20个查询语句
select
s_name,
s_address
from
tpch.supplier,
tpch.nation
where
s_suppkey in (
select
ps_suppkey
from
tpch.partsupp
where
ps_partkey in (
select
p_partkey
from
tpch.part
where
p_name like 'bisque%'
)
and ps_availqty > (
select
0.5 * sum(l_quantity)
from
tpch.lineitem
where
l_partkey = ps_partkey
and l_suppkey = ps_suppkey
and l_shipdate >= date '1994-01-01'
and l_shipdate < date '1994-01-01' + 1 year
)
)
and s_nationkey = n_nationkey
and n_name = 'CANADA'
order by
s_name
fetch first 10 rows only;
S_NAME S_ADDRESS
------------------------- ----------------------------------------
Supplier#000000020 iybAE,RmTymrZVYaFZva2SH,j
Supplier#000000091 YV45D7TkfdQanOOZ7q9QxkyGUapU1oOWU6q3
Supplier#000000205 rF uV8d0JNEk
Supplier#000000285 Br7e1nnt1yxrw6ImgpJ7YdhFDjuBf
Supplier#000000287 7a9SP7qW5Yku5PvSg
Supplier#000000354 w8fOo5W,aS
Supplier#000000361 f8IUYRmdVXhQC9qJQjWknCXmzhe38vCbk6
Supplier#000000475 xw4V6,4QQW LI5Qg EOKy4JD B4Cq1tjzaOma9Y
Supplier#000000530 0qwCMwobKY OcmLyfRXlagA8ukENJv,
Supplier#000000532 ep92hT7VLaVlDKM7lgbj02kIL
* 10 row(s) fetched, 10 row(s) output.
* Elapsed Time is: 587.439039 seconds
--修改后的第20个查询语句
-sh-3.2$ date; time db2 -tvf /user1/ibm/q20newkid.sql ;date
2011年 05月 12日 星期四 19:19:05 CST
select s_name, s_address from supplier, nation where s_suppkey in (select ps_suppkey from partsupp ,(select l_partkey,l_suppkey, sum(l_quantity) l_quantity_SUM from lineitem,part where l_partkey = p_partkey and p_name like 'bisque%' and l_shipdate >= date '1994-01-01' and l_shipdate < date '1994-01-01' + 1 year GROUP BY l_partkey,l_suppkey ) where l_partkey = ps_partkey and l_suppkey = ps_suppkey and ps_availqty > 0.5*l_quantity_SUM ) and s_nationkey = n_nationkey and n_name = 'CANADA' order by s_name fetch first 10 rows only
S_NAME S_ADDRESS
------------------------- ----------------------------------------
结果与上一个语句相同,略
10 record(s) selected.
real 0m25.700s
user 0m0.013s
sys 0m0.011s
2011年 05月 12日 星期四 19:19:31 CST
select
s_name,
s_address
from
tpch.supplier,
tpch.nation
where
s_suppkey in (
select
ps_suppkey
from
tpch.partsupp
where
ps_partkey in (
select
p_partkey
from
tpch.part
where
p_name like 'bisque%'
)
and ps_availqty > (
select
0.5 * sum(l_quantity)
from
tpch.lineitem
where
l_partkey = ps_partkey
and l_suppkey = ps_suppkey
and l_shipdate >= date '1994-01-01'
and l_shipdate < date '1994-01-01' + 1 year
)
)
and s_nationkey = n_nationkey
and n_name = 'CANADA'
order by
s_name
fetch first 10 rows only;
S_NAME S_ADDRESS
------------------------- ----------------------------------------
Supplier#000000020 iybAE,RmTymrZVYaFZva2SH,j
Supplier#000000091 YV45D7TkfdQanOOZ7q9QxkyGUapU1oOWU6q3
Supplier#000000205 rF uV8d0JNEk
Supplier#000000285 Br7e1nnt1yxrw6ImgpJ7YdhFDjuBf
Supplier#000000287 7a9SP7qW5Yku5PvSg
Supplier#000000354 w8fOo5W,aS
Supplier#000000361 f8IUYRmdVXhQC9qJQjWknCXmzhe38vCbk6
Supplier#000000475 xw4V6,4QQW LI5Qg EOKy4JD B4Cq1tjzaOma9Y
Supplier#000000530 0qwCMwobKY OcmLyfRXlagA8ukENJv,
Supplier#000000532 ep92hT7VLaVlDKM7lgbj02kIL
* 10 row(s) fetched, 10 row(s) output.
* Elapsed Time is: 587.439039 seconds
--修改后的第20个查询语句
-sh-3.2$ date; time db2 -tvf /user1/ibm/q20newkid.sql ;date
2011年 05月 12日 星期四 19:19:05 CST
select s_name, s_address from supplier, nation where s_suppkey in (select ps_suppkey from partsupp ,(select l_partkey,l_suppkey, sum(l_quantity) l_quantity_SUM from lineitem,part where l_partkey = p_partkey and p_name like 'bisque%' and l_shipdate >= date '1994-01-01' and l_shipdate < date '1994-01-01' + 1 year GROUP BY l_partkey,l_suppkey ) where l_partkey = ps_partkey and l_suppkey = ps_suppkey and ps_availqty > 0.5*l_quantity_SUM ) and s_nationkey = n_nationkey and n_name = 'CANADA' order by s_name fetch first 10 rows only
S_NAME S_ADDRESS
------------------------- ----------------------------------------
结果与上一个语句相同,略
10 record(s) selected.
real 0m25.700s
user 0m0.013s
sys 0m0.011s
2011年 05月 12日 星期四 19:19:31 CST
(2). 统计信息收集和管理
正确的统计信息对db2得出较好的执行计划有十分重要的影响,在大量插入或更新数据以后,包括对表进行reorg后,需要重新收集统计信息。
-sh-3.2$ time db2 "runstats on table tpch.partsupp"
DB20000I The RUNSTATS command completed successfully.
--如果包含索引,也要一并重新统计
db2 => RUNSTATS ON TABLE tpch.lineitem WITH DISTRIBUTION AND INDEXES ALL;
DB20000I The RUNSTATS command completed successfully.
--如果重新加载表数据后数据库报SQL0668N错误,意味着需要一致性检查
[db2inst1@aix:/home/db2inst1]#>db2 "select count(*) from tpch.lineitem"
SQL0668N Operation not allowed for reason code "1" on table "TPCH.LINEITEM".
SQLSTATE=57016
[db2inst1@aix:/home/db2inst1]#> db2 set integrity for tpch.lineitem immediate checked
DB20000I The SQL command completed successfully.
[db2inst1@aix:/home/db2inst1]#>db2 "select count(*) from tpch.lineitem"
1
-----------
59986052
1 record(s) selected.
DB20000I The RUNSTATS command completed successfully.
--如果包含索引,也要一并重新统计
db2 => RUNSTATS ON TABLE tpch.lineitem WITH DISTRIBUTION AND INDEXES ALL;
DB20000I The RUNSTATS command completed successfully.
--如果重新加载表数据后数据库报SQL0668N错误,意味着需要一致性检查
[db2inst1@aix:/home/db2inst1]#>db2 "select count(*) from tpch.lineitem"
SQL0668N Operation not allowed for reason code "1" on table "TPCH.LINEITEM".
SQLSTATE=57016
[db2inst1@aix:/home/db2inst1]#> db2 set integrity for tpch.lineitem immediate checked
DB20000I The SQL command completed successfully.
[db2inst1@aix:/home/db2inst1]#>db2 "select count(*) from tpch.lineitem"
1
-----------
59986052
1 record(s) selected.
db2也提供了人工添加提示来影响执行计划,这需要相当丰富的db2开发经验。
还有其他的优化手段,比如添加必要的索引,由于时间所限,只测试了给part表添加一个size索引。单表查询的性能提高很大,但在复杂查询中的效果不太明显。有兴趣的读者可以自行检验。