6.性能调整和优化
性能调整和优化,涵盖表结构设计、存储方式设计、查询设计、参数调整等方面,前文介绍的压缩和并行都是简单的参数调整手段,如果是实际的查询,而不是基准测试,我们就需要充分利用PostgreSQL数据库的功能,针对每个查询单独优化。
1.执行计划的查看
要利用改写查询语句提高执行效率,首先要了解现有的执行计划。
执行计划的查看,使用explain命令后跟需要查看计划的查询语句。以第17个查询为例,原始写法和修改写法分别采用相关子查询和表连接方式。
pgdb=# explain -- 17
pgdb-# select
pgdb-# sum(l_extendedprice) / 7.0 as avg_yearly
pgdb-# from
pgdb-# lineitem,
pgdb-# part
pgdb-# where
pgdb-# p_partkey = l_partkey
pgdb-# and p_brand = 'Brand#23'
pgdb-# and p_container = 'MED CAN'
pgdb-# and l_quantity < (
pgdb(# select
pgdb(# 0.2 * avg(l_quantity)
pgdb(# from
pgdb(# lineitem
pgdb(# where
pgdb(# l_partkey = p_partkey
pgdb(# ) limit 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Limit (cost=9090485.90..9090485.91 rows=1 width=10)
-> Aggregate (cost=9090485.90..9090485.91 rows=1 width=10)
-> Nested Loop (cost=0.00..9090431.30 rows=21839 width=10)
Join Filter: (public.lineitem.l_quantity < (SubPlan 1))
-> Seq Scan on part (cost=0.00..71608.29 rows=2081 width=4)
Filter: ((p_brand = 'Brand#23'::bpchar) AND (p_container = 'MED CAN'::bpchar))
-> Index Scan using lineitem_fk2 on lineitem (cost=0.00..134.05 rows=31 width=21)
Index Cond: (public.lineitem.l_partkey = part.p_partkey)
SubPlan 1
-> Aggregate (cost=135.45..135.46 rows=1 width=7)
-> Index Scan using lineitem_fk2 on lineitem (cost=0.00..135.37 rows=31 width=7)
Index Cond: (l_partkey = $0)
(12 行)
时间: 3.414 ms
pgdb=# explain -- 17c
pgdb-# select
pgdb-# sum(l_extendedprice) / 7.0 as avg_yearly
pgdb-# from (select l_extendedprice, l_partkey ,l_quantity
pgdb(# from
pgdb(# lineitem,
pgdb(# part
pgdb(# where
pgdb(# p_partkey = l_partkey
pgdb(# and p_brand = 'Brand#23'
pgdb(# and p_container = 'MED CAN'
pgdb(# )b
pgdb-# ,(select l_partkey,0.2 * avg(l_quantity) p2avg_l_quantity
pgdb(# from
pgdb(# (select l_partkey,l_quantity from
pgdb(# lineitem
pgdb(# where
pgdb(# l_partkey in (select p_partkey from part where p_brand = 'Brand#23'and p_container = 'MED CAN')
pgdb(# )a
pgdb(# group by l_partkey
pgdb(# )c
pgdb-# where b.l_partkey=c.l_partkey and b.l_quantity <p2avg_l_quantity
pgdb-# ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=746887.13..746887.14 rows=1 width=10)
-> Hash Join (cost=710378.97..745168.26 rows=687545 width=10)
Hash Cond: (public.lineitem.l_partkey = public.part.p_partkey)
Join Filter: (public.lineitem.l_quantity < ((0.2 * avg(public.lineitem.l_quantity))))
-> GroupAggregate (cost=357749.59..359223.70 rows=65516 width=11)
-> Sort (cost=357749.59..357913.38 rows=65516 width=11)
Sort Key: public.lineitem.l_partkey
-> Nested Loop (cost=71613.49..351388.45 rows=65516 width=11)
-> HashAggregate (cost=71613.49..71634.30 rows=2081 width=4)
-> Seq Scan on part (cost=0.00..71608.29 rows=2081 width=4)
Filter: ((p_brand = 'Brand#23'::bpchar) AND (p_container = 'MED CAN'::bpchar))
-> Index Scan using lineitem_fk2 on lineitem (cost=0.00..134.05 rows=31 width=11)
Index Cond: (public.lineitem.l_partkey = public.part.p_partkey)
-> Hash (cost=351362.44..351362.44 rows=65516 width=25)
-> Nested Loop (cost=0.00..351362.44 rows=65516 width=25)
-> Seq Scan on part (cost=0.00..71608.29 rows=2081 width=4)
Filter: ((p_brand = 'Brand#23'::bpchar) AND (p_container = 'MED CAN'::bpchar))
-> Index Scan using lineitem_fk2 on lineitem (cost=0.00..134.05 rows=31 width=21)
Index Cond: (public.lineitem.l_partkey = public.part.p_partkey)
(19 行)
时间: 2.887 msEXPLAIN命令较快地给出了执行计划。从估计的执行成本来看,第一个语句的成本比第二个语句要大,实际执行的结果,第2个语句大约2.6秒,比第1个语句的4秒提高了不少。