技术开发 频道

主流行式数据库评测之开源MySQL5.5版本

  6.性能调整和优化

  性能调整和优化,涵盖表结构设计、存储方式设计、查询设计、参数调整等方面,添加主键、外键是优化查询性能的一种办法,如果是实际的查询,而不是基准测试,我们就需要充分利用MySQL数据库的功能,针对每个查询单独优化,必要时用等价的语句改写。

  1.执行计划的查看

  要利用改写查询语句提高执行效率,首先要了解现有的执行计划。

  执行计划的查看,使用explain 后接需要查看计划的SQL语句。

  先看一个简单的例子,复制一份customer表。然后查询符合c_name条件的记录数。表的总行数是150万,符合条件的999行,大约占总记录数的1500分之一。创建索引后,执行计划显示使用了Index。比不使用Index的执行时间减少了96%。(1-0.02/0.56)*100%

mysql> create table c2 as select * from customer;
Query OK,
1500000 rows affected (16.37 sec)
Records:
1500000  Duplicates: 0  Warnings: 0

mysql
> select count(*) from c2 where c_name like 'Customer#000000%';
+----------+
| count(*) |
+----------+
|      999 |
+----------+
1 row in set (0.89 sec)

mysql
> explain select count(*) from c2 where c_name like 'Customer#000000%';
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | c2    | ALL  | NULL          | NULL | NULL    | NULL | 1489455 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

mysql
> select count(*) from customer where c_name like 'Customer#000000%';
+----------+
| count(*) |
+----------+
|      999 |
+----------+
1 row in set (0.56 sec)

mysql
> explain select count(*) from customer where c_name like 'Customer#000000%';
+----+-------------+----------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table    | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+----------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | customer | ALL  | NULL          | NULL | NULL    | NULL | 1495259 | Using where |
+----+-------------+----------+------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

mysql
> create index c2_name on c2(c_name);
Query OK,
0 rows affected (6.35 sec)
Records:
0  Duplicates: 0  Warnings: 0

mysql
> select count(*) from c2 where c_name like 'Customer#000000%';
+----------+
| count(*) |
+----------+
|      999 |
+----------+
1 row in set (0.02 sec)

mysql
> explain select count(*) from c2 where c_name like 'Customer#000000%';
+----+-------------+-------+-------+---------------+---------+---------+------+------+-----------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                 |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-----------------------+
|  1 | SIMPLE      | c2    | range | c2_name       | c2_name | 27      | NULL |  998 | Using where; Using index
+----+-------------+-------+-------+---------------+---------+---------+------+------+-----------------------+
1 row in set (0.00 sec)

mysql
> explain select sum(c_custkey) from c2 where c_name like 'Customer#000000%';
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | c2    | range | c2_name       | c2_name | 27      | NULL |  998 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql
> select sum(c_custkey) from c2 where c_name like 'Customer#000000%';
+----------------+
| sum(c_custkey) |
+----------------+
|         499500 |
+----------------+
1 row in set (0.01 sec)

mysql
> select sum(c_custkey) from customer where c_name like 'Customer#000000%';
+----------------+
| sum(c_custkey) |
+----------------+
|         499500 |
+----------------+
1 row in set (0.62 sec)

   一个有趣的现象,如果一个复杂查询的执行计划显示很快,那么基本上这是一个不良的执行计划,执行时间会比较长,反之,如果一个执行计划用了数十秒到1分钟多才显示出来,那么实际查询的时间也差不多就是这个时间了,好像优化器是在一边扫描表一边评估执行计划。比如下面第20个查询,在非压缩的情况下,第一个需要8分钟多,第2个只要差不多1分58秒。当然,这个经验也不绝对,对于一些查询,可能很长时间也求不出执行计划,但查询依然很慢。比如,tpc-h原始的第17个查询,就是既查不到执行计划,也无法在合理的时间完成,只能根据业务含义人工改写了。

0
相关文章