【IT168 技术】连续范围问题也是一个非常经典的SQL编程问题。为了使讲解易于理解,我们先来创建一些测试数据。
CREATE TABLE t ( a INT UNSIGNED NOT NULL PRIMARY KEY );
INSERT INTO t VALUES(1);
INSERT INTO t VALUES(2);
INSERT INTO t VALUES(3);
INSERT INTO t VALUES(100);
INSERT INTO t VALUES(101);
INSERT INTO t VALUES(103);
INSERT INTO t VALUES(104);
INSERT INTO t VALUES(105);
INSERT INTO t VALUES(1);
INSERT INTO t VALUES(2);
INSERT INTO t VALUES(3);
INSERT INTO t VALUES(100);
INSERT INTO t VALUES(101);
INSERT INTO t VALUES(103);
INSERT INTO t VALUES(104);
INSERT INTO t VALUES(105);
可以看到1~3是连续的,100~101是连续的,103~105是连续的,那么怎么能得到如下表所示的结果呢?
我们来看下面这句SQL语句及其返回的结果集。
mysql> SELECT a,@a:=@a+1 rn FROM t,(SELECT @a:=0) AS a;
+-----+------+
| a | rn |
+-----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 100 | 4 |
| 101 | 5 |
| 103 | 6 |
| 104 | 7 |
| 105 | 8 |
+-----+------+
8 rows in set (0.00 sec)
+-----+------+
| a | rn |
+-----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 100 | 4 |
| 101 | 5 |
| 103 | 6 |
| 104 | 7 |
| 105 | 8 |
+-----+------+
8 rows in set (0.00 sec)
rn列是人为计算出来的行号。是不是可以通过连续给出的行号来反映出连续范围的规律呢?如果还没有看出,那么再看下面这个SQL及它的返回结果集。
mysql> SELECT a,rn,a-rn
-> FROM
-> (SELECT a,@a:=@a+1 rn FROM t,(SELECT @a:=0) AS a)
-> AS b;
+-----+------+------+
| a | rn | a-rn |
+-----+------+------+
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 3 | 0 |
| 100 | 4 | 96 |
| 101 | 5 | 96 |
| 103 | 6 | 97 |
| 104 | 7 | 97 |
| 105 | 8 | 97 |
+-----+------+------+
8 rows in set (0.00 sec)
-> FROM
-> (SELECT a,@a:=@a+1 rn FROM t,(SELECT @a:=0) AS a)
-> AS b;
+-----+------+------+
| a | rn | a-rn |
+-----+------+------+
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 3 | 0 |
| 100 | 4 | 96 |
| 101 | 5 | 96 |
| 103 | 6 | 97 |
| 104 | 7 | 97 |
| 105 | 8 | 97 |
+-----+------+------+
8 rows in set (0.00 sec)
是的,在同一组连续值内,连续数值的差是一个常量,因为组内没有间断。当出现一个新组时,其列和行号之间的差值开始增大。所以对于连续范围的统计,我们可以根据差值来进行分组统计,具体过程如下:
mysql> SELECT MIN(a) start_range,MAX(a) end_range
-> FROM
-> (
-> SELECT a,rn,a-rn AS diff
-> FROM
-> (SELECT a,@a:=@a+1 rn FROM t,(SELECT @a:=0) AS a)
-> AS b
->) AS c
-> GROUP BY diff
-> ;
+-------------+-----------+
| start_range | end_range |
+-------------+-----------+
| 1 | 3 |
| 100 | 101 |
| 103 | 105 |
+-------------+-----------+
3 rows in set (0.00 sec)
-> FROM
-> (
-> SELECT a,rn,a-rn AS diff
-> FROM
-> (SELECT a,@a:=@a+1 rn FROM t,(SELECT @a:=0) AS a)
-> AS b
->) AS c
-> GROUP BY diff
-> ;
+-------------+-----------+
| start_range | end_range |
+-------------+-----------+
| 1 | 3 |
| 100 | 101 |
| 103 | 105 |
+-------------+-----------+
3 rows in set (0.00 sec)
在这里留给读者一个思考题,给出不连续的范围,也就是间断的范围,如何得到如下表所示的间断范围结果呢?
作者简介
姜承尧(DavidJiang),《MySQL技术内幕:SQL编程》、《MySQL技术内幕:InnoDB存储引擎》作者,资深MySQL数据库专家,MySQL开源分支版本InnoSQL的创始人,独立数据库咨询顾问。不仅擅长于数据库的管理和维护,还擅长于数据库的开发,同时一直致力于MySQL数据库底层实现原理的研究和探索,对高性能数据库和数据仓库也有深刻而独到的理解。目前就职于网易研究院,担任后台技术中心技术经理一职,从事MySQL数据库底层以及云的相关的开发工作。