技术开发 频道

点评Oracle11g新特性:行列转换语句


【IT168 Oracle在11g以前,行列转化是一个比较麻烦的事情。对于列转行,需要使用很多的聚集函数加上DECODE语句。11g推出的PIVOT语法可以轻松的解决列转行的问题。

    先根据USER_OBJECTS建立一张测试表:

SQL> CREATE TABLE T AS SELECT * FROM DBA_SEGMENTS;

表已创建。

    为了将列转换为行,10g及以前版本需要:
SQL> SELECT 2 OWNER, 3 SUM(DECODE(SEGMENT_TYPE, 'TABLE', BYTES)) TABLE_SIZE, 4 SUM(DECODE(SEGMENT_TYPE, 'TABLE PARTITION', BYTES)) TABLE_PART_SIZE, 5 SUM(DECODE(SEGMENT_TYPE, 'INDEX', BYTES)) INDEX_SIZE, 6 SUM(DECODE(SEGMENT_TYPE, 'INDEX PARTITION', BYTES)) INDEX_PART_SIZE 7 FROM T 8 GROUP BY OWNER; OWNER TABLE_SIZE TABLE_PART_SIZE INDEX_SIZE INDEX_PART_SIZE ------------------------------ ---------- --------------- ---------- --------------- WKSYS 2621440 5177344 MDSYS 11993088 131072 6881280 393216 YANGTK 1114112 65536 TSMSYS 65536 65536 WK_TEST 5767168 6553600 OUTLN 196608 262144 CTXSYS 2031616 3407872 OLAPSYS 7471104 8847360 FLOWS_FILES 65536 262144 SYSTEM 7077888 2490368 9568256 3276800 EXFSYS 1310720 2490368 DBSNMP 1048576 524288 ORDSYS 3670016 5832704 SYSMAN 53608448 69402624 XDB 17170432 7471104 FLOWS_030000 41680896 54329344 SYS 547356672 8716288 165216256 7929856 WMSYS 2424832 3866624 已选择18行。
    在11g中可以使用新增的语法PIVOT:

SQL> SELECT * 2 FROM (SELECT OWNER, SEGMENT_TYPE, BYTES FROM T) 3 PIVOT (SUM(BYTES) FOR SEGMENT_TYPE IN 4 ('TABLE' TABLE_SIZE, 'TABLE PARTITION' TABLE_PART_SIZE, 5 'INDEX' INDEX_SIZE, 'INDEX PARTITION' INDEX_PART_SIZE)); OWNER TABLE_SIZE TABLE_PART_SIZE INDEX_SIZE INDEX_PART_SIZE ------------------------------ ---------- --------------- ---------- --------------- WKSYS 2621440 5177344 MDSYS 11993088 131072 6881280 393216 YANGTK 1114112 65536 TSMSYS 65536 65536 WK_TEST 5767168 6553600 OUTLN 196608 262144 CTXSYS 2031616 3407872 OLAPSYS 7471104 8847360 FLOWS_FILES 65536 262144 SYSTEM 7077888 2490368 9568256 3276800 EXFSYS 1310720 2490368 DBSNMP 1048576 524288 ORDSYS 3670016 5832704 SYSMAN 53608448 69402624 XDB 17170432 7471104 FLOWS_030000 41680896 54329344 SYS 547356672 8716288 165216256 7929856 WMSYS 2424832 3866624 已选择18行。
    简单介绍一下语法。PIVOT操作后面是聚集函数,这里需要计算汇总的空间,因此选择了SUM。然后是设置根据SEGMENT_TYPE的不同的值,显示为不同的列。

    下面看一下二者的执行计划:

SQL> SET AUTOT ON SQL> SELECT 2 OWNER, 3 SUM(DECODE(SEGMENT_TYPE, 'TABLE', BYTES)) TABLE_SIZE, 4 SUM(DECODE(SEGMENT_TYPE, 'TABLE PARTITION', BYTES)) TABLE_PART_SIZE, 5 SUM(DECODE(SEGMENT_TYPE, 'INDEX', BYTES)) INDEX_SIZE, 6 SUM(DECODE(SEGMENT_TYPE, 'INDEX PARTITION', BYTES)) INDEX_PART_SIZE 7 FROM T 8 GROUP BY OWNER; OWNER TABLE_SIZE TABLE_PART_SIZE INDEX_SIZE INDEX_PART_SIZE ------------------------------ ---------- --------------- ---------- --------------- WKSYS 2621440 5177344 MDSYS 11993088 131072 6881280 393216 YANGTK 1114112 65536 TSMSYS 65536 65536 WK_TEST 5767168 6553600 OUTLN 196608 262144 CTXSYS 2031616 3407872 OLAPSYS 7471104 8847360 FLOWS_FILES 65536 262144 SYSTEM 7077888 2490368 9568256 3276800 EXFSYS 1310720 2490368 DBSNMP 1048576 524288 ORDSYS 3670016 5832704 SYSMAN 53608448 69402624 XDB 17170432 7471104 FLOWS_030000 41680896 54329344 SYS 547356672 8716288 165216256 7929856 WMSYS 2424832 3866624 已选择18行。 执行计划 ---------------------------------------------------------- Plan hash value: 47235625 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7786 | 311K| 36 (6)| 00:00:01 | | 1 | HASH GROUP BY | | 7786 | 311K| 36 (6)| 00:00:01 | | 2 | TABLE ACCESS FULL| T | 7786 | 311K| 34 (0)| 00:00:01 | --------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 113 consistent gets 0 physical reads 0 redo size 1129 bytes sent via SQL*Net to client 388 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 18 rows processed SQL> SELECT * 2 FROM (SELECT OWNER, SEGMENT_TYPE, BYTES FROM T) 3 PIVOT (SUM(BYTES) FOR SEGMENT_TYPE IN 4 ('TABLE' TABLE_SIZE, 'TABLE PARTITION' TABLE_PART_SIZE, 5 'INDEX' INDEX_SIZE, 'INDEX PARTITION' INDEX_PART_SIZE)); OWNER TABLE_SIZE TABLE_PART_SIZE INDEX_SIZE INDEX_PART_SIZE ------------------------------ ---------- --------------- ---------- --------------- WKSYS 2621440 5177344 MDSYS 11993088 131072 6881280 393216 YANGTK 1114112 65536 TSMSYS 65536 65536 WK_TEST 5767168 6553600 OUTLN 196608 262144 CTXSYS 2031616 3407872 OLAPSYS 7471104 8847360 FLOWS_FILES 65536 262144 SYSTEM 7077888 2490368 9568256 3276800 EXFSYS 1310720 2490368 DBSNMP 1048576 524288 ORDSYS 3670016 5832704 SYSMAN 53608448 69402624 XDB 17170432 7471104 FLOWS_030000 41680896 54329344 SYS 547356672 8716288 165216256 7929856 WMSYS 2424832 3866624 已选择18行。 执行计划 ---------------------------------------------------------- Plan hash value: 3924414983 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7786 | 311K| 36 (6)| 00:00:01 | | 1 | HASH GROUP BY PIVOT| | 7786 | 311K| 36 (6)| 00:00:01 | | 2 | TABLE ACCESS FULL | T | 7786 | 311K| 34 (0)| 00:00:01 | ---------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 113 consistent gets 0 physical reads 0 redo size 1129 bytes sent via SQL*Net to client 388 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 18 rows processed

    注意,虽然PIVOT没有GROUP BY语句,但是执行过程中隐含了一个GROUP BY操作,执行计划为HASH GROUP BY PIVOT。虽然执行计划多了一个PIVOT,但是从cost和统计信息看,两个SQL的几乎没有差别。

    也许有人认为,这个PIVOT并没有带来多大的简化,而且语法也没有原来的直观。不要着急,刚才只是最简单的一种情况,如果情况复杂一些,需要汇总两个字段,且判断行转列的条件也为两个,重写上面两个SQL。

SQL> SELECT TABLESPACE_NAME, 2 SUM(CASE WHEN SEGMENT_TYPE = 'TABLE' AND OWNER = 'SYS' THEN BYTES END) S_T_BYTES, 3 SUM(CASE WHEN SEGMENT_TYPE = 'TABLE' AND OWNER = 'SYS' THEN BLOCKS END) S_T_BLOCKS, 4 SUM(CASE WHEN SEGMENT_TYPE = 'TABLE' AND OWNER = 'YANGTK' THEN BYTES END) S_Y_BYTES, 5 SUM(CASE WHEN SEGMENT_TYPE = 'TABLE' AND OWNER = 'YANGTK' THEN BLOCKS END) S_Y_BLOCKS, 6 SUM(CASE WHEN SEGMENT_TYPE = 'INDEX' AND OWNER = 'SYS' THEN BYTES END) I_T_BYTES, 7 SUM(CASE WHEN SEGMENT_TYPE = 'INDEX' AND OWNER = 'SYS' THEN BLOCKS END) I_T_BLOCKS, 8 SUM(CASE WHEN SEGMENT_TYPE = 'INDEX' AND OWNER = 'YANGTK' THEN BYTES END) I_Y_BYTES, 9 SUM(CASE WHEN SEGMENT_TYPE = 'INDEX' AND OWNER = 'YANGTK' THEN BLOCKS END) I_Y_BLOCKS 10 FROM T 11 GROUP BY TABLESPACE_NAME; TABLESPACE_NAME S_T_BYTES S_T_BLOCKS S_Y_BYTES S_Y_BLOCKS I_T_BYTES I_T_BLOCKS I_Y_BYTES I_Y_BLOCKS ---------------- --------- ---------- --------- ---------- --------- ---------- --------- ---------- SYSAUX 33095680 4040 37683200 4600 YANGTK 1114112 136 65536 8 UNDOTBS1 SYSTEM 514260992 62776 127533056 15568 SQL> SELECT * 2 FROM (SELECT TABLESPACE_NAME, SEGMENT_TYPE, OWNER, BYTES, BLOCKS FROM T) 3 PIVOT (SUM(BYTES) BYTES, SUM(BLOCKS) BLOCKS 4 FOR (SEGMENT_TYPE, OWNER) IN 5 ( 6 ('TABLE', 'SYS') S_T, 7 ('TABLE', 'YANGTK') Y_T, 8 ('INDEX', 'SYS') S_I, 9 ('INDEX', 'YANGTK') Y_I 10 )); TABLESPACE_NAME S_T_BYTES S_T_BLOCKS Y_T_BYTES Y_T_BLOCKS S_I_BYTES S_I_BLOCKS Y_I_BYTES Y_I_BLOCKS ---------------- --------- ---------- --------- ---------- --------- ---------- --------- ---------- SYSAUX 33095680 4040 37683200 4600 YANGTK 1114112 136 65536 8 UNDOTBS1 SYSTEM 514260992 62776 127533056 15568

    这一次那个SQL更加简单就一目了然了。而且,如果使用PIVOT XML功能,这个操作的功能更强大,在查询之前甚至可以不用了解需要转换的行数以及对应名称。利用子查询功能或者ANY关键字就可以实现。

SQL> SELECT * 2 FROM (SELECT TABLESPACE_NAME, BYTES FROM T) 3 PIVOT XML (SUM(BYTES) FOR TABLESPACE_NAME IN (SELECT DISTINCT TABLESPACE_NAME FROM T)); TABLESPACE_NAME_XML ---------------------------------------------------------------------------------------------------- <PivotSet><item><column name = "TABLESPACE_NAME">SYSAUX</column><column name = "SUM(BYTES)">61551411 em><column name = "TABLESPACE_NAME">SYSTEM</column><column name = "SUM(BYTES)">727121920</column></i me = "TABLESPACE_NAME">UNDOTBS1</column><column name = "SUM(BYTES)">20250624</column></item><item><c PACE_NAME">YANGTK</column><column name = "SUM(BYTES)">1179648</column></item></PivotSet>

    上面就是利用子查询在完全不清楚TABLESPACE_NAME包含数据的情况下得到的结果。由于是XML格式,通过SQLPLUS看不是很清晰,下面使用ANY关键字将结果输出到XML文件中:

SQL> host type e:call.sql set feedback off set pages 0 set lines 30000 set trims on set trim on set ver off set heading off set time off set timing off set autot off set echo off set show off set term off spo &1 @@execute.sql spo off set feedback 6 set pages 100 set lines 100 set heading on SQL> host type e:execute.sql SELECT * FROM (SELECT TABLESPACE_NAME, BYTES FROM T) PIVOT XML (SUM(BYTES) FOR TABLESPACE_NAME IN (ANY)); SQL> @E:CALL.SQL E:TABLESPACE.XML

    最后直接用IE打开:

- <PivotSet> - <item> <column name="TABLESPACE_NAME">SYSAUX</column> <column name="SUM(BYTES)">615514112</column> </item> - <item> <column name="TABLESPACE_NAME">SYSTEM</column> <column name="SUM(BYTES)">727121920</column> </item> - <item> <column name="TABLESPACE_NAME">UNDOTBS1</column> <column name="SUM(BYTES)">20250624</column> </item> - <item> <column name="TABLESPACE_NAME">YANGTK</column> <column name="SUM(BYTES)">1179648</column> </item> </PivotSet>

    最后简单解释一下,前面所有的PIVOT查询都是针对子查询的,这是为了聚集函数对制定的列进行GROUP BY,直接对表进行PIVOT也是可以的,只不过得到的结果将没有聚集效果:

SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE FROM T WHERE OWNER = 'YANGTK'; OWNER SEGMENT_NAME SEGMENT_TYPE ------------------------------ ------------------------------ ------------------ YANGTK SYS_C009495 INDEX YANGTK T_STANDBY TABLE YANGTK T1 TABLE YANGTK T_PIVOT TABLE SQL> SELECT OWNER, SEGMENT_NAME, TABLE_SIZE, TABLE_PART_SIZE FROM T 2 PIVOT (SUM(BYTES) FOR SEGMENT_TYPE IN 3 ('TABLE' TABLE_SIZE, 'TABLE PARTITION' TABLE_PART_SIZE)) 4 WHERE OWNER = 'YANGTK'; OWNER SEGMENT_NAME TABLE_SIZE TABLE_PART_SIZE ------------------------------ ------------------------------ ---------- --------------- YANGTK SYS_C009495 YANGTK T_STANDBY 65536 YANGTK T_PIVOT 65536 YANGTK T1 983040

    在11g以前,行列转化是一个比较麻烦的事情。对于列转行来说,以前只能使用UNION ALL语句,显得十分的麻烦,11g提供了UNPIVOT语句,可以很方便的解决这个问题。

    先做一个测试表,利用上一篇介绍的PIVOT语句:
SQL> CREATE TABLE T_PIVOT AS SELECT * 2 FROM (SELECT OWNER, SEGMENT_TYPE, BYTES FROM DBA_SEGMENTS) 3 PIVOT (SUM(BYTES) FOR SEGMENT_TYPE IN 4 ('TABLE' TABLE_SIZE, 'TABLE PARTITION' TABLE_PART_SIZE, 5 'INDEX' INDEX_SIZE, 'INDEX PARTITION' INDEX_PART_SIZE)); 表已创建。 SQL> SELECT * FROM T_PIVOT; OWNER TABLE_SIZE TABLE_PART_SIZE INDEX_SIZE INDEX_PART_SIZE ------------------------------ ---------- --------------- ---------- --------------- WKSYS 2621440 5177344 MDSYS 11993088 131072 6881280 393216 YANGTK 2031616 65536 TSMSYS 65536 65536 WK_TEST 5767168 6553600 OUTLN 196608 262144 CTXSYS 2031616 3407872 OLAPSYS 7471104 8847360 FLOWS_FILES 65536 262144 SYSTEM 7077888 2490368 9568256 3276800 EXFSYS 1310720 2490368 DBSNMP 1048576 524288 ORDSYS 3670016 5832704 SYSMAN 53608448 69402624 XDB 17170432 7471104 FLOWS_030000 41680896 54329344 SYS 547356672 9043968 165216256 8388608 WMSYS 2424832 3866624 已选择18行。

    在10g及以前版本要实现列转行: 

SQL> SELECT OWNER, 'TABLE' OBJECT_TYPE, TABLE_SIZE BYTES FROM T_PIVOT WHERE OWNER IN ('SYS', 'YANGTK') 2 UNION ALL 3 SELECT OWNER, 'TABLE PARTITION', TABLE_PART_SIZE FROM T_PIVOT WHERE OWNER IN ('SYS', 'YANGTK') 4 UNION ALL 5 SELECT OWNER, 'INDEX', INDEX_SIZE FROM T_PIVOT WHERE OWNER IN ('SYS', 'YANGTK') 6 UNION ALL 7 SELECT OWNER, 'INDEX PARTITION', INDEX_PART_SIZE FROM T_PIVOT WHERE OWNER IN ('SYS', 'YANGTK'); OWNER OBJECT_TYPE BYTES ------------------------------ --------------- ---------- YANGTK TABLE 2031616 SYS TABLE 547356672 YANGTK TABLE PARTITION SYS TABLE PARTITION 9043968 YANGTK INDEX 65536 SYS INDEX 165216256 YANGTK INDEX PARTITION SYS INDEX PARTITION 8388608 已选择8行。

    这种方法相对来说比较麻烦,用UNPIVOT则会简化很多:

SQL> SELECT OWNER, OBJECT_TYPE, BYTES FROM T_PIVOT 2 UNPIVOT INCLUDE NULLS (BYTES FOR OBJECT_TYPE IN 3 (TABLE_SIZE AS 'TABLE', TABLE_PART_SIZE AS 'TABLE PARTITION', 4 INDEX_SIZE AS 'INDEX', INDEX_PART_SIZE AS 'INDEX PARTITION')) 5 WHERE OWNER IN ('SYS', 'YANGTK'); OWNER OBJECT_TYPE BYTES ------------------------------ --------------- ---------- YANGTK TABLE 2031616 YANGTK TABLE PARTITION YANGTK INDEX 65536 YANGTK INDEX PARTITION SYS TABLE 547356672 SYS TABLE PARTITION 9043968 SYS INDEX 165216256 SYS INDEX PARTITION 8388608 已选择8行。

    不光是语法上的简化,从执行计划和统计信息上看:

SQL> SET AUTOT TRACE SQL> SELECT OWNER, 'TABLE' OBJECT_TYPE, TABLE_SIZE BYTES FROM T_PIVOT WHERE OWNER IN ('SYS', 'YANGTK') 2 UNION ALL 3 SELECT OWNER, 'TABLE PARTITION', TABLE_PART_SIZE FROM T_PIVOT WHERE OWNER IN ('SYS', 'YANGTK') 4 UNION ALL 5 SELECT OWNER, 'INDEX', INDEX_SIZE FROM T_PIVOT WHERE OWNER IN ('SYS', 'YANGTK') 6 UNION ALL 7 SELECT OWNER, 'INDEX PARTITION', INDEX_PART_SIZE FROM T_PIVOT WHERE OWNER IN ('SYS', 'YANGTK'); 已选择8行。 执行计划 ---------------------------------------------------------- Plan hash value: 634273332 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 8 | 240 | 12 (75)| 00:00:01 | | 1 | UNION-ALL | | | | | | |* 2 | TABLE ACCESS FULL| T_PIVOT | 2 | 60 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| T_PIVOT | 2 | 60 | 3 (0)| 00:00:01 | |* 4 | TABLE ACCESS FULL| T_PIVOT | 2 | 60 | 3 (0)| 00:00:01 | |* 5 | TABLE ACCESS FULL| T_PIVOT | 2 | 60 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("OWNER"='SYS' OR "OWNER"='YANGTK') 3 - filter("OWNER"='SYS' OR "OWNER"='YANGTK') 4 - filter("OWNER"='SYS' OR "OWNER"='YANGTK') 5 - filter("OWNER"='SYS' OR "OWNER"='YANGTK') Note ----- - dynamic sampling used for this statement 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 13 consistent gets 0 physical reads 0 redo size 620 bytes sent via SQL*Net to client 377 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 8 rows processed SQL> SELECT OWNER, OBJECT_TYPE, BYTES FROM T_PIVOT 2 UNPIVOT INCLUDE NULLS (BYTES FOR OBJECT_TYPE IN 3 (TABLE_SIZE AS 'TABLE', TABLE_PART_SIZE AS 'TABLE PARTITION', 4 INDEX_SIZE AS 'INDEX', INDEX_PART_SIZE AS 'INDEX PARTITION')) 5 WHERE OWNER IN ('SYS', 'YANGTK'); 已选择8行。 执行计划 ---------------------------------------------------------- Plan hash value: 2063660069 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7 | 273 | 12 (0)| 00:00:01 | | 1 | VIEW | | 7 | 273 | 12 (0)| 00:00:01 | | 2 | UNPIVOT | | | | | | |* 3 | TABLE ACCESS FULL| T_PIVOT | 1 | 69 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("T_PIVOT"."OWNER"='SYS' OR "T_PIVOT"."OWNER"='YANGTK') Note ----- - dynamic sampling used for this statement 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 631 bytes sent via SQL*Net to client 377 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 8 rows processed

    如果不使用UNPIVOT,那么需要多个UNION ALL,多次全表扫描。而对于UNPIVOT操作,只需要一个全表扫描就可以了。
    对于UNPIVOT还可以不选择为NULL的结果,将上面的INCLUDE NULLS去掉就可以了:

SQL> SET AUTOT OFF SQL> SELECT OWNER, OBJECT_TYPE, BYTES FROM T_PIVOT 2 UNPIVOT (BYTES FOR OBJECT_TYPE IN 3 (TABLE_SIZE AS 'TABLE', TABLE_PART_SIZE AS 'TABLE PARTITION', 4 INDEX_SIZE AS 'INDEX', INDEX_PART_SIZE AS 'INDEX PARTITION')) 5 WHERE OWNER IN ('SYS', 'YANGTK'); OWNER OBJECT_TYPE BYTES ------------------------------ --------------- ---------- YANGTK TABLE 2031616 YANGTK INDEX 65536 SYS TABLE 547356672 SYS TABLE PARTITION 9043968 SYS INDEX 165216256 SYS INDEX PARTITION 8388608 已选择6行。

    如果不指定列对应的常量,那么会直接将列名作为分类的名称: 

0
相关文章