Pivot Tables - Universidade NOVA de Lisboa

  • Pivot Tables

    The Pivot relational operator (available in some SQL platforms/servers) allows us to write cross-tabulation queries from tuples in tabular layout. It takes data in separate rows, aggregates it and converts it into columns.

  • Pivot Tables – Motivation (1)

    Considering table customers as:

    Cust_id  Cust_name  State_code  Times_purchased
    1        John       CT          1
    2        Mary       NY          10
    3        Alfredo    NJ          2
    4        Ana        NY          4
    ...      ...        ...         ...

    If we want to know how many customers from each state bought something once, twice, thrice and so on, a regular SQL query to answer that would be:

    select state_code, times_purchased, count(*) cnt
    from customers
    group by state_code, times_purchased;

  • Pivot Tables – Motivation (2)

    The result of the query would be:

    State_code  Times_purchased  cnt
    CT          0                90
    CT          1                165
    CT          2                179
    ...         ...              ...
    NY          1                33048

    This is the information we need, but it is a little hard to read. A crosstab with the times purchased arranged vertically and the states horizontally would be preferable:

    Times_purchased  CT   NY     NJ  ...
    0                90   0      35  ...
    1                165  33048  20  ...
    2                179  219    37  ...
    3                ...

  • Pivot Tables: another example

    The following tuples:

    order_id  customer_ref  product_id
    50001     SMITH         10
    50002     SMITH         20
    50003     ANDERSON      30
    50004     ANDERSON      40
    50005     JONES         10
    50006     JONES         20
    50007     SMITH         20
    50008     SMITH         10
    50009     SMITH         20

    Can be shown as:

    customer_ref  10  20  30
    ANDERSON      0   0   1
    JONES         1   1   0
    SMITH         2   3   0
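
    The crosstab above can also be reproduced on engines without a PIVOT operator. The sketch below is only an illustration, not part of the original slides: it assumes Python's standard sqlite3 module and emulates the pivot with conditional aggregation (one CASE expression per product_id). The function name crosstab_orders and the column aliases p10, p20, p30 are made up for the example.

```python
import sqlite3

# Rows copied from the orders table shown above.
ORDERS = [
    (50001, "SMITH", 10), (50002, "SMITH", 20), (50003, "ANDERSON", 30),
    (50004, "ANDERSON", 40), (50005, "JONES", 10), (50006, "JONES", 20),
    (50007, "SMITH", 20), (50008, "SMITH", 10), (50009, "SMITH", 20),
]


def crosstab_orders():
    """Count orders per customer, one column per product_id in (10, 20, 30)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE orders (order_id INTEGER, customer_ref TEXT, product_id INTEGER)"
    )
    con.executemany("INSERT INTO orders VALUES (?, ?, ?)", ORDERS)
    # SQLite has no PIVOT operator, so each pivoted column is a conditional count.
    rows = con.execute("""
        SELECT customer_ref,
               SUM(CASE WHEN product_id = 10 THEN 1 ELSE 0 END) AS p10,
               SUM(CASE WHEN product_id = 20 THEN 1 ELSE 0 END) AS p20,
               SUM(CASE WHEN product_id = 30 THEN 1 ELSE 0 END) AS p30
        FROM orders
        GROUP BY customer_ref
        ORDER BY customer_ref
    """).fetchall()
    con.close()
    return rows


for row in crosstab_orders():
    print(row)
```

    As in the slide, product 40 does not appear because it is not among the listed pivot values.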

  • PIVOT clause – syntax (1)

    SELECT * FROM
    (
      SELECT column1, …, columnj
      FROM tables
      WHERE conditions
    )
    PIVOT
    (
      aggregate_function(columnj)
      FOR columnj
      IN ( expr1, expr2, ... expr_n ) | subquery
    )
    ORDER BY expression [ ASC | DESC ];

  • PIVOT clause – syntax (2)

    Where:

    aggregate_function can be a function such as SUM, COUNT, MIN, MAX or AVG.

    IN ( expr1, expr2, ... expr_n ) is a list of values for columnj to pivot into headings in the cross-tabulation query. Each distinct value will be shown as a separate column.

    subquery can be used instead of a list of values.

  • PIVOT clause – Application (1)

    select * from
    (
      select times_purchased times, state_code
      from customers t
    )
    pivot
    (
      count(state_code)
      for state_code in ('NY','CT','NJ','FL','MO')
    )
    order by times

    times  NY     CT   NJ  FL  MO
    0      16601  90   35  0   0
    1      33048  165  20  0   0
    2      33151  179  37  0   0
    3      32978  173  0   0   0
    4      33109  173  0   1   0

  • Searching with PIVOT clause (1)

    EMP table:

    EMPNO  ENAME  JOB        MGR   HIREDATE   SAL   COMM  DEPTNO
    7839   KING   PRESIDENT        17-NOV-81  5000        10
    7698   BLAKE  MANAGER    7839  01-MAY-81  2850        30
    7782   CLARK  MANAGER    7839  09-JUN-81  2450        10
    7566   JONES  MANAGER    7839  02-APR-81  2975        20
    ...    ...    ...        ...   ...        ...   ...   ...

    Question: For each job, display the salary totals in a separate column for each department.

  • Searching with PIVOT clause (2)

    WITH pivot_data AS (SELECT deptno, job, sal from EMP)
    select * from pivot_data
    PIVOT ( SUM(sal) for deptno in (10, 20, 30, 40) );

    JOB        10    20      30    40
    CLERK      1430  2090    1045
    SALESMAN                 6160
    PRESIDENT  5500
    MANAGER    2695  3272.5  3135
    ANALYST          6600

    The list of values for deptno was hard-coded in this example (10, 20, 30, 40).

  • Searching with PIVOT clause (2)

    Alternatively, an inline view may be used to obtain the same result:

    select * from (SELECT deptno, job, sal from EMP)
    PIVOT ( SUM(sal) for deptno in (10, 20, 30, 40));

    JOB        10    20      30    40
    CLERK      1430  2090    1045
    SALESMAN                 6160
    PRESIDENT  5500
    MANAGER    2695  3272.5  3135
    ANALYST          6600

  • Searching with PIVOT clause (3)

    The grouping is affected when a pivot query is run over a larger set of columns, because every column that is neither aggregated nor pivoted becomes an implicit grouping column. For example:

    SELECT * from EMP
    PIVOT ( SUM(sal) for deptno in (10, 20, 30, 40));

    Here, deptno is still the pivot column, but the remaining columns include a superkey of EMP, so each employee forms its own group and the pivot is effectively useless (results in the next slide).

  • Searching with PIVOT clause (4)

    EMPNO  ENAME   JOB        MGR   HIREDATE  COMM  10    20      30    40
    7654   MARTIN  SALESMAN   7698  28/09/81  1400                1375
    7698   BLAKE   MANAGER    7839  01/05/81                      3135
    7934   MILLER  CLERK      7782  23/01/82        1430
    7521   WARD    SALESMAN   7782  22/02/81  500                 1375
    7566   JONES   MANAGER    7698  02/04/81              3272.5
    7844   TURNER  SALESMAN   7839  08/09/81  0                   1650
    7900   JAMES   CLERK      7698  03/12/81                      1045
    7839   KING    PRESIDENT        19/04/87        5500
    7876   ADAMS   CLERK      7788  23/05/87              1210
    7902   FORD    ANALYST    7566  03/12/81              3300
    ...    ...     ...        ...   ...       ...   ...   ...     ...   ...
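
    The effect of the implicit grouping columns can be checked with a small experiment. The sketch below is an illustration only: it assumes Python's sqlite3 module, emulates PIVOT with SUM(CASE ...) because SQLite has no PIVOT operator, and uses a six-row sample of EMP taken from the listing above (department numbers inferred from the column that holds each salary). The helper name pivot_salaries is hypothetical.

```python
import sqlite3

# Six-row sample of EMP (empno, ename, job, deptno, sal); not the full table.
EMP = [
    (7934, "MILLER", "CLERK", 10, 1430),
    (7876, "ADAMS", "CLERK", 20, 1210),
    (7900, "JAMES", "CLERK", 30, 1045),
    (7839, "KING", "PRESIDENT", 10, 5500),
    (7654, "MARTIN", "SALESMAN", 30, 1375),
    (7844, "TURNER", "SALESMAN", 30, 1650),
]


def pivot_salaries(group_cols):
    """Emulate PIVOT(SUM(sal) FOR deptno IN (10, 20, 30)) grouped by group_cols."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE emp (empno INTEGER, ename TEXT, job TEXT, deptno INTEGER, sal INTEGER)"
    )
    con.executemany("INSERT INTO emp VALUES (?, ?, ?, ?, ?)", EMP)
    cols = ", ".join(group_cols)  # trusted, hard-coded column names only
    rows = con.execute(f"""
        SELECT {cols},
               SUM(CASE WHEN deptno = 10 THEN sal END) AS d10,
               SUM(CASE WHEN deptno = 20 THEN sal END) AS d20,
               SUM(CASE WHEN deptno = 30 THEN sal END) AS d30
        FROM emp
        GROUP BY {cols}
        ORDER BY {cols}
    """).fetchall()
    con.close()
    return rows


by_job = pivot_salaries(["job"])                          # one row per job
by_employee = pivot_salaries(["empno", "ename", "job"])   # one row per employee
print(len(by_job), len(by_employee))
```

    Grouping by job alone collapses the sample into one row per job, while adding empno (a key of EMP) yields one row per employee with a single non-null department column, which mirrors the table above.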

  • Searching with PIVOT clause (5)

    Question: For ANALYST, CLERK and SALESMAN, display the salary totals in a separate column for each department.

    WITH pivot_data AS (SELECT deptno, job, sal from EMP)
    select * from pivot_data
    PIVOT ( SUM(sal) for deptno in (10, 20, 30, 40))
    where job in ('ANALYST', 'CLERK', 'SALESMAN');

    JOB       10    20    30    40
    CLERK     1430  2090  1045
    SALESMAN              6160
    ANALYST         6600

  • Searching with PIVOT clause (6)

    Aliases can be used:

    WITH pivot_data AS (SELECT deptno, job, sal from EMP)
    select * from pivot_data
    PIVOT ( SUM(sal) as salaries
            for deptno in (10 as Dep10, 20 as Dep20, 30 as Dep30, 40 as Dep40))
    where job in ('ANALYST', 'CLERK', 'SALESMAN');

    JOB       Dep10_salaries  Dep20_salaries  Dep30_salaries  Dep40_salaries
    CLERK     1430            2090            1045
    SALESMAN                                  6160
    ANALYST                   6600

  • Searching with PIVOT clause (7)

    Pivoting with multiple aggregates:

    WITH pivot_data AS (SELECT deptno, job, sal from EMP)
    select * from pivot_data
    PIVOT ( SUM(sal) as sum, count(sal) as cnt
            for deptno in (10 as D10, 20 as D20, 30 as D30));

    JOB        D10_sum  D10_cnt  ...  D30_sum  D30_cnt
    CLERK      1430     1        ...  1045     1
    SALESMAN            0        ...  6160     4
    PRESIDENT  5500     1        ...           0
    MANAGER    2695     1        ...  3135     1
    ANALYST             0        ...           0
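
    In the table above, an empty cell of a sum column is a NULL, while the matching count column shows 0. The sketch below illustrates that difference; it is an assumption-laden example rather than the slides' method: it uses Python's sqlite3 module, emulates PIVOT with CASE expressions, and runs on a hypothetical five-row sample of EMP, so the totals differ from the slide.

```python
import sqlite3

# Hypothetical sample of EMP as (job, deptno, sal); not the full table.
EMP = [
    ("CLERK", 10, 1430), ("CLERK", 30, 1045),
    ("PRESIDENT", 10, 5500),
    ("SALESMAN", 30, 1375), ("SALESMAN", 30, 1650),
]


def sum_and_count_by_dept():
    """Two aggregates per pivoted value: SUM(sal) and COUNT(sal)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE emp (job TEXT, deptno INTEGER, sal INTEGER)")
    con.executemany("INSERT INTO emp VALUES (?, ?, ?)", EMP)
    # A CASE without ELSE yields NULL for other departments, so SUM over an
    # empty group is NULL whereas COUNT over the same group is 0.
    rows = con.execute("""
        SELECT job,
               SUM(CASE WHEN deptno = 10 THEN sal END)   AS d10_sum,
               COUNT(CASE WHEN deptno = 10 THEN sal END) AS d10_cnt,
               SUM(CASE WHEN deptno = 30 THEN sal END)   AS d30_sum,
               COUNT(CASE WHEN deptno = 30 THEN sal END) AS d30_cnt
        FROM emp
        GROUP BY job
        ORDER BY job
    """).fetchall()
    con.close()
    return rows


for row in sum_and_count_by_dept():
    print(row)
```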

  • Searching with PIVOT clause (8)

    Or, pivoting on more than one column:

    WITH pivot_data AS (SELECT deptno, job, sal from EMP)
    select * from pivot_data
    PIVOT ( SUM(sal) as sum, count(sal) as cnt
            for (deptno, job) in ((30, 'SALESMAN') as d30_sls,
                                  (30, 'MANAGER') as d30_mgr,
                                  (30, 'CLERK') as d30_clk));

    D30_SLS_SUM  D30_SLS_CNT  D30_MGR_SUM  D30_MGR_CNT  ...
    6160         4            3135         1            ...

  • PIVOTing an Unknown Domain of Values (1)

    By default, the pivot syntax does not support a dynamic list of values in the pivot_in_clause. Using a subquery instead of a hard-coded list of values in the pivot_in_clause generates an error:

    SELECT * FROM emp
    PIVOT (SUM(sal) AS salaries
           FOR deptno IN (SELECT deptno FROM dept));

  • PIVOTing an Unknown Domain of Values (2)

    A possible workaround to solve this problem (with Oracle):

    select * from (SELECT deptno, job, sal from EMP)
    PIVOT XML ( SUM(sal) for deptno in (any));

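    JOB      DEPTNO_XML
    ANALYST  <PivotSet><item><column name = "DEPTNO">20</column><column name = "SUM(SAL)">6600</column></item></PivotSet>
    MANAGER  ....
    ...      ...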

    It implies extra work to read the information from the XML format!

  • PIVOTing an Unknown Domain of Values (3)

    Another workaround to solve the problem (with Oracle SQL*Plus):

    column namelist new_value nlist noprint;

    /* first obtain a string with the list of distinct values of deptno */
    select wm_concat(''''||deptno||'''') namelist
    from (select distinct deptno from emp)
    connect by nocycle deptno = prior deptno
    group by level;

    WITH pivot_data AS (SELECT deptno, job, sal from EMP)
    select * from pivot_data
    PIVOT ( SUM(sal) for deptno in (&nlist));

    &nlist is a variable containing the string "'10','20','30'" (results in the next slide).

  • PIVOTing an Unknown Domain of Values (4)

    JOB        10    20      30
    CLERK      1430  2090    1045
    SALESMAN                 6160
    PRESIDENT  5500
    MANAGER    2695  3272.5  3135
    ANALYST          6600
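
    The same two-step idea (first read the distinct values, then generate the pivot query from them) can be sketched outside SQL*Plus. The example below is a hedged illustration, not the slides' technique: it assumes Python's sqlite3 module, emulates PIVOT with SUM(CASE ...), and runs on a hypothetical six-row sample of EMP, so the totals differ from the table above. The function name dynamic_pivot is made up.

```python
import sqlite3

# Hypothetical sample of EMP as (job, deptno, sal); not the full table.
EMP = [
    ("CLERK", 10, 1430), ("CLERK", 20, 1210), ("CLERK", 30, 1045),
    ("PRESIDENT", 10, 5500),
    ("SALESMAN", 30, 1375), ("SALESMAN", 30, 1650),
]


def dynamic_pivot():
    """Build the pivot column list from the data instead of hard-coding it."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE emp (job TEXT, deptno INTEGER, sal INTEGER)")
    con.executemany("INSERT INTO emp VALUES (?, ?, ?)", EMP)
    # Step 1: discover the domain of the pivot column at run time.
    depts = [r[0] for r in con.execute("SELECT DISTINCT deptno FROM emp ORDER BY deptno")]
    # Step 2: generate one aggregated column per value. int() ensures that
    # only integers are spliced into the generated SQL text.
    cols = ", ".join(
        f'SUM(CASE WHEN deptno = {int(d)} THEN sal END) AS "{int(d)}"' for d in depts
    )
    cur = con.execute(f"SELECT job AS job, {cols} FROM emp GROUP BY job ORDER BY job")
    headers = [c[0] for c in cur.description]
    rows = cur.fetchall()
    con.close()
    return headers, rows


headers, rows = dynamic_pivot()
print(headers)
for row in rows:
    print(row)
```

    Generating SQL text from data is only reasonable here because the spliced values are forced to integers; for free-text pivot values, the values would need to be validated or bound as parameters instead.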

  • UnPIVOT – turning pivot tables into rows (1)

    SELECT ...
    FROM ...
    UNPIVOT [INCLUDE|EXCLUDE NULLS]
    (
      unpivot_clause
      unpivot_for_clause
      unpivot_in_clause
    )
    WHERE ...

    unpivot_clause: specifies a name for the column that will hold the unpivoted measure values.

    unpivot_for_clause: specifies a name for the column that will hold the names of the unpivoted columns.

    unpivot_in_clause: contains the list of pivoted columns (not values) to be unpivoted.

  • UnPIVOT – turning pivot tables into rows (2)

    CREATE VIEW pivoted_data as
    SELECT * FROM (SELECT deptno, job, sal FROM emp)
    PIVOT (SUM(sal) FOR deptno IN (10 AS d10_sal, 20 AS d20_sal, 30 AS d30_sal, 40 AS d40_sal));

    select * from pivoted_data;

    JOB        D10_sal  D20_sal  D30_sal  D40_sal
    CLERK      1430     2090     1045
    SALESMAN                     6160
    PRESIDENT  5500
    MANAGER    2695     3272.5   3135
    ANALYST             6600

  • UnPIVOT – turning pivot tables into rows (3)

    SELECT * FROM pivoted_data
    UNPIVOT ( Deptsal FOR saldesc IN (d10_sal, d20_sal, d30_sal, d40_sal) );

    JOB        SALDESC  DEPTSAL
    CLERK      D10_SAL  1430
    CLERK      D20_SAL  2090
    CLERK      D30_SAL  1045
    SALESMAN   D30_SAL  6160
    PRESIDENT  D10_SAL  5500
    MANAGER    D10_SAL  2695
    MANAGER    D20_SAL  3272.5
    MANAGER    D30_SAL  3135
    ANALYST    D20_SAL  6600
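
    The unpivot result above omits the empty cells, which matches the EXCLUDE NULLS behaviour. On an engine without UNPIVOT, the same rows can be produced with one SELECT per pivoted column combined by UNION ALL. The sketch below is an illustration under assumptions: it uses Python's sqlite3 module, stores the pivoted rows from the previous slide in a plain table named pivoted_data, and the function name unpivot_salaries is hypothetical.

```python
import sqlite3

# Pivoted rows as shown in the pivoted_data listing (None marks an empty cell).
PIVOTED = [
    ("CLERK", 1430, 2090, 1045, None),
    ("SALESMAN", None, None, 6160, None),
    ("PRESIDENT", 5500, None, None, None),
    ("MANAGER", 2695, 3272.5, 3135, None),
    ("ANALYST", None, 6600, None, None),
]
SAL_COLUMNS = ["d10_sal", "d20_sal", "d30_sal", "d40_sal"]


def unpivot_salaries():
    """Turn the four salary columns back into (job, saldesc, deptsal) rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE pivoted_data (job TEXT, d10_sal NUMERIC, d20_sal NUMERIC,"
        " d30_sal NUMERIC, d40_sal NUMERIC)"
    )
    con.executemany("INSERT INTO pivoted_data VALUES (?, ?, ?, ?, ?)", PIVOTED)
    # One SELECT per pivoted column; the IS NOT NULL filter drops empty cells.
    parts = [
        f"SELECT job, '{col.upper()}' AS saldesc, {col} AS deptsal "
        f"FROM pivoted_data WHERE {col} IS NOT NULL"
        for col in SAL_COLUMNS
    ]
    sql = " UNION ALL ".join(parts) + " ORDER BY job, saldesc"
    rows = con.execute(sql).fetchall()
    con.close()
    return rows


for row in unpivot_salaries():
    print(row)
```

    Dropping the IS NOT NULL filter would keep the empty cells as rows with a NULL value, which corresponds to INCLUDE NULLS.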

  • UnPIVOT – other uses (1)

    Since columns in the unpivot_in_clause must all be of the same datatype, this would cause an error:

    SELECT empno, job, unpivot_col_name, unpivot_col_value
    FROM emp
    UNPIVOT (unpivot_col_value FOR unpivot_col_name IN (ename, deptno, hiredate));

  • UnPIVOT – other uses (2)

    A workaround (in Oracle) consists of datatype conversion:

    WITH emp_data AS (
      SELECT empno, job, ename,
             TO_CHAR(deptno) as deptno,
             TO_CHAR(hiredate) as hiredate
      FROM emp)
    SELECT empno, job, unpivot_col_name, unpivot_col_value
    FROM emp_data
    UNPIVOT (unpivot_col_value FOR unpivot_col_name IN (ename, deptno, hiredate));

    (results in the next page)

  • UnPIVOT – other uses (3)

    EMPNO  JOB       UNPIVOT_COL_NAME  UNPIVOT_COL_VALUE
    7369   CLERK     ENAME             SMITH
    7369   CLERK     DEPTNO            20
    7369   CLERK     HIREDATE          17/12/1980
    7499   SALESMAN  ENAME             ALLEN
    7499   SALESMAN  DEPTNO            30
    7499   SALESMAN  HIREDATE          20/02/1981
    ...    ...       ...               ...
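
    The conversion step can be tried on a small scale. The sketch below is illustrative only and rests on several assumptions: it uses Python's sqlite3 module, emulates UNPIVOT with UNION ALL, and stores hiredate as text because SQLite has no DATE type. SQLite would not reject mixed types the way Oracle does, so the CAST merely mirrors the role of TO_CHAR. The function name unpivot_attributes and the helper column ord are hypothetical.

```python
import sqlite3

# The two employees shown in the result above (hiredate kept as text).
EMP = [
    (7369, "CLERK", "SMITH", 20, "17/12/1980"),
    (7499, "SALESMAN", "ALLEN", 30, "20/02/1981"),
]


def unpivot_attributes():
    """Unpivot ename, deptno and hiredate into name/value rows of one type."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE emp (empno INTEGER, job TEXT, ename TEXT, deptno INTEGER, hiredate TEXT)"
    )
    con.executemany("INSERT INTO emp VALUES (?, ?, ?, ?, ?)", EMP)
    # deptno is cast to text so every value column has the same datatype;
    # ord only preserves the ename, deptno, hiredate order within an employee.
    rows = con.execute("""
        SELECT empno, job, col_name, col_value
        FROM (
            SELECT empno, job, 1 AS ord, 'ENAME' AS col_name, ename AS col_value FROM emp
            UNION ALL
            SELECT empno, job, 2, 'DEPTNO', CAST(deptno AS TEXT) FROM emp
            UNION ALL
            SELECT empno, job, 3, 'HIREDATE', hiredate FROM emp
        )
        ORDER BY empno, ord
    """).fetchall()
    con.close()
    return rows


for row in unpivot_attributes():
    print(row)
```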