Thursday, January 25, 2007

REPOST: Pivot and Crosstab Queries

Here is another advanced concept that will come in useful when solving Oracle problems.

Imagine you're trying to create a result set where the rows need to be columns, or vice versa. In essence, you need to "pivot" rows into columns, or vice versa. That is a very common requirement, and this is where you need to look at a pivot (or crosstab) query to get the job done.

As always, when you want to understand something, you can start by Asking Tom.

A simple pivot query is accomplished by basically doing the following:
1. Add some kind of count or row number to your query, if necessary for the grouping
2. Then use your (revised) original query as a sub-query
3. Use "decode" to turn rows into columns (ie. a "sparse" matrix).
4. Use "max" to "squash" the multiple rows you moved to columns, into single rows. Don't forget to group by.
(Note: it gets more complicated if you don't know how many columns you'll need).

Here is another one of Ask Tom's examples. It clearly shows how you use decode to create a "sparse" matrix, and then use max to "squash" it down.

Let's look a simple example in slow motion.

Here's the data
CREATE TABLE CFL (season NUMBER(4), team VARCHAR2(16), points NUMBER(3));
INSERT INTO CFL (season, team, points) VALUES (2004, 'Argonauts', 21);
INSERT INTO CFL (season, team, points) VALUES (2004, 'Alouettes', 28);
INSERT INTO CFL (season, team, points) VALUES (2004, 'Tiger-Cats', 19);
INSERT INTO CFL (season, team, points) VALUES (2004, 'Renegades', 10);
INSERT INTO CFL (season, team, points) VALUES (2003, 'Argonauts', 18);
INSERT INTO CFL (season, team, points) VALUES (2003, 'Alouettes', 26);
INSERT INTO CFL (season, team, points) VALUES (2003, 'Tiger-Cats', 2);
INSERT INTO CFL (season, team, points) VALUES (2003, 'Renegades', 14);
INSERT INTO CFL (season, team, points) VALUES (2002, 'Argonauts', 16);
INSERT INTO CFL (season, team, points) VALUES (2002, 'Alouettes', 27);
INSERT INTO CFL (season, team, points) VALUES (2002, 'Tiger-Cats', 15);
INSERT INTO CFL (season, team, points) VALUES (2002, 'Renegades', 10);

What we want:
A table showing each of these 4 teams and their point tables for these 3 seasons.

So what is our pivot row/column? Season.

Step 1/2: We are using season, so we don't need to create our own grouping field, like count, rownum, or running total (sum) for example. That would be easy enough to do, but let's keep this simple.

Step 3: Use "decode" to turn the season row into a column. Take a look at our "sparse" matrix.

SELECT team,
DECODE (season, 2002, points, NULL) Yr2002,
DECODE (season, 2003, points, NULL) Yr2003,
DECODE (season, 2004, points, NULL) Yr2004
FROM (SELECT season, team, points FROM CFL);

TEAM                 YR2002     YR2003     YR2004
---------------- ---------- ---------- ----------
Argonauts 21
Alouettes 28
Tiger-Cats 19
Renegades 10
Argonauts 18
Alouettes 26
Tiger-Cats 2
Renegades 14
Argonauts 16
Alouettes 27
Tiger-Cats 15
Renegades 10


Step 4: Now let's use max to "squash" this into single rows. Don't forget GROUP BY.

SELECT team,
MAX (DECODE (season, 2002, points, NULL)) Yr2002,
MAX (DECODE (season, 2003, points, NULL)) Yr2003,
MAX (DECODE (season, 2004, points, NULL)) Yr2004
FROM (SELECT season, team, points FROM CFL)
GROUP BY team;

TEAM                 YR2002     YR2003     YR2004
---------------- ---------- ---------- ----------
Alouettes 27 26 28
Argonauts 16 18 21
Renegades 10 14 10
Tiger-Cats 15 2 19


Pretty cool, eh? Easy, too.

Notice that the key to this is DECODE. If DECODE is not already part of your toolbelt, I recommend studying up.

Ready for a tougher example? Let's look at another Ask Tom.

For further study of pivot queries and analytic functions in general, there is an awesome write-up in Chapter 12 of Tom Kyte's "Expert One-on-One Oracle." You'd think I'd get a kickback from Tom Kyte with all the promotion I'm doing, but the honest truth is that no one explains it as well as he.

So, do you understand pivot queries now? No problems?

If so, now you're ready for one of Ask Tom's more complex examples.

Pivot Tables

One final word: don't confuse pivot queries with pivot tables. Pivot tables are a different concept, and have different uses (most typically to fill in missing data). Until I blog about pivot tables, check out these two links:

Jonathan Gennick Updated
Laurent Schneider

Originally posted September 1, 2005

Comments:
Pivoting in SQL using the 10g Model :)

Test was done on Oracle Database 10g Express Edition Release 10.2.0.1.0

SQL> create table test(id varchar2(2), des varchar2(4), t number);

Table created

SQL> INSERT INTO test values(’A',’a1′,12);

1 row inserted

SQL> INSERT INTO test values(’A',’a2′,3);

1 row inserted

SQL> INSERT INTO test values(’A',’a3′,1);

1 row inserted

SQL> INSERT INTO test values(’B',’a1′,10);

1 row inserted

SQL> INSERT INTO test values(’B',’a2′,23);

1 row inserted

SQL> INSERT INTO test values(’C',’a3′,45);

1 row inserted

SQL> commit;

Commit complete

SQL> SELECT * FROM test;

ID DES T
– —- ———-
A a1 12
A a2 3
A a3 1
B a1 10
B a2 23
C a3 45

6 rows selected

SQL> select distinct i, A1, A2, A3
2 from test c
3 model
4 ignore nav
5 dimension by(c.id i,c.des d)
6 measures(c.t t, 0 A1, 0 A2, 0 A3)
7 rules(
8 A1[any,any] = t[cv(i),d = ‘a1′],
9 A2[any,any] = t[cv(i),d = ‘a2′],
10 A3[any,any] = t[cv(i),d = ‘a3′]
11 );

I A1 A2 A3
– ———- ———- ———-
C 0 0 45
B 10 23 0
A 12 3 1

SQL> select distinct d, A, B, C
2 from test c
3 model
4 ignore nav
5 dimension by(c.id i,c.des d)
6 measures(c.t t, 0 A, 0 B, 0 C)
7 rules(
8 A[any,any] = t[i = ‘A’, cv(d)],
9 B[any,any] = t[i = ‘B’, cv(d)],
10 C[any,any] = t[i = ‘C’, cv(d)]
11 );

D A B C
—- ———- ———- ———-
a1 12 10 0
a3 1 0 45
a2 3 23 0

SQL> explain plan set statement_id ‘menn’ for
2 select distinct d, A, B, C
3 from test c
4 model
5 ignore nav
6 dimension by( c.id i,c.des d)
7 measures(c.t t, 0 A, 0 B, 0 C)
8 rules(
9 A[any,any] = t[i = ‘A’, cv(d)],
10 B[any,any] = t[i = ‘B’, cv(d)],
11 C[any,any] = t[i = ‘C’, cv(d)]
12 );

Explained

SQL> select plan_table_output from table(dbms_xplan.display(’plan_table’,'menn’));

PLAN_TABLE_OUTPUT
——————————————————————————–
Plan hash value: 160770444
—————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
—————————————————————————-
| 0 | SELECT STATEMENT | | 6 | 120 | 3 (34)| 00:00:01 |
| 1 | HASH UNIQUE | | 6 | 120 | 3 (34)| 00:00:01 |
| 2 | SQL MODEL ORDERED | | 6 | 120 | | |
| 3 | TABLE ACCESS FULL| TEST | 6 | 120 | 2 (0)| 00:00:01 |
—————————————————————————-

Oracle® Database Data Warehousing Guide 10g Release 2 (10.2)
Chapter 22 SQL for Modeling
http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14223/sqlmodel.htm#sthref1855
 
Also for most people this comes out to be the simplest solution;

INSERT ALL
INTO sales_info VALUES (employee_id,week_id,sales_MON)
INTO sales_info VALUES (employee_id,week_id,sales_TUE)
INTO sales_info VALUES (employee_id,week_id,sales_WED)
INTO sales_info VALUES (employee_id,week_id,sales_THUR)
INTO sales_info VALUES (employee_id,week_id, sales_FRI)
SELECT EMPLOYEE_ID, week_id, sales_MON, sales_TUE,
sales_WED, sales_THUR, sales_FRI
FROM sales_source_data;

5 rows created.

Best regards.
 
simple yet profound - as DBA's we reallyc cannot get away from having to cascase rows to columns and vice versa for numerous Reports.
Using the dimension in the query probably needs to OLAP option.
 
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?