Thursday, September 01, 2005

Pivot and Crosstab Queries

Here is another advanced concept that will come in useful when solving Oracle problems.

Imagine you're trying to create a result set where the rows need to be columns, or vice versa. In essence, you need to "pivot" rows into columns, or vice versa. That is a very common requirement, and this is where you need to look at a pivot (or crosstab) query to get the job done.

As always, when you want to understand something, you can start by Asking Tom:
http://asktom.oracle.com/pls/ask/f?p=4950:8:16663421538065257584::NO::
F4950_P8_DISPLAYID,F4950_P8_CRITERIA:766825833740


A simple pivot query is accomplished by basically doing the following:
1. Add some kind of count or row number to your query, if necessary for the grouping
2. Then use your (revised) original query as a sub-query
3. Use "decode" to turn rows into columns (ie. a "sparse" matrix).
4. Use "max" to "squash" the multiple rows you moved to columns, into single rows. Don't forget to group by.
(Note: it gets more complicated if you don't know how many columns you'll need).

Here is another one of Ask Tom's example. It clearly shows how you use decode to create a "sparse" matrix, and then use max to "squash" it down.
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:124812348063

Let's look a simple example in slow motion.

Here's the data
CREATE TABLE CFL (season NUMBER(4), team VARCHAR2(16), points NUMBER(3));
INSERT INTO CFL (season, team, points) VALUES (2004, 'Argonauts', 21);
INSERT INTO CFL (season, team, points) VALUES (2004, 'Alouettes', 28);
INSERT INTO CFL (season, team, points) VALUES (2004, 'Tiger-Cats', 19);
INSERT INTO CFL (season, team, points) VALUES (2004, 'Renegades', 10);
INSERT INTO CFL (season, team, points) VALUES (2003, 'Argonauts', 18);
INSERT INTO CFL (season, team, points) VALUES (2003, 'Alouettes', 26);
INSERT INTO CFL (season, team, points) VALUES (2003, 'Tiger-Cats', 2);
INSERT INTO CFL (season, team, points) VALUES (2003, 'Renegades', 14);
INSERT INTO CFL (season, team, points) VALUES (2002, 'Argonauts', 16);
INSERT INTO CFL (season, team, points) VALUES (2002, 'Alouettes', 27);
INSERT INTO CFL (season, team, points) VALUES (2002, 'Tiger-Cats', 15);
INSERT INTO CFL (season, team, points) VALUES (2002, 'Renegades', 10);

What we want:
A table showing each of these 4 teams and their point tables for these 3 seasons.

So what is our pivot row/column? Season.

Step 1/2: We are using season, so we don't need to create our own grouping field, like count, rownum, or running total (sum) for example. That would be easy enough to do, but let's keep this simple.

Step 3: Use "decode" to turn the season row into a column. Take a look at our "sparse" matrix.

SELECT team,
DECODE (season, 2002, points, NULL) Yr2002,
DECODE (season, 2003, points, NULL) Yr2003,
DECODE (season, 2004, points, NULL) Yr2004
FROM (SELECT season, team, points FROM CFL);

TEAM                 YR2002     YR2003     YR2004
---------------- ---------- ---------- ----------
Argonauts 21
Alouettes 28
Tiger-Cats 19
Renegades 10
Argonauts 18
Alouettes 26
Tiger-Cats 2
Renegades 14
Argonauts 16
Alouettes 27
Tiger-Cats 15
Renegades 10


Step 4: Now let's use max to "squash" this into single rows. Don't forget GROUP BY.

SELECT team,
MAX (DECODE (season, 2002, points, NULL)) Yr2002,
MAX (DECODE (season, 2003, points, NULL)) Yr2003,
MAX (DECODE (season, 2004, points, NULL)) Yr2004
FROM (SELECT season, team, points FROM CFL)
GROUP BY team;

TEAM                 YR2002     YR2003     YR2004
---------------- ---------- ---------- ----------
Alouettes 27 26 28
Argonauts 16 18 21
Renegades 10 14 10
Tiger-Cats 15 2 19


Pretty cool, eh? Easy, too.

Notice that the key to this is DECODE. If DECODE is not already part of your toolbelt, I recommend studying up.

http://thinkoracle.blogspot.com/2005/06/decode.html

Ready for a tougher example? Let's look at another Ask Tom:
http://asktom.oracle.com/pls/ask/f?p=4950:8:16663421538065257584::NO::
F4950_P8_DISPLAYID,F4950_P8_CRITERIA:7086279412131


For further study of pivot queries and analytic functions in general, there is an awesome write-up in Chapter 12 of Tom Kyte's "Expert One-on-One Oracle." You'd think I'd get a kickback from Tom Kyte with all the promotion I'm doing, but the honest truth is that no one explains it as well as he.

So, do you understand pivot queries now? No problems?

If so, now you're ready for one of Ask Tom's more complex examples:
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:6923393629227

Pivot Tables

One final word: don't confuse pivot queries with pivot tables. Pivot tables are a different concept, and have different uses (most typically to fill in missing data). Until I blog about pivot tables, check out these two links:

Jonathan Gennick
http://www.oracle.com/technology/oramag/oracle/02-sep/o52sql.html

Laurent Schneider
http://laurentschneider.blogspot.com/2005/08/pivot-table.html

Comments:
Thanks, Robert! This article helped me today.
 
Thank a lot , the information really helpful.
 
Thanks a lot...i was breaking my head since long to solve this problem :)
 
Thanks so much!!! It was really helpful
 
Thank you so much!!! It really saved my day.
 
thanks a lot robert...u saved so much of our time and effort..I really cannt tell how much u helped
 
You sample and links are much appreciated. I needed something like this only I'm using the case statement instead of decode, both work great.
 
Hi robert,

I was using the same trick with case statement but it was returning wrong numbers. When I replaced case with decode I am getting right answers. I am working on oracle 10g R2.

e.g
with yr
as
( select max(yr) cy
from sales_dtl_sum )
select
s.cst_nbr,
max( case when s.yr = yr.cy then s.dollars end) cy_sls_dollars ,
max( case when s.yr = yr.cy-1 then s.dollars end) py_sls_dollars,
from sales_dtl_sum s,
yr
where
s.yr in (yr.cy, yr.cy-1)
group by
s.cst_nbr

I am getting wrong values for the current year.( yr is not null in sales_dtl_sum table). When I replace case with the syntax of decode you have, it is returning right results.

I don't know, whether I am doing something wrong or it is not the right way to do squash or is it a bug.

Your advise on this is highly appreciated.

Thank you very much
CT
 
This a very good article...god job Robert
 
This article saved me!!! Thanks so much.
 
Thanks, thanks, thanks.
 
Thank you. This helped a lot.
 
Thanks man, helped a lot.
 
Thanks Tom. This is great stuff. Helped me out of a jam - Alex Kaniaru
 
My name is Robert. :)

But you're all welcome.
 
This comment has been removed by a blog administrator.
 
Hi Robert,

I have bit different requirement which requires same pivot concept. I have two tables and I need to select the columns from table2 which are stored as record in other table.

Table1:
======
COL_SEQ COL_NAME TABLE_ID
------- ------- -------
1 TASK 101
2 JOB 101
3 LOCATION 101
4 DESIGNATION 101

1 NAME 102
2 LNAME 102
3 FNAME 102
4 ADDRESS 102

Table2
======

TASK JOB LOCATION DESIGNATION
----- --- -------- -----------
1 10 ABC A2342
2 20 ABC1 B2342
3 30 ABC2 C2342
4 40 ABC3 D2342

Table3
======
NAME LNAME FNAME ADDRESS
---- ----- ----- -------
N1 L1 F1 ADD1
N2 L2 F2 ADD2
N3 L2 F3 ADD3
N4 L2 F4 ADD4

Now, I need to create a query dynamically based on table id given. So the program will read the values in column COL_NAME of table1 and generate the query for given TABLE_ID.

So, for TABLE_ID 101 the query should be generated as,

select TASK, JOB, LOCATION, DESIGNATION
from table2

How can I do using this pivot concept? Or is there any other way to achieve this?
 
The only problem with this is you have to know your columns in advance. Now I know Tom would say that SQL requires this, that it's a limitation of SQL... Yeah right. Since when did SQL ever stop Oracle from adding new syntax that would further lock in its grateful customers. :)
 
Great question! Check my archives, I've got articles on how to address situations where you don't know your columns in advance.
 
I have not seen a single example of 10G using multiple tables with known amount of columns but the values need to be evaluated.

(i.e. we have an audit table that has the following: SET DEFINE OFF;
Insert into ISSUE
(ISSUE_NO, UPDATE_DT, CHANGED_BY, OLD_VALUE, NEW_VALUE, COLUMN_NAME)
Values
(41, TO_DATE('10/06/1999 10:01:28', 'MM/DD/YYYY HH24:MI:SS'), 'XJ633GH', 'A', 'R', 'ISSUE_STATUS_TYPE');
Insert into ISSUE
(ISSUE_NO, UPDATE_DT, CHANGED_BY, NEW_VALUE, COLUMN_NAME)
Values
(41, TO_DATE('10/06/1999 10:01:28', 'MM/DD/YYYY HH24:MI:SS'), 'XJ633GH', 'I FIXED THE PROBLEM', 'RESOLUTION_DESC');
Insert into ISSUE
(ISSUE_NO, UPDATE_DT, CHANGED_BY, OLD_VALUE, NEW_VALUE, COLUMN_NAME)
Values
(41, TO_DATE('10/08/1999 10:40:53', 'MM/DD/YYYY HH24:MI:SS'), 'RECONRW', 'TEST', 'TEST Problem', 'ISSUE_DESC');
Insert into ISSUE
(ISSUE_NO, UPDATE_DT, CHANGED_BY, OLD_VALUE, NEW_VALUE, COLUMN_NAME)
Values
(41, TO_DATE('11/18/1999 11:01:07', 'MM/DD/YYYY HH24:MI:SS'), 'RECON', 'TEST Problem', 'TEST PROBLEM', 'ISSUE_DESC');
Insert into ISSUE
(ISSUE_NO, UPDATE_DT, CHANGED_BY, OLD_VALUE, NEW_VALUE, COLUMN_NAME)
Values
(41, TO_DATE('11/17/1999 11:31:57', 'MM/DD/YYYY HH24:MI:SS'), 'XS956PD', 'R', 'C', 'ISSUE_STATUS_TYPE');
COMMIT;


This is a terrible paradigm that we are stuck with from a legacy system that we must query. What has to happen is that these values must appear in our table but a new row would have to be created for all changes that occur at the same time (i.e. update_dt is the same)

So in the audit example we have the first two rows that would share a change in one row. The next three would be one change per row.

The base table is this (which all rows would get inserted into).

SET DEFINE OFF;
Insert into ISSUE
(ISSUE_NO, ACCOUNT_NUM, ISSUE_TYPE, ISSUE_STATUS_TYPE, RPT_PERIOD_DT, CUSIP, ISSUE_ORIGIN_TYPE, RESPONSIBLE_GROUP, CLIENT_NAME, ISSUE_DESC, ENTRY_DT, IM_REF_NO, MEL_REF_NO, RESOLVED_BY, RESOLUTION_DT, RESOLUTION_DESC, CLOSED_DT, ISSUE_OWNER, ENTERED_BY, AFFILIATE_CODE, ISSUE_SOURCE, PRIORITY)
Values
(41, '0801382', 'PR', 'C', TO_DATE('10/01/1999 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'TEST', 'F', 'PRC', 'TEST', 'TEST PROBLEM', TO_DATE('10/06/1999 09:39:57', 'MM/DD/YYYY HH24:MI:SS'), 'TEST', 'TEST', 'XJ633GH', TO_DATE('10/06/1999 10:01:28', 'MM/DD/YYYY HH24:MI:SS'), 'I FIXED THE PROBLEM', TO_DATE('11/17/1999 11:31:57', 'MM/DD/YYYY HH24:MI:SS'), 'SDT', 'SYS', 'TRI', 'I', 'M');
COMMIT;


In the end we would have the values replaced if the "column_name" is the same of the column being selected and the issue_no is the same and the new_value is that of the value in the column. (column name)

Extremely tricky. For me anyway. I cannot seem to produce a way to do this no matter what I try there is a caveat that hinders my goal.

Please give me your thoughts.

Any help is greatly appreciated!

Thanks in advance,

B
 
great job!! thank's a lot..
 
create table jctdp (
site number,
subject number,
fieldname varChar2 ( 20),
fieldval varchar2(10)
);
/******insert values *****/
/** VIS1N **/
insert into jctdp values (10085,1, 'VSN1N','1');
insert into jctdp values (10085,1, 'VSN1N','2');
insert into jctdp values (10085,1, 'VSN1N','3');
insert into jctdp values (10085,1, 'VSN1N','4');
insert into jctdp values (10085,1, 'VSN1N','5');

/** STNSBP1N **/
insert into jctdp values (10085,1, 'STNSBP1N','140');
insert into jctdp values (10085,1, 'STNSBP1N','150');
insert into jctdp values (10085,1, 'STNSBP1N','160');
insert into jctdp values (10085,1, 'STNSBP1N','170');
insert into jctdp values (10085,1, 'STNSBP1N','175');
/**STNSBP2N **/
insert into jctdp values (10085,1, 'STNSBP2N','145');
insert into jctdp values (10085,1, 'STNSBP2N','155');
insert into jctdp values (10085,1, 'STNSBP2N','165');
insert into jctdp values (10085,1, 'STNSBP2N','175');
insert into jctdp values (10085,1, 'STNSBP2N','170');
/**STNSBP3N**/
insert into jctdp values (10085,1, 'STNSBP3N','165');
insert into jctdp values (10085,1, 'STNSBP3N','175');
insert into jctdp values (10085,1, 'STNSBP3N','160');
insert into jctdp values (10085,1, 'STNSBP3N','150');
insert into jctdp values (10085,1, 'STNSBP3N','145');

/***SUBJECT 2***/
insert into jctdp values (10085,2, 'VSN1N','1');
insert into jctdp values (10085,2, 'VSN1N','2');
insert into jctdp values (10085,2, 'VSN1N','3');
insert into jctdp values (10085,2, 'VSN1N','4');
insert into jctdp values (10085,2, 'VSN1N','5');
insert into jctdp values (10085,2, 'VSN1N','6');
insert into jctdp values (10085,2, 'VSN1N','7');
/** STNSBP1N **/
insert into jctdp values (10085,2, 'STNSBP1N','150');
insert into jctdp values (10085,2, 'STNSBP1N','160');
insert into jctdp values (10085,2, 'STNSBP1N','170');
insert into jctdp values (10085,2, 'STNSBP1N','170');
insert into jctdp values (10085,2, 'STNSBP1N','155');
insert into jctdp values (10085,2, 'STNSBP1N','150');
insert into jctdp values (10085,2, 'STNSBP1N','150');
/**STNSBP2N **/
insert into jctdp values (10085,2, 'STNSBP2N','165');
insert into jctdp values (10085,2, 'STNSBP2N','150');
insert into jctdp values (10085,2, 'STNSBP2N','160');
insert into jctdp values (10085,2, 'STNSBP2N','170');
insert into jctdp values (10085,2, 'STNSBP2N','170');
insert into jctdp values (10085,2, 'STNSBP1N','155');
insert into jctdp values (10085,2, 'STNSBP1N','165');
/**STNSBP3N**/
insert into jctdp values (10085,2, 'STNSBP3N','160');
insert into jctdp values (10085,2, 'STNSBP3N','170');
insert into jctdp values (10085,2, 'STNSBP3N','160');
insert into jctdp values (10085,2, 'STNSBP3N','150');
insert into jctdp values (10085,2, 'STNSBP3N','140');
insert into jctdp values (10085,2, 'STNSBP1N','175');
insert into jctdp values (10085,2, 'STNSBP1N','170');

/***** SQL QUERY ***/
SELECT SITE,SUBJECT,
VISIT,
STNSBP1N,
STNSBP2N,
STNSBP3N
FROM (
SELECT
SITE,SUBJECT,
max (CASE when fieldname like 'VSN1N' THEN FIELDVAL ELSE NULL END) VISIT,
max (CASE when fieldname like 'STNSBP1N' THEN FIELDVAL ELSE NULL END) STNSBP1N,
max (CASE when fieldname like 'STNSBP2N' THEN FIELDVAL ELSE NULL END) STNSBP2N,
max (CASE when fieldname like 'STNSBP3N' THEN FIELDVAL ELSE NULL END) STNSBP3N
FROM
JCTDP
where FIELDNAME IN ('VIS1N','STNSBP1N')
GROUP BY SITE,SUBJECT
);

I want to print LIKE THIS

SITE SUBJECT VISIT STNSBP1N STNSBP2N STNSBP3N
100085 1 1 140 145 165
100085 1 2 150 155 175
100085 1 3 160 165 170
100085 1 4 170 175 150
100085 1 5 175 170 145

100085 2 1 150 165 160
100085 2 2 160 150 170
100085 2 3 170 160 160
100085 2 4 170 170 150
100085 2 5 155 170 140
100085 2 6 150 155 175
100085 2 7 150 165 170

NOTE:: 1)Number of visits may vary for each subject in different project
2) There are many fieldnames(like VSN1N,STDSBP1N etc..) stored under fieldname
3) For each fieldname fieldval is stored under fieldval
The tool I am using does not support pl/sql and does not allow me to create any temp. tables.
I must use SQL.
 
This comment has been removed by a blog administrator.
 
You are a genius, this has saved my bacon! All the online tutorials and even poor Toms answers were too involved for me hehe

Grateful Peter, Manchester UK
 
In my opinion, I consider your "pivot-table" example requires too many steps. It border-lines on spaghetti-programming because you are too stuck with using the DECODE function.

Here's a much simpler "1-step" solution:


select A.TEAM, A."2002", B."2003", C."2004" from
(select team, points "2002"from cfl where SEASON = '2002') A
JOIN (select team, points "2003" from cfl where SEASON = '2003') B
ON A.TEAM = B.TEAM
JOIN (select team, points "2004" from cfl where SEASON = '2004') C
ON A.TEAM = C.TEAM
;

JOHN D.
JDDDDD2@YAHOO.COM
 
This comment has been removed by a blog administrator.
 
This comment has been removed by a blog administrator.
 
Hi,

The scenarios I have is little different. I have a table "Calendar" having 3 columns Qtr, Fin_Week & Date. This will display values like

Qtr Fin_Wk Date
====================
1 FW1 01-JAN
1 FW2 04-FEB
1 FW3 06-MAR
2 FW4 02-APR
2 FW5 09-MAY
2 FW6 01-JUN
3 FW7 03-JUL
3 FW8 08-AUG
3 FW9 05-SEP


But I want to write a query, where I can display the records in the following format:

Qtr1 Fin_Wk1 Date1 Qtr2 Fin_Wk2 Date2 Qtr3 Fin_Wk3 Date3
1 FW1 01-JAN 2 FW4 02-APR 3 FW7 03-JUL
1 FW2 04-FEB 2 FW5 09-MAY 3 FW8 08-AUG
1 FW3 06-MAR 2 FW6 01-JUN 3 FW9 05-SEP

Any help on this will be highliy appreciated.

Thanks,
Jingle Bells
 
Thanks for this clear and concise article, it was very helpful.

Regards
Stephen
 
Hi Robert,
Thanks for the blog, very clear and precise explanation in lay-mans terms... Just the way I like it!
I noticed the links to Tom Kytes web site are broken (probably due to Toms revised web site). I presume the first link referes to....
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:766825833740

but was wondering if you could update the more complex examples?

Thanks,
Gus..
 
Thanks, this helped!
 
you've saved my grades!!!!! thx a lot... bless you...
 
Muchas gracias, amigo!!!
 
Good Job

Thanks V Much

Darshan Shah
 
i need some help with oracle 11g pivot operator. is it possible to use multiple columns in the FOR clause and then compare it against multiple set of values

with case when I can write something like this

select country_code
,avg(case when (( dept = 30 and job = 'SALESMAN' ) or ( dept = 301 and job = 'SALESMAN' ) ) then sal end ) as d30_sls
,avg(case when (( dept = 30 and job = 'MANAGER' ) or ( dept = 301 and job = 'MANAGER' ) ) then sal end ) as d30_mgr
,avg(case when (( dept = 30 and job = 'CLERK' ) or ( dept = 302 and job = 'CLERK' ) ) then sal end ) as d30_clrk
from pivot_data group by country_code;

what I tried with pivot is like this , I can not give the same alias name for 30, SALESMAN and 301, SALESMAN to call d30_sls

SELECT *
FROM pivot_data
PIVOT (SUM(sal) AS sum
, COUNT(sal) AS cnt
FOR (dept,job) IN ( (30 , 'SALESMAN') AS d30_sls,
(301, 'SALESMAN') AS d30_sls1,
(30 , 'MANAGER') AS d30_mgr1,
(301, 'MANAGER') AS d30_mgr,
(30 , 'CLERK') AS d30_clk,
(302, 'CLERK') AS d30_clk1));

this is a simple example .... my real life scenario is compliated with more fields and more combinations .... So something like using substr(dept,1,2) won't work ..
I will post sqls to create table and data in the next post
 
here is the sql to create the sample data
create table pivot_data ( country_code number , dept number, job varchar2(20), sal number );

insert into pivot_data values (1,30 , 'SALESMAN', 5000);
insert into pivot_data values (1,301, 'SALESMAN', 5500);
insert into pivot_data values (1,30 , 'MANAGER', 10000);
insert into pivot_data values (1,301, 'MANAGER', 10500);
insert into pivot_data values (1,30 , 'CLERK', 4000);
insert into pivot_data values (1,302, 'CLERK',4500);
insert into pivot_data values (2,30 , 'SALESMAN', 6000);
insert into pivot_data values (2,301, 'SALESMAN', 6500);
insert into pivot_data values (2,30 , 'MANAGER', 11000);
insert into pivot_data values (2,301, 'MANAGER', 11500);
insert into pivot_data values (2,30 , 'CLERK', 3000);
insert into pivot_data values (2,302, 'CLERK',3500);
 
Thank you, very helpful!
 
Robert, this is a great tip, i use up to 132 columns this way for many rows, and get a quick and accurate return set. Thanks from Sjabbo van Timmeren
 
This is so helpful. Thanks so much!
 
Hi,
Could you please help me to clarify below mention doubt.

DECODE:-
SELECT DECODE (NULL,NULL,'TRUE','FALSE') AS VALUE FROM DUAL
OUTPUT:- TRUE

CASE:-
SELECT CASE NULL WHEN NULL THEN 'TRUE' ELSE 'FALSE' END AS CASE_TEST FROM DUAL

OUTPUT :- FALSE

Why above mention queries are giving different output and what is the difference between case and decode?
Please reply on my email-id 'nidhikavaidh@gmail.com'

Regards,
Nidhika
 
PivotGrid builder with tabs customizing
 
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?