Thursday, January 22, 2009

Cleaning up with ALL_TAB_COLS

The way he barely lifted his feet off the ground when he walked, I could hear him approaching from the other end of the hall. I knew he was coming, and I dreaded it.

I had just gotten off the phone with a former teammate who was moved to another department. They had a new database analyst who was struggling with his new "database independent" application, and he was sent my way. It was just around the time that he leaned his head into my office that I wished I had a second exit.

"You're the Oracle guru?"

I smiled at him, and nodded as modestly as I could. He dragged his feet into my office and fell into a chair. "Wow, you look so ... normal."

"What can I help you with?" I sighed.

He began going over the long list of issues he was trying to tackle, and data he was trying to clean up. He had a very sad story, I'll admit. He had a database shared by a host of buggy applications that had different ideas of how to handle NULLs and other values. He was left with tables that were supposed to be identical, but weren't, data that was supposed to match, but didn't, and felt more optimistic about finding needles in haystacks than the data he needed to fix. After several minutes it was time for one of my trademark interruptions.

"The first thing you need to study is the ALL_TAB_COLS table."

"What's that?" I waited a moment for him to pull out the Oracle documentation, but then I forgot that he's new, and I haven't trained him yet. I helpfully turned my monitor so he could see it, and brought it up on the screen.

"Just as the name suggests, it's your one-stop shop for all the information you might need about all the columns in all the tables," I explained as he read through the details. "For instance, remember your question about how to find all the columns in your database that store the province? Do you see how this table has your answer?"

"Absolutely," he said, sliding away my keyboard and bringing up a SQL*Plus session.

SQL> SELECT table_name, column_name
2 FROM all_tab_cols
3 WHERE column_name like '%PROV%';


"Aw man," he said upon seeing the results. "Some of these aren't even my tables. I don't want to wade through everything!"

"Then don't. Either use USER_TAB_COLS instead, or make use of the OWNER column." I helpfully added one more where clause to his query.

SQL> SELECT table_name, column_name
2 FROM all_tab_cols
3 WHERE column_name LIKE '%PROV%'
4 AND owner = 'NEWBIE';


"And you were trying to find columns that support NULL in one place, but not in others? Think you can manage that?"

He slid the keyboard back, studied the documentation, chewed on my pen for a while (grr!) and came up with something he found satisfying.

SQL> SELECT table_name, column_name FROM all_tab_cols WHERE nullable='Y'
2 AND column_name IN
3 (SELECT column_name FROM user_tab_columns WHERE nullable = 'N');


He smiled, and studied the documentation further. "Hey, I know how else I can use this. I had this import the other day that brought in a bunch of NULLs and my stupid buggy application doesn't handle NULLs very well. This can help me find the tables and columns with NULLs!"

SQL> SELECT table_name, column_name, num_nulls
2 FROM all_tab_cols
3 WHERE num_nulls > 0;


"What are these other tables? ALL_TAB_COLUMNS and ALL_TAB_COL_STATISTICS?" I paused for a moment, wondering whether I should step back and start educating him on NULLs before answering his question, but was gratified that this internal debate gave him enough time to bring up the Oracle documentation and answer the question for himself.

"Oh, I see. ALL_TAB_COLUMNS is the same, except with hidden columns filtered out, and ALL_TAB_COL_STATISTICS is just extracted from it." He leaned back and that's when I first noticed that he was smiling. "This is sweet! My application is also having trouble with duplicates and certain bad values, and I can use this to help me find the tables that have them."

"Yes, and as a bonus, you'll see there's useful information that can help you determine where you need to re-analyze your statistics. I'm glad ALL_TAB_COLS can get you started."

"Well before I go back to my desk and study all of this, I wanted to know if you had any suggestions on how to compare data between tables. I've got two tables that are supposed to have identical data, but I think they don't. How do I get all the rows that are different?"

I wordlessly typed the simplest query he ever expected to see.

SQL> SELECT * from table1
2 MINUS
3 SELECT * from table2;


"Wow, thanks, this was great," he said, as he got up to leave my office.

"Wait. There's one more thing I need to teach you," I interrupted one last time.

"What's that," he asked, eagerly sitting back down.

"Buying lunch."

Comments:
Classic!
 
Well, I guess "different" can be ambiguous:

JEG@XTST> create table table1 as select * from all_objects;

Table created.

JEG@XTST> create table table2 as select * from all_objects;

Table created.

JEG@XTST> insert into table2 select * from all_objects;

55106 rows created.

JEG@XTST> select * from table1 minus select * from table2;

no rows selected

JEG@XTST> select count(*) from table1;

COUNT(*)
----------
55105

JEG@XTST> select count(*) from table2;

COUNT(*)
----------
110212

JEG@XTST>
 
And, of course, you have warned the newbie the statistics are not real-time, right? :)
 
This comment has been removed by a blog administrator.
 
What is the rationale that ALL_TAB_COLS also contains the rows of views, but ALL_TABLES does NOT list views? Just wondering, maybe there is a good explanation.
 
This will show the records in table1 that are not in table2,

SELECT * from table1
MINUS
SELECT * from table2;

and this will show the records in table2 not in table1,

SELECT * from table2
MINUS
SELECT * from table1;
 
I knew he was coming, and I dreaded it. ... It was just around the time that he leaned his head into my office that I wished I had a second exit.... I smiled at him, and nodded as modestly as I could.
Wow!! All this schlock, and a humble attitude of service too!

...data that was supposed to match, but didn't...
Data is plural. Better keep your day job. =)
 
There's wide scope of optimization while you're implementing any query structure in Oracle. Also, there are multiple ways to implement same thing in Oracle, for instance, you can use UNION & UNION ALL interchangeably but one should use it carefully with considering the optimization in mind.
 
this was a outstanding.
 
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?