Friday, May 04, 2007

ANSI Joins

Like most of us, I still join tables in my SQL queries the old-school way. Simply put:

SELECT whatever
FROM table1 t1, table2 t2
WHERE t1.id = t2.id
AND t1.value > 10;

But increasingly often I run into people who use ANSI joins instead. They were either introduced to SQL with Oracle 9 (or Sybase 12, etc), and were taught to use the SQL standard way, or else they made the conversion at some point in the past few years. They would instead write that query like this:

SELECT whatever
FROM table1 t1
JOIN table2 t2 USING (id)
WHERE t1.value > 10;

Then they look at me all smug ... "Look how I have separated the JOIN clauses from the WHERE clauses. Isn't that infinitely more readable? Now go back to your cave."

I confess, I do like the elegance of separating those clauses. Our business analysts find that format much more readable, too, as do those who use several different databases. Every time they see one of my queries, especially a full outer join or a (+) in it, their nose starts to bleed.

Incidentally, if the columns are named differently, you'd use this syntax instead (which is less of a leap for old school SQL lovers):

SELECT whatever
FROM table1 t1
JOIN table2 t2 ON (t1.id1 = t2.id2)
WHERE t1.value > 10;

And, of course, you can have multiple joins, and can even nest one join inside another. You can use all the different types of joins, like INNER, OUTER, LEFT, RIGHT, UPSIDE-DOWN, INSIDE-OUT, whatever. Ok, not those last two. Tim Hall has some examples.

I started using ANSI joins for the first time recently. There certainly doesn't seem to be any performance hit. Of course, I wasn't expecting one, because you'd think Oracle would be smart enough to do both things the exact same way. I can't say for certain if one is better than the other, I guess you have to look at it case-by-case with your profiling tools. But I certainly haven't seen any difference.

The one thing I'm not crazy about is the NATURAL JOIN. In the example above, if id was the only column that those two tables had in common, they could specify is as a NATURAL JOIN and leave off the USING. To wit:

SELECT whatever
FROM table1 t1
NATURAL JOIN table2 t2
WHERE t1.value > 10;

What a NATURAL JOIN actually does, is join the two tables on all columns that exist in both tables. It does mean your query will still work if the structure of the table changes, but is that really what you want? Call me a caveman, but I like to explicitly state exactly what I want a query to do, and that includes exactly on which columns I want to join.

If you think I'm being too harsh, you should Ask Tom for his opinion:
"Natural joins -- a bug waiting to happen.",
"I strongly, vehemently strongly, discourage the use of this feature called natural joins.",
"To join tables by their column names -- how patently just a bad idea",
"Never never never use this feature."

I don't intend to use this one criticism to denounce ANSI joins altogether, far from it. Traditional joins (especially full outer joins) can look messy, especially the aforementioned group of new graduates, business analysts and multi-database SQL programmers.

Not everyone has really enjoyed moving from the traditional join syntax to ANSI syntax, including Doug Burns and Jeff Moss.

As you can see from the dates on those, I'm not the first one to write about this (but hopefully the last). I thought Mark Rittman was the first to expound on this, over 3 years ago, but he links to Jim Czuprynski, who appears to be truly the earliest crusader for the USE of ANSI joins, over 4 years ago. But, reminiscent of Wrath of Khan, it was actually Eddie Awad who first put the bug of ANSI joins into my ear. He's got a crash course on the different types of ANSI joins, but of course, that type of information you can find almost anywhere.

I'm not going to endorse that you immediately switch to ANSI joins, especially since if you haven't by now, you probably have no such intention anyway. But I think it makes a lot of sense for my fellow cavemen to at least be aware of it, and know how to use it.

Comments:
I couldn't agree more. Eddie Awad has a great series on all the join types, and he uses ANSI syntax in all his demonstrations. I'm making the switch!
 
Poking about in the bug database for ansi join is always a bit of fun. I particularly liked the one where ansi worked and + didn't.

Think I'd wait for 11g anywho, there seem to be enough things that aren't fixed 'til then.

I work with a 4GL which automagically does this natural join thing (it calls it "implicit domains" or you can define them explicitly) which every so often really trips people up. Especially when it has to resolve an ambiguity, and does it... alphabetically.
 
I made the switch to ANSI a few years ago and can't understand why anybody would continue to use the old style.
I guess old habits die hard.
 
In the Oracle database, v. 9.2.0.6.0, that I have access to, three table joins with ANSI syntax consistently takes twice as long as three table joins with the old style (+) syntax. I do not understand why this is but hopefully Oracle is improving their optimizer with the recent versions.
 
The DBA showed me a few things with this situation and I am following up with an explanation of the causes.

1) The Oracle optimizer engine evaluates the two styles of joining tables slightly differently. This can be seen by a few differences in the "Explain Plan" that is available by selecting the F5 button in PL/SQL Developer. This Explain Plan function evaluates the complexity of the query.

2) The Oracle optimizer engine gets confused due to the complexity of multiple view accesses.

3) The Oracle optimizer thought the query would run faster by putting a predicate from the WHERE clause inside one of the subqueries within the FROM clause. The actual result of this action was to dramatically increase the query run time from 15 seconds to 5 minutes. When the optimizer hint /*+ no_push_pred */ was used, the query run time dropped down to 15 seconds.

The need for use of the /*+ no_push_pred */ optimizer hint can be seen in the Explain Plan if the description states, "VIEW PUSHED PREDICATE".

All of this shows the poor performance from my example was only partially caused, in an undetermined small amount, by the ANSI syntax.
 
This comment has been removed by a blog administrator.
 
hi....

Can i insert record in the middle of the table?

for example i have a table called Emp and it contains 10 rows. I want to insert a record after 8th record.
Is it possible to insert? if yes can u tell the query please?.
 
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?