Wednesday, May 18, 2005
Steven Feuerstein on Refactoring
Refactoring for PL/SQL Developers
By Steven Feuerstein
From Oracle Magazine January/February 2005
http://www.oracle.com/technology/oramag/oracle/05-jan/o15plsql.html
Steven Feuerstein is one of my favourite PL/SQL authors, especially when he lays off the socio-political tangents.
I especially enjoyed his article on refactoring PL/SQL code. I had a few thoughts to add to it.
My additional thoughts on Refactoring:
1. One of the advantages that also bears note is that re-factoring your code can encourage code re-use. Clean, modular, well-written code is easy to re-use, reducing future programming efforts.
2. However, one of the disadvantages of code re-factoring is that you can introduce unknown factors to a (presumably) working, trusted procedure. There is an expense to re-testing and re-validating the modified code. Even if you FIXED errors, some other procedures might have been written in such a way that they depend on them, and they will now fail.
My additional imperfections in the unfactored code sample:
1. Inline documentation
I think Steven failed to point out the most egregious transgression: lack of inline documentation! I believe almost every procedure needs it, at LEAST a 'header' but preferrably also some comments that explain in plain language what was intended by the code. Steven did go ahead and add some in-line documentation, but didn't draw any attention to that.
* Note: Perhaps due to his friendly nature, I am referring to him in this post in the familiar. Certainly no disrespect is intended.
2. Bad variable names
If this article was to be re-written, I would change the initial code sample to use very bad variable names and then re-name them as part of the "refactored" version. I think that's a very common problem.
My comments on the refactored version:
1. utl_file_constants
Rather than define my own file utl_file_constants, I would use something that was already defined and possibly tied in. I mean, there's a REASON that 32767 was used. You are bound by a particular (probably OS-level, or DB-level) limitation that is probably defined in some package or some configuration value, or right in UTL_FILE.
If you define your own and then all of a sudden the 32767 limitation is changed at the OS or DB level, you STILL have to go change all your private constant packages. I mean that's a bit better than having to change magic numbers in your code, but we can do a little better, I think.
get_next_line_from_file isn't even aware of that limitation. How would this program fail if you set the value to, say, '4' instead of '32767' in utl_file_constants?
It appears like Steven didn't even use his utl_file_constant for 32767 in the final version (Code Listing 7). Why not? The answer is probably that you can not dynamically define the length of a string. That is, of course, a PL/SQL restriction. I suppose we should be happy with whatever magic numbers we can remove.
2. compare_latest_read
I'm really not sure I would have created "compare_latest_read". That seems like we are overdoing it. I mean this is not a big procedure and its entire point is to do what is in that procedure anyway. Why introduce the overhead of another procedure (assuming there is any overhead)?
Think of it this way, you are passing in the lines. Instead of that, you could pass in the files and compare_latest_read can call get_next_line_from_file to get the lines. That would be reasonable. But now, all of a sudden, the main procedure is doing nothing but initialising and calling a procedure. So basically "compare_latest_read" is one reasonable change away from already doing all the work.
So why is it even necessary? It is only done once ... How is the purpose of this internal procedure significantly different from the purpose of the overall procedure? I suppose this is just an opinion, and a matter of taste, but I really think the extra procedure is unnecessary.
Other comments and questions:
1. VARCHAR2 length
In the procedure Steven uses VARCHAR2 as a variable-lengthed string even though we know the maximum size (32767). Is there any value in specifying VARCHAR2 (32767) instead of VARCHAR2 in the final version (as the IN or OUT parameters)? Is there any advantage to that? Is it even possible?
2. Procedures vs Functions
Any reason why Steven uses procedures instead of functions?
3. Cleanup: closing an unopened file
In "cleanup", will there be a problem if you UTL_FILE.fclose a file that hasn't been opened? Because that will happen if there is an exception while opening the first file when cleanup closes both.
4. Handling "WHEN OTHERS"
WHEN OTHERS - cleanup .... without any error message written to screen or log?? For shame! Steven said earlier in the article you would re-raise the exception, but in the end he didn't.
Plus the problem with re-raising an exception is you are actually getting a NEW exception. If it is checked later, the error messages will say the error was created at this new line number instead of the original spot. Might as well fully handle the exception right here when we catch it.
Conclusion:
A sure sign of a good article is that it gets you thinking. Steven Feuerstein rarely fails to achieve that. I have shared these thoughts with him, and I'll follow-up with any answers or clarification he provides.
I noticed Harm Vershuren had some thoughts in his blog as well:
http://technology.amis.nl/blog/index.php?p=317
By Steven Feuerstein
From Oracle Magazine January/February 2005
http://www.oracle.com/technology/oramag/oracle/05-jan/o15plsql.html
Steven Feuerstein is one of my favourite PL/SQL authors, especially when he lays off the socio-political tangents.
I especially enjoyed his article on refactoring PL/SQL code. I had a few thoughts to add to it.
My additional thoughts on Refactoring:
1. One of the advantages that also bears note is that re-factoring your code can encourage code re-use. Clean, modular, well-written code is easy to re-use, reducing future programming efforts.
2. However, one of the disadvantages of code re-factoring is that you can introduce unknown factors to a (presumably) working, trusted procedure. There is an expense to re-testing and re-validating the modified code. Even if you FIXED errors, some other procedures might have been written in such a way that they depend on them, and they will now fail.
My additional imperfections in the unfactored code sample:
1. Inline documentation
I think Steven failed to point out the most egregious transgression: lack of inline documentation! I believe almost every procedure needs it, at LEAST a 'header' but preferrably also some comments that explain in plain language what was intended by the code. Steven did go ahead and add some in-line documentation, but didn't draw any attention to that.
* Note: Perhaps due to his friendly nature, I am referring to him in this post in the familiar. Certainly no disrespect is intended.
2. Bad variable names
If this article was to be re-written, I would change the initial code sample to use very bad variable names and then re-name them as part of the "refactored" version. I think that's a very common problem.
My comments on the refactored version:
1. utl_file_constants
Rather than define my own file utl_file_constants, I would use something that was already defined and possibly tied in. I mean, there's a REASON that 32767 was used. You are bound by a particular (probably OS-level, or DB-level) limitation that is probably defined in some package or some configuration value, or right in UTL_FILE.
If you define your own and then all of a sudden the 32767 limitation is changed at the OS or DB level, you STILL have to go change all your private constant packages. I mean that's a bit better than having to change magic numbers in your code, but we can do a little better, I think.
get_next_line_from_file isn't even aware of that limitation. How would this program fail if you set the value to, say, '4' instead of '32767' in utl_file_constants?
It appears like Steven didn't even use his utl_file_constant for 32767 in the final version (Code Listing 7). Why not? The answer is probably that you can not dynamically define the length of a string. That is, of course, a PL/SQL restriction. I suppose we should be happy with whatever magic numbers we can remove.
2. compare_latest_read
I'm really not sure I would have created "compare_latest_read". That seems like we are overdoing it. I mean this is not a big procedure and its entire point is to do what is in that procedure anyway. Why introduce the overhead of another procedure (assuming there is any overhead)?
Think of it this way, you are passing in the lines. Instead of that, you could pass in the files and compare_latest_read can call get_next_line_from_file to get the lines. That would be reasonable. But now, all of a sudden, the main procedure is doing nothing but initialising and calling a procedure. So basically "compare_latest_read" is one reasonable change away from already doing all the work.
So why is it even necessary? It is only done once ... How is the purpose of this internal procedure significantly different from the purpose of the overall procedure? I suppose this is just an opinion, and a matter of taste, but I really think the extra procedure is unnecessary.
Other comments and questions:
1. VARCHAR2 length
In the procedure Steven uses VARCHAR2 as a variable-lengthed string even though we know the maximum size (32767). Is there any value in specifying VARCHAR2 (32767) instead of VARCHAR2 in the final version (as the IN or OUT parameters)? Is there any advantage to that? Is it even possible?
2. Procedures vs Functions
Any reason why Steven uses procedures instead of functions?
3. Cleanup: closing an unopened file
In "cleanup", will there be a problem if you UTL_FILE.fclose a file that hasn't been opened? Because that will happen if there is an exception while opening the first file when cleanup closes both.
4. Handling "WHEN OTHERS"
WHEN OTHERS - cleanup .... without any error message written to screen or log?? For shame! Steven said earlier in the article you would re-raise the exception, but in the end he didn't.
Plus the problem with re-raising an exception is you are actually getting a NEW exception. If it is checked later, the error messages will say the error was created at this new line number instead of the original spot. Might as well fully handle the exception right here when we catch it.
Conclusion:
A sure sign of a good article is that it gets you thinking. Steven Feuerstein rarely fails to achieve that. I have shared these thoughts with him, and I'll follow-up with any answers or clarification he provides.
I noticed Harm Vershuren had some thoughts in his blog as well:
http://technology.amis.nl/blog/index.php?p=317
Comments:
<< Home
Regarding your question:
2. Procedures vs Functions
Any reason why Steven uses procedures instead of functions?
IMO, there are two reasons for this: First of all, he uses more then just one OUT parameter. With functions he would still need an out parameter, which would lead to even more confusion because return values now come through two different ways.
Secondly, in pl/sql I find the readability is much higher if you use procedures:
declare
l_stepresult varchar2(10) := 'GO';
begin
while l_stepresult != 'STOP' loop
do_first_step (param1, param2, param3, l_stepresult);
do_next_step (param3, param4, param5, l_stepresult);
. end loop;
.
.
end;
The line starts with what you actually do. Using functions I find it somewhat less obvious what
actually is going on.
declare
l_stepresult varchar2(10) := 'GO';
begin
while l_stepresult != 'STOP' loop
l_stepresult := do_first_step (param1, param2, param3);
l_stepresult := do_next_step ((param3, param4, param5, l_stepresult);
end loop;
.
.
end;
My 2 ct (euro)
Cheers,
Holger
Post a Comment
2. Procedures vs Functions
Any reason why Steven uses procedures instead of functions?
IMO, there are two reasons for this: First of all, he uses more then just one OUT parameter. With functions he would still need an out parameter, which would lead to even more confusion because return values now come through two different ways.
Secondly, in pl/sql I find the readability is much higher if you use procedures:
declare
l_stepresult varchar2(10) := 'GO';
begin
while l_stepresult != 'STOP' loop
do_first_step (param1, param2, param3, l_stepresult);
do_next_step (param3, param4, param5, l_stepresult);
. end loop;
.
.
end;
The line starts with what you actually do. Using functions I find it somewhat less obvious what
actually is going on.
declare
l_stepresult varchar2(10) := 'GO';
begin
while l_stepresult != 'STOP' loop
l_stepresult := do_first_step (param1, param2, param3);
l_stepresult := do_next_step ((param3, param4, param5, l_stepresult);
end loop;
.
.
end;
My 2 ct (euro)
Cheers,
Holger
<< Home