Tuesday, January 24, 2006

Gathering Requirements

I wanted to briefly follow up on Tip #2 on my recent post about PL/SQL Coding.
http://thinkoracle.blogspot.com/2005/12/20-plsql-coding-tips.html

I am not intending to convince anyone WHY they should spend time gathering requirements, but rather simply HOW to do it. However, let it be known that there are people whose careers are entirely based on the requirements gathering process, that there are volumes of books on the subject, and there exists methods much more sophisticated and involved that what I am going to describe here.

This post is understandably technology-independent, which is good because it is based on my experience along with some articles I've read recently, which has involved a variety of industries and technologies.

Step 1: State Your Vision

Before prattling off a list of what the software must do, you should start by naming some clear, concise key objectives. The basic idea is that all the requirements you are about to list match these key objectives.

Ideally I like to mention in this brief opening statement how the stakeholder's needs can be fulfilled, pain reduced, and/or gain received.

What Are Requirements?

A requirement is based on a specific business-driven need that is necessary to accomplish the stated project vision. Each requirement should be enumerated (or otherwise identifiable), and should only describe WHAT should be provided, not HOW, and should thus not necessarily discuss implementation details. A requirement should also be testable/verifiable, and mention how that is planned to be done.

As I alluded to in Tip #1, my conviction is that the rationale behind a requirement is a necessary step. By explaining WHY something is necessary we can help keep the requirements complete, clear, concise, consistent and maybe some other things that start with c.

Depending on the application, requirements can be related to the functionality, the business interaction (with users), or the technical requirements (eg: time, space, performance, scalability, security, disaster recovery).

For example, requirements can include anything from business rules, the budget, the interface (web-based vs desktop), reporting abilities, performance (number of concurrent users), completion date/schedule, security needs, hardware, software, error/fault-tolerance, user technical level, system maintenance requirements, and so on.

How to Gather Requirements

Is this a brand new product, or is it meant to either replace or compete with another product? If the latter, you should evaluate what that product does, and how it can be improved.

Identify all the stakeholders, such as users, management and the nerds (technical folk). Talk to them either one-on-one or in groups.

Techniques

In my experience, I simply write a document. For more sophisticated approaches, you can look into use-case models, unified modeling language (UML), automatic requirements generation tools, prototypes, storyboard, agile software development or any other host of techniques.

One word about prototypes. Go ahead and use prototypes to help users express their needs. But when you're done, throw the prototype away. Tell management that a quality product will take time to develop, and working with hastily-thrown together prototype code is just going to wind up with a real mess. Trust me: Chuck it!

Common Requirements Errors

Based on some research articles, the most common types of errors when writing requirements are (in order):
1. Incorrect assumptions
2. Omitted requirements
3. Inconsistent requirements
4. Ambiguous requirements

What's Next?

Remember that requirements gathering is a cyclical process. Despite having a single author responsible for this document, there should be several stakeholders conducting several rounds of reviews and inspections. In fact, you should have a way to track the history of the requirements document, and it should be possible to revise it, even after it is finalized (this is the real world, after all).

Prioritize your requirements. You may even want to move some requirements to a "future release" section.

In my requirements document, I als like to add the following sections:
1. A list of all the assumptions that were made during the requirements gathering process, and why.
2. A list of constraints on either the software, the design, or the development process
3. A reference to any industry standards to which you want the software to conform (eg: ANSI SQL), and why.
4. A summary of how we expect the system to be used: how many concurrent users, transactions per day, etc.
5. Project-specific details, like the manpower, time, skill sets, deadlines, licenses, equipment that will be needed.
6. A project glossary of terms used in this application, including acronyms.

Granted this was just one man's very basic look at requirements gathering, but for many small to mid-sized projects, that should get us started in the right direction.

Friday, January 20, 2006

Odds and Ends

PL/SQL vs J2EE, Part 2

I knew there was going to be some discussion following up my previous post on using stored procedures instead of embedding SQL in your J2EE application. Some of it you can read in the comments, others you can find on other people's blogs.

Original Article
Tim Hall's Response (and Here).
Harry Boswell's Response
Dizwell Discussion: Database Independence

I also saw a similar discussion that had taken place over a year ago. Check out these posts by Mark Rittman and Dave Warnock for a Point-Counterpoint on the debate.

Mark Rittman
Dave Warnock's Response

Gary Myers and Harold Ramis Separated At Birth


Gary Myers

Harold Ramis

In seriousness, I've enjoyed Gary Myers' blog for quite some time, but I particularly enjoyed his recent series of blogs on the 12 SELECTs of Christmas, so I thought I would highlight it for you to enjoy again.
12 Selects of Christmas, Part I
12 Selects of Christmas, Part II
12 Selects of Christmas, Part III
12 Selects of Christmas, Part IV

Neat Tricks

Some time ago I posted an article on UTL_HTTP that gave Pete Finnigan a security-related heart attack. Just when he was recovering, Doug Burns posted a neat little trick that will give him another stroke. Apparently within SQL*Plus you can execute an SQL script somewhere on the Internet. Try it!

UTL_HTTP Post
Pete Finnigan
Doug Burns

Blog Updates

I've noticed Tony Andrews, whose blog I quoted in my last post, has some new content including an interesting So Doku Solver in PL/SQL. Check it out here, and hope he continues to post:
Tony Andrews PL/SQL So Doku Solver

Another new Oracle blog to check out is Yas'. I've enjoyed several of his postings recently, so its hard to pick just one to highlight, but given its similarity to my recent post on Bulk Binding and FORALL, here is one on BULK COLLECT.
My FORALL
Yas' BULK COLLECT

Ask Tom

I spend a fair bit of my Oracle web-surfing time over on AskTom. If you ask many of us, we'll say its an invaluable source of information, and also houses some very interesting discussions. Here are three samples of discussions I've been following recently.

Driving Tables
Introduction to Analytical Functions
Batch Processing

Old Post Updates

The original intention of this blog was to have a place to organise my thoughts on things that I have learned about Oracle. Therefore, generally a post here is merely one of the early stages of my exploration on a particular topic. I usually continue my exploration, and doubtlessly continue to find interesting articles on the same topics. Here are two examples.

You may have noticed that I talk a lot about NULL, how funny it is, and how misunderstandings lead to errors. I mentioned recently that if I were to write an article it would probably be about NULLs. I was flipping through an Oracle magazine I had missed from last Summer and saw that Lex de Haan and Jonathan Gennick had beaten me to it.
Nulls: Nothing to Worry About

Last Fall I blogged on using ROWNUM and ROWID, and I just recently found an awesome post on AMIS to demonstrate the power of using ROWNUM to improve performance.
Original Post on ROWNUM
Alex Nuijten on ROWNUM

Finally

My final order of business is related to Oracle only in that it involves one of our favourite PL/SQL experts, Steven Feuerstein. Steven recently started a personal blog, and even posted an article in response to a questionable comment I left on Eddie's Blog.

Eddie Awad
Steven Feuerstein is a nut?

But rather than talk about nuts, I would really like to highlight just how much Steven Feuerstein's work has helped me, and how it can help others. You can see my blog is littered with examples, here is another, about how to hide your code:

Steven Feuerstein Q&A: Hiding PL/SQL
My Article about Steven Feuerstein on Refactoring

Wednesday, January 18, 2006

PL/SQL vs J2EE

Let's say you're designing an enterprise-wide OLTP Web Application with Data Warehousing, Reporting and multiple interfaces to various external systems for large volumes of data. Where do you put your business logic? Do you embed your SQL into your Java code, or do you keep your SQL in the database, and write stored procedures?

Regardless of your answer, you are wandering into one of the older and more heated arguments in this community. (Classic examples of one: here and here) I don't mean to re-ignite it, but I do want to understand it.

"It is a religious war though, they cannot be won by reason or objectivity." - Tom Kyte

It would appear like Java programmers hate stored procedures. They are viewed as hard to maintain, and not scalable. To them, scalability means spreading out the workload, running on several small machines.

Some don't even like to write the SQL themselves, instead preferring to use a framework like Hybernate or TopLink do it, and then rest their heads on their pillows at night dreaming of fewer errors/portability issues and better performance. But even if they can't achieve the better performance the database guys keep bragging about, they're more than happy to accept greater ease of development and maintainability in exchange.

Plus, Java programmers love choices. Lots of tools, unit testing packages, libraries - all open source - not to mention a huge community. They wouldn't want to get attached to any database vendor, especially one that is expensive and proprietary.

"If you want portability keep logic out of SQL." - Martin Fowler

At the other end of the spectrum are the database specialists. They walk around with print-outs of all the horrible SQL statements Java programmers have written that are choking the database, with serialization and loads of parsing. "Leave the SQL to us!" they plead in vain, "it's too hard for you, especially with this crappy database design!" They may secretly admire the Java progammers ability to write procedural or object-oriented code, but they believe that they lack the key to unlocking database performance: "set-based" mentality.

They look at a Java application and all they can see if multiple database trips where the SQL could have been combined into a single stored procedure making one trip. They examine their EJB containers and then lecture the programmers on recognising and understanding the difference between read-write and read-only situations.

The database specialist isn't as impressed by the Java programmer's triumphant removal from database dependence, for if you aren't tying yourself to a database you are losing out on all its power.

Where does that leave us?

The fundamental question is where the business logic should be.

The Java programmer, who sometimes refers to it as domain logic, wants business logic out of the database, often in the name of database independence. Even without this argument, they see live or non-relational data, in different formats, much of it transient and with complex rules ... why write piles of unmaintainable PL/SQL for that? They may concede that some validation constraints are OK, but even that isn't for sure.

Here is an argument on avoiding putting your Domain Logic into SQL, from none other than Martin Fowler:
http://www.martinfowler.com/articles/dblogic.html

"Do not use stored procedures to implement business logic. This should be done in Java business objects." - Rod Johnson

The database specialist views business rules as data rules and thus belongs close to the data. If there are some business rules or business relationships between various entities, well that should be enforced in the same place where the data is stored: the database. You've bought expensive, sophisticated software capable of managing your data in a fast, reliable, scaleable way: use it!

Consider a very simple argument put forward by database specialist Tony Andrews about putting the business logic in the database:
http://tonyandrews.blogspot.com/2005/04/business-logic-apis.html

Here is an article about a database-centric approach to J2EE application development:
http://www.oracle.com/technology/pub/articles/odtug_award.pdf

A Final Plea

I don't think there really is a right answer and a wrong answer in general terms. But armed with an awareness of the debate, its basic framework and a few starting points, we can make the decisions that make sense on a project-by-project basis.

Note: This is one of the first times I've put up a non-technical post, so I'll be curious if there is any demand for more.

Tuesday, January 17, 2006

Bulk Binding: FORALL

When writing your PL/SQL stored procedure bear in mind that SQL statements are still sent to the SQL Engine as opposed to the PL/SQL Engine. Therefore in cases where you are executing several SQL statements in a loop, the resulting context switches could cause a noticeable performance problem.

One solution to this performance solution is the use of "bulk binding." What is that? Well first, you may recall that binding variables allows you to tie the current value of a variable into an SQL statement.
http://thinkoracle.blogspot.com/2005/06/bind-variables-in-plsql.html

"Bulk Binding" refers to a process whereby you can tie the current values of all the elements in an entire collection into a single operation. By using bulk binds, only one context switch is made between the PL/SQL and SQL Engines, to pass the entire collection, thus avoiding those performance issues.

So how is this done? In this case, using FORALL.

Let's look at an example from the Oracle documentation of how you might do something without any knowledge of bulk binding:

DECLARE
TYPE NumList IS VARRAY(20) OF NUMBER;
depts NumList := NumList(10,30,70);
BEGIN
FOR i IN depts.FIRST..depts.LAST LOOP
UPDATE emp SET sal = sal + 100 WHERE deptno = depts(i);
END LOOP;
END;

Using timing techniques I've explained before, here is what I came up with:
http://thinkoracle.blogspot.com/2005/09/analyzing-query-performance.html

call     count       cpu    elapsed       disk      query
------- ------ -------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0
Execute 3 0.01 0.00 0 9
Fetch 0 0.00 0.00 0 0
------- ------ -------- ---------- ---------- ----------
total 4 0.01 0.00 0 9


Instead, we can use FORALL like this:

DECLARE
TYPE NumList IS VARRAY(20) OF NUMBER;
depts NumList := NumList(10,30,70);
BEGIN
FORALL i IN depts.FIRST..depts.LAST
UPDATE emp SET sal = sal + 100 WHERE deptno = depts(i);
END;

And get this:

call     count       cpu    elapsed       disk      query
------- ------ -------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0
Execute 1 0.00 0.01 0 9
Fetch 0 0.00 0.00 0 0
------- ------ -------- ---------- ---------- ----------
total 2 0.00 0.01 0 9


Notice 1 execute instead of 3. Of course we won't see any performance difference on a small sample size, but here I am just illustrating the point.

You may notice the absense of the keyword "LOOP" in the FORALL example. That is because despite its similar appearances and syntax in this example, FORALL is not a loop. It takes a single SQL statement, and the index i can be used only as an index into the collection.

You can find more information about FORALL in the PL/SQL User's Guide and Reference: http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96624.pdf

Another useful trick is the use of SQL%BULK_ROWCOUNT (SQL is the implicit cursor used by the SQL engine for DML operations). It can be indexed the same way as the collection.

DECLARE
TYPE NumList IS VARRAY(20) OF NUMBER;
depts NumList := NumList(10,30,70);
BEGIN
FORALL i IN depts.FIRST..depts.LAST
UPDATE emp SET sal = sal + 100 WHERE deptno = depts(i);

FOR i IN depts.FIRST..depts.LAST LOOP
DBMS_OUTPUT.PUT_LINE('Affected Rows for ' || depts(i) || ' is ' || SQL%BULK_ROWCOUNT(i));
END LOOP;
END;

Affected Rows for 10 is 3
Affected Rows for 30 is 6
Affected Rows for 70 is 0

The only error I can foresee getting with FORALL is this one:
ORA-22160: element at index does not exist
And you will only get that if you are somehow binding to an index that does not exist.

You may also get complaints if you use the index in an expression, which is not supported:
PLS-00430: FORALL iteration variable i is not allowed in this context

Now that I've spoken about avoiding unnecessary context switches executing SQL statements, what about unnecessary context switches in getting information FROM the SQL engine? Are those avoidable? Yes, using BULK COLLECT. I'll write more about that shortly.

Links:
Dan Morgan has more than just a reference on this topic, he has lots of really good examples: http://www.psoug.org/reference/bulk_collect.html

There is also a short but good write-up on bulk binding in the Application Developer's Guide: Fundamentals, look under Chapter 9:
http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96590.pdf

Dr. Tim Hall has some several good write-ups with good examples: http://www.oracle-base.com/articles/8i/BulkBinds8i.php

Sunday, January 15, 2006

Oracle DO NOTs

Here is another interesting discussion taking place on AskTom about Oracle DO NOTs:

http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:53901314988148

After Tom responded, a lot of people got into it. I had some ideas myself, but decided not to make one of my famous Top 20 Lists after all, since it wound up similar to this one:

http://thinkoracle.blogspot.com/2005/12/20-plsql-coding-tips.html

For those of you who are curious, here is more or less what was covered by Tom or his guests:

Tom:
1. Don't accept string literals from end users and concatenate them into your SQL. In other words, use binds. This isn't just for efficiency, its more to prevent execution of malicious code. Yes, I know I did this one when demonstrating DBMS_PIPE.http://thinkoracle.blogspot.com/2005/11/dbmspipe.html

2. Don't test on an empty or near-empty database (you need real volumes)

3. Don't test with a single user (you'll miss scalability issues)

4. Don't forget to use a source control system.

5. Don't wing it as you go along, design in advance.

One responded (our buddy Bill) that you shouldn't do anything without a spec.

6. Don't take advice from experts without testing to see if it applies to you.

7. Don't optimize by hypothesize. Test!

From Others:

8. Don't reinvent the wheel, use built-in packages

9. Don't ignore the Oracle documentation, there is a wealth of information there.

10. Don't use technologies because they are cool.

11. Don't hesitate to throw away bad code, rewriting can be better than refactoring.

This actually resulted in some debate. Sometimes you want to let sleeping dogs lie. That is, if the code works, don't spend days re-factoring it unless you need to.

12. Don't write code without documentation/comments

Here is a great language-neutral link: http://tlug.up.ac.za/old/htwuc/unmain.html

13. Don't name variables arbitrarily.

14. Don't skip your unit-testing

15. Don't comment your SQL to override the optimizer plan unless you are sure.

16. Don't forget to update comments when you update code.

17. Don't forget to instrument your code.

What is instrumentation? Well, read Mogens Norgaard, or start with this link:
http://tkyte.blogspot.com/2005/06/instrumentation.html

18. Don't forget to mention dependencies when commenting code.

19. Don't expect the same database on two different hosts to work exactly the same.

20. Don't modify init.ora parameters without documenting the originals.

21. Don't neglect to collect histogram data when you run stats.

Well this may have turned into a Top 20 (well, 21) list, but none of these are my own. If you find this interesting, the link to the discussion is at the top.

This page is powered by Blogger. Isn't yours?