Sunday, August 14, 2005

UTL_HTTP

Let's say you want to pull some real-time information off the Internet and put it in your database. For example:
- Stock Market Quotes
- Temperature
- Sports scores

No need to write a separate application in Java or whatever, you can do it all directly in Oracle.

Your solution will involve using one of the many handy Oracle built-in utilities: UTL_HTTP.

For reference, flip open Chapter 78 of the 'Supplied PL/SQL Packages and Types Reference'.

Tom Kyte also had a discussion in the Appendix of "Expert One-on-One Oracle", and also on Ask Tom:
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:285215954607

Let's look at a simple example, grabbing a market quotation.

You can get the latest quotes from Yahoo Finance. Using UTL_HTTP.REQUEST, you can get the content of that page, and then search within for the data you want.

SELECT UTL_HTTP.REQUEST('http://finance.yahoo.com/q?s=KMP') FROM DUAL;

If you're behind a firewall, include the IP or name of your proxy server as the second parameter.

For simplicity, I'm not including the output here, but you can see that we don't have our quote in there. That's because we only got the first 2000 bytes of the web page. If we want more, we need to use REQUEST_PIECES.

Down below I have included a working example. It's roughly thrown together, but it does illustrate the point: you can use UTL_HTTP to retrieve information from the Internet that you can put in the database.

As an aside, talk to some lawyers to make sure the data you are mining is not violating any copyrights.

After you write stored procedures to retrieve web pages, and to extract information from within and insert them into the database, you can automate the updates by calling the stored procedures using DBMS_JOB (a topic for another day!).

SET SERVEROUTPUT ON

DECLARE
l_pieces UTL_HTTP.HTML_PIECES;
-- We'll look at two 2000-byte pages at a time
l_two_pages VARCHAR2(4000);
l_start_read NUMBER;
l_end_read NUMBER;
l_quote VARCHAR2(12);
BEGIN
-- Grab up to a maxium of 32 2000-byte pages, and then go through them,
-- looking at 2 pages at a time in case the data we are looking for
-- overlaps a page boundary
l_pieces := UTL_HTTP.REQUEST_PIECES('http://finance.yahoo.com/q?s=KMP', 32);
FOR i IN 1 .. l_pieces.COUNT LOOP
l_two_pages := l_two_pages || l_pieces(i);
-- Look for a string preceding the information we want
-- If we find it, add 52 (magic, Yahoo-specific number)
-- to find the point where the quote will begin
SELECT INSTR(l_two_pages, 'Last Trade', 1, 1) INTO l_start_read FROM dual;
IF (l_start_read > 0) THEN
l_start_read := l_start_read + 52;
IF (l_start_read < 3950) THEN
SELECT INSTR(l_two_pages, '<', l_start_read, 1) INTO l_end_read FROM dual;
IF (l_end_read > 0) THEN
IF ((l_end_read - l_start_read) < 12) THEN
SELECT SUBSTR(l_two_pages, l_start_read, l_end_read - l_start_read) INTO l_quote FROM dual;
DBMS_OUTPUT.PUT_LINE(l_quote);
ELSE
DBMS_OUTPUT.PUT_LINE('Error (Quote more than 12 chars)');
END IF;
EXIT;
END IF;
END IF;
END IF;
l_two_pages := l_pieces(i);
END LOOP;
END;

Comments:
Good write up and easy to understand example. I will be checking out the UTL_HTTP packages. Just wanted to note that Pete Finnigan also talked about your entry on his blog and has a simple caveat

Remember that this package can also be used in the opposite direction and data can be extracted from the database. Also it can be used to load and run hacker SQL or PL/SQL scripts that are stored on an external web site. Beware of default privileges and useful functionality.
 
Thanks! Link to Pete Finnigan's article is here:
http://www.petefinnigan.com/weblog/archives/00000512.htm

On the coaxing of the guys at work, I quickly put together a stored procedure that will automatically fill out our office hockey pool using this technique. Anyone who wants this small, rough demo can send me an email for a copy.
 
In response to some people who have asked me privately, yes, taking advantage of Regular Expressions in Oracle 10 is an excellent way to avoid the "magic number" kluge I used here. Finding (or writing) a good HTML/XML parser (including what Oracle has built-in) is a great idea, too.
 
I'm looking for utl_http for downloading stock quotes from internet to my database .thanks
I'm going to try it .I think it will work.
 
Great Snippet and example. The code is much simplified when extracting the pricing info using REGEXP_SUBSTR
 
Very interesting. But I was wondering if it was possible to access a page that needs to be logged in in order to view it, like a forum thread which is not public. I need to access www.xyz.com/viewtopic.php?t=718 but as I am not legged in, i got redirected to www.xyz.com/ucp.php?mode=login. Thanks in advance and very interesting Blog you got here!
 
This comment has been removed by a blog administrator.
 
I have read serval site looking for an answer to my question and thought someone here could help. I have a third party that needs to do a posting URL to my system. I would like to just except the posting in thru an oracle package if possible. I was thinking of using the utl_http however I am not sure of how to code it. I know the parameters they will be passing. Cna some explain the utl_http from a receiving perspective instaed of a posting.
thanks!
 
Thanks for your tutorial. Regarding to:

Finding (or writing) a good HTML/XML parser (including what Oracle has built-in) is a great idea, too.

I use libtidy to do the job as extproc:

http://oracletidybinding.blogspot.com/2010/09/for-some-reason-i-needed-to-download.html

Cheers
Christian
 
Thank you very much for the example you gave. It is simple and very well explained.I finally understand how it works :) Best Regards, Steffy
 
Here's a link to a page with an updated version that works after Yahoo changed their format:

http://it.toolbox.com/blogs/data-ruminations/stupid-oracle-tricks-you-can-quote-gartner-53310

 
Hi
excellent article! I have a small doubt. I want to send SMS through a web server, passing SMS text in a single variable. For ex, I define a variable as
sms_text varcahr2(160) :='"This is a text message"';

then I call the utl_http passing message=sms_text

unfortunately, the SMS received is just "This, rest of the text is discarded. I am sure it is some text wrapping issue.

Please let me know how to wrap text and pass as a message parameter value


regards,

raj
 
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?