Sunday, August 14, 2005
UTL_HTTP
- Stock Market Quotes
- Temperature
- Sports scores
No need to write a separate application in Java or whatever, you can do it all directly in Oracle.
Your solution will involve using one of the many handy Oracle built-in utilities: UTL_HTTP.
For reference, flip open Chapter 78 of the 'Supplied PL/SQL Packages and Types Reference'.
Tom Kyte also had a discussion in the Appendix of "Expert One-on-One Oracle", and also on Ask Tom:
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:285215954607
Let's look at a simple example, grabbing a market quotation.
You can get the latest quotes from Yahoo Finance. Using UTL_HTTP.REQUEST, you can get the content of that page, and then search within for the data you want.
SELECT UTL_HTTP.REQUEST('http://finance.yahoo.com/q?s=KMP') FROM DUAL;
If you're behind a firewall, include the IP or name of your proxy server as the second parameter.
For simplicity, I'm not including the output here, but you can see that we don't have our quote in there. That's because we only got the first 2000 bytes of the web page. If we want more, we need to use REQUEST_PIECES.
Down below I have included a working example. It's roughly thrown together, but it does illustrate the point: you can use UTL_HTTP to retrieve information from the Internet that you can put in the database.
As an aside, talk to some lawyers to make sure the data you are mining is not violating any copyrights.
After you write stored procedures to retrieve web pages, and to extract information from within and insert them into the database, you can automate the updates by calling the stored procedures using DBMS_JOB (a topic for another day!).
SET SERVEROUTPUT ON
DECLARE
l_pieces UTL_HTTP.HTML_PIECES;
-- We'll look at two 2000-byte pages at a time
l_two_pages VARCHAR2(4000);
l_start_read NUMBER;
l_end_read NUMBER;
l_quote VARCHAR2(12);
BEGIN
-- Grab up to a maxium of 32 2000-byte pages, and then go through them,
-- looking at 2 pages at a time in case the data we are looking for
-- overlaps a page boundary
l_pieces := UTL_HTTP.REQUEST_PIECES('http://finance.yahoo.com/q?s=KMP', 32);
FOR i IN 1 .. l_pieces.COUNT LOOP
l_two_pages := l_two_pages || l_pieces(i);
-- Look for a string preceding the information we want
-- If we find it, add 52 (magic, Yahoo-specific number)
-- to find the point where the quote will begin
SELECT INSTR(l_two_pages, 'Last Trade', 1, 1) INTO l_start_read FROM dual;
IF (l_start_read > 0) THEN
l_start_read := l_start_read + 52;
IF (l_start_read < 3950) THEN
SELECT INSTR(l_two_pages, '<', l_start_read, 1) INTO l_end_read FROM dual;
IF (l_end_read > 0) THEN
IF ((l_end_read - l_start_read) < 12) THEN
SELECT SUBSTR(l_two_pages, l_start_read, l_end_read - l_start_read) INTO l_quote FROM dual;
DBMS_OUTPUT.PUT_LINE(l_quote);
ELSE
DBMS_OUTPUT.PUT_LINE('Error (Quote more than 12 chars)');
END IF;
EXIT;
END IF;
END IF;
END IF;
l_two_pages := l_pieces(i);
END LOOP;
END;
Remember that this package can also be used in the opposite direction and data can be extracted from the database. Also it can be used to load and run hacker SQL or PL/SQL scripts that are stored on an external web site. Beware of default privileges and useful functionality.
http://www.petefinnigan.com/weblog/archives/00000512.htm
On the coaxing of the guys at work, I quickly put together a stored procedure that will automatically fill out our office hockey pool using this technique. Anyone who wants this small, rough demo can send me an email for a copy.
I'm going to try it .I think it will work.
thanks!
<< Home


