loading html reports

Posted: Mon May 24, 2010 6:19 am
by randy
I am trying to load an HTML report into an Oracle table.
I have looked at some of the XML documentation, but I am not sure it applies to HTML reports. I tried to use the XML Meta Data Importer, but it hangs when I try to select an encoding.
I only receive the file twice a day, and each one has only around 100 lines of data.
Is there a simple way to load this report?

Here is a sample of the header:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<TITLE>Sales Last 10 Hours</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">


Thanks
Randy

Posted: Mon May 24, 2010 10:09 am
by kduke
HTML is not XML. I am not sure you can trick it into working. What exactly are you trying to extract out of an HTML document? You might be able to do it with a sequential file stage if all you want is the HEAD or TITLE information.

Posted: Mon May 24, 2010 12:22 pm
by randy
It's an HTML report, so I need the content between <TR> <TD> and </TD> </TR>.

Randy

Posted: Tue May 25, 2010 9:50 am
by kduke
You may have to write your own solution.

Posted: Tue May 25, 2010 11:01 am
by priyadarshikunal
I would remove the header, then convert each </TD><TD> combination to the column separator and each </TR> to the line separator. After that, strip all remaining tags and use the data.
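
The steps above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the function name, the "|" separator, and the sample rows are made up, and it assumes the report is a single simple table with no nested tables and no "|" inside the cell values.

```python
import html
import re


def html_table_to_delimited(report: str, sep: str = "|") -> str:
    """Turn the <TR>/<TD> rows of a simple HTML report into delimited lines."""
    # Step 1: remove the header by discarding everything before the first <TR>.
    first_row = re.search(r"<tr\b", report, flags=re.IGNORECASE)
    if first_row is None:
        return ""
    body = report[first_row.start():]

    # Step 2: a closing cell followed by an opening cell becomes the column separator.
    body = re.sub(r"</t[dh]>\s*<t[dh]\b[^>]*>", sep, body, flags=re.IGNORECASE)

    # Step 3: a closing </TR> becomes the line separator.
    body = re.sub(r"\s*</tr>\s*", "\n", body, flags=re.IGNORECASE)

    # Step 4: clear all remaining tags, then decode entities such as &amp;.
    body = html.unescape(re.sub(r"<[^>]+>", "", body))

    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in body.splitlines()]
    return "\n".join(line for line in lines if line)


# Hypothetical sample; the real report's columns are not shown in this thread.
sample = """<HTML><HEAD><TITLE>Sales Last 10 Hours</TITLE></HEAD>
<BODY><TABLE>
<TR><TD>Store</TD><TD>Sales</TD></TR>
<TR><TD>A1</TD> <TD align=right>100</TD></TR>
</TABLE></BODY></HTML>"""

print(html_table_to_delimited(sample))
```

The resulting text could then be read as an ordinary delimited file, for example with a sequential file stage.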

Posted: Tue May 25, 2010 1:44 pm
by vinnz
You could ask for the report to be sent in XHTML format or, if the current report structure is fairly standardized, use a utility like HTML Tidy to quickly and easily convert your HTML into XML.
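
Once the report is well-formed XHTML (for instance after running it through HTML Tidy with its -asxml option), the rows can be read with any standard XML parser. A minimal Python sketch, assuming a simple table layout; the function names and the sample document are hypothetical, and the sample leaves out the DOCTYPE and any named entities, which a real converted file may contain:

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace such as '{http://www.w3.org/1999/xhtml}'."""
    return tag.rsplit("}", 1)[-1].lower()


def xhtml_rows(xhtml: str) -> list:
    """Return one list of cell texts per <tr> element in the document."""
    root = ET.fromstring(xhtml)
    rows = []
    for elem in root.iter():
        if local_name(elem.tag) != "tr":
            continue
        # Collect the text of each direct <td>/<th> child of this row.
        cells = [
            "".join(cell.itertext()).strip()
            for cell in elem
            if local_name(cell.tag) in ("td", "th")
        ]
        if cells:
            rows.append(cells)
    return rows


# Hypothetical sample of what a converted report might look like.
sample_xhtml = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Sales Last 10 Hours</title></head>
<body><table>
<tr><td>Store</td><td>Sales</td></tr>
<tr><td>A1</td><td>100</td></tr>
</table></body></html>"""

for row in xhtml_rows(sample_xhtml):
    print(row)
```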