I am trying to load an HTML report into an Oracle table.
I have looked at some of the XML documentation,
but I am not sure it applies to HTML reports. I tried to use the XML Meta Data Importer,
but it hangs when I try to select an encoding.
I only get the file twice a day, and it has only around 100 lines of data.
Is there a simple way to load this report?
Here is a sample of the header:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<TITLE>Sales Last 10 Hours</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
HTML is not XML. I am not sure you can trick it into working. What exactly are you trying to extract out of an HTML document? You might be able to do it with a sequential file stage if all you want is the HEAD or TITLE information.
I would remove the header, then convert each </TD><TD> combination to the column separator and each </TR> to the line separator. After that, strip all remaining tags and use the data.
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
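The tag-replacement approach suggested above can be sketched roughly as follows. This is a minimal Python illustration, not a DataStage job: the function name `html_table_to_delimited`, the `|` separator, and the sample table are all made up for the example, and it assumes the report holds one simple table with no nested tables.

```python
import re

def html_table_to_delimited(html, col_sep="|"):
    """Turn a simple HTML table into delimited text lines (hypothetical helper)."""
    # Remove the header: keep only what follows </HEAD>.
    body = re.split(r"</head\s*>", html, maxsplit=1, flags=re.I)[-1]
    # Flatten existing line breaks so that only </TR> ends a record.
    body = re.sub(r"\s+", " ", body)
    # A </TD><TD ...> pair becomes the column separator.
    body = re.sub(r"</td\s*>\s*<td[^>]*>", col_sep, body, flags=re.I)
    # </TR> becomes the line separator.
    body = re.sub(r"</tr\s*>", "\n", body, flags=re.I)
    # Clear every remaining tag.
    body = re.sub(r"<[^>]+>", "", body)
    # Trim each record and drop empty lines.
    return [line.strip() for line in body.split("\n") if line.strip()]

# Made-up sample data, not taken from the actual report.
sample = (
    "<HTML><HEAD><TITLE>Sales</TITLE></HEAD><BODY><TABLE>\n"
    "<TR><TD>Store</TD><TD>Amount</TD></TR>\n"
    "<TR><TD>101</TD>\n<TD align=right>250.00</TD></TR>\n"
    "</TABLE></BODY></HTML>"
)
for record in html_table_to_delimited(sample):
    print(record)
```

The resulting lines could then be read as an ordinary delimited file. Character entities such as `&nbsp;` are left untouched here, and the separator must be a character that never appears in the data.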
You could ask for the report to be sent in XHTML format or, if the current report structure is fairly standardized, use a utility like HTML Tidy to quickly and easily convert your HTML into XML.
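If neither XHTML output nor HTML Tidy is available, a lenient HTML parser can read the table rows directly, which sidesteps the XML conversion. Note that this is a different technique from the Tidy suggestion. The sketch below uses Python's standard `html.parser` module; the class name `TableRowParser`, the function `extract_rows`, and the sample markup are assumptions for illustration only.

```python
from html.parser import HTMLParser

class TableRowParser(HTMLParser):
    """Collect the text of each <TD>/<TH> cell, grouped by <TR> row."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag == "tr":
            self._close_row()      # tolerate a missing </TR>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()     # tolerate a missing </TD>
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

def extract_rows(html):
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Each returned row is a list of cell strings, so the rows can be written out as a delimited file or passed to whatever loader is used for the Oracle table.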