Loading HTML reports

randy
Participant
Posts: 30
Joined: Tue Sep 13, 2005 11:17 am

Post by randy »

I am trying to load an HTML report into an Oracle table.
I have looked at some of the XML documentation,
but I am not sure it applies to HTML reports. I tried to use the XML Meta Data Importer,
but it hangs when I try to select an encoding.
I only get the files twice a day, and each one has only around 100 lines of data.
Is there a simple way to load this report?

Here is a sample of the header:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<TITLE>Sales Last 10 Hours</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">


Thanks
Randy
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

HTML is not XML, and I am not sure you can trick the XML importer into working with it. What exactly are you trying to extract from the HTML document? You might be able to do it with a Sequential File stage if all you want is the HEAD or TITLE information.
Mamu Kim
randy
Participant
Posts: 30
Joined: Tue Sep 13, 2005 11:17 am

Post by randy »

It's an HTML report, so I need the data between <TR> <TD> and </TD> </TR>.

Randy
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

You may have to write your own solution.
Mamu Kim
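
One possible shape for a hand-written solution is sketched below. It is not a DataStage feature: it is a standalone Python script, using the standard library's `html.parser`, that could be run as a preprocessing step before the job reads the file. The names `CellExtractor` and `extract_rows` and the sample rows are made up for illustration, and the sketch assumes every `<TR>` and `<TD>` in the report has a matching closing tag.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <TR>
        self._cell = None  # text fragments of the open cell, None outside <TD>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TR> and <tr> both arrive as "tr".
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Keep text only while a cell is open; markup between cells is ignored.
        if self._cell is not None:
            self._cell.append(data)


def extract_rows(html_text):
    """Return the table contents as a list of rows (lists of strings)."""
    parser = CellExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Invented sample data; the real report layout is an assumption.
sample = """<HTML><BODY><TABLE>
<TR><TD>North</TD><TD>1200</TD></TR>
<TR><TD>South</TD><TD>950</TD></TR>
</TABLE></BODY></HTML>"""

for row in extract_rows(sample):
    print("|".join(row))
```

Each printed line is one delimited record, which a Sequential File stage could then read as ordinary flat-file input.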
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

I would remove the header, then convert each </TD><TD> combination to the column separator and each </TR> to the line separator. After that, strip all the remaining tags and use the data.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
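
The tag-to-separator steps described in this post could be sketched roughly as follows. This is a Python illustration only, not DataStage code; the function name `html_table_to_delimited`, the `|` separator, and the sample rows are assumptions. It treats everything before the first `<TR>` as the header, and it leaves HTML entities such as `&nbsp;` untouched.

```python
import re


def html_table_to_delimited(html_text, sep="|"):
    """Turn an HTML table into a list of delimited text lines."""
    # 1. Remove the header: drop everything before the first <TR>.
    match = re.search(r"<tr\b", html_text, flags=re.IGNORECASE)
    body = html_text[match.start():] if match else html_text

    # Flatten the file's own line breaks so that only </TR> ends a record.
    body = re.sub(r"[\r\n]+", " ", body)

    # 2. A </TD><TD> pair (whitespace or attributes allowed) becomes the column separator.
    body = re.sub(r"</td\s*>\s*<td\b[^>]*>", sep, body, flags=re.IGNORECASE)

    # 3. </TR> becomes the line separator.
    body = re.sub(r"</tr\s*>", "\n", body, flags=re.IGNORECASE)

    # 4. Clear every remaining tag.
    body = re.sub(r"<[^>]+>", "", body)

    # Trim each line and drop the empty ones left over from the page footer.
    lines = [line.strip() for line in body.split("\n")]
    return [line for line in lines if line]


# Invented sample data; the real report layout is an assumption.
sample = """<HTML><HEAD><TITLE>Sales Last 10 Hours</TITLE></HEAD><BODY><TABLE>
<TR><TD>North</TD><TD>1200</TD></TR>
<TR><TD>South</TD> <TD>950</TD></TR>
</TABLE></BODY></HTML>"""

for line in html_table_to_delimited(sample):
    print(line)
```

The separator should be a character that cannot occur in the cell data; otherwise the columns will shift when the file is loaded.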
vinnz
Participant
Posts: 92
Joined: Tue Feb 17, 2004 9:23 pm

Post by vinnz »

You could ask for the report to be sent in XHTML format or, if the current report structure is fairly standardized, use a utility like HTML Tidy to quickly and easily convert your HTML into XML.
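
Once the report is well-formed XHTML (for example after a pass through HTML Tidy with its `-asxhtml` option), any standard XML parser can read the rows. The Python sketch below is one way that might look. It assumes the converted file uses the XHTML namespace with lowercase element names and contains no named entities such as `&nbsp;`, which a plain XML parser would reject. The function name `rows_from_xhtml` and the sample rows are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML documents declare on their root element.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def rows_from_xhtml(xhtml_text):
    """Return the text of every <td>, grouped by <tr>, from an XHTML string."""
    root = ET.fromstring(xhtml_text)
    rows = []
    for tr in root.iter(XHTML_NS + "tr"):
        # itertext() also picks up text nested inside inline tags such as <b>.
        cells = ["".join(td.itertext()).strip()
                 for td in tr.findall(XHTML_NS + "td")]
        if cells:
            rows.append(cells)
    return rows


# Invented sample data; the real converted report is an assumption.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Sales Last 10 Hours</title></head>"
    "<body><table>"
    "<tr><td>North</td><td>1200</td></tr>"
    "<tr><td>South</td><td>950</td></tr>"
    "</table></body></html>"
)

for row in rows_from_xhtml(sample):
    print(row)
```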