
Reading .xml file In DataStage

Posted: Wed Sep 16, 2015 8:17 pm
by Oritech
How can we read an .xml file in DataStage?

When we open the .xml file in Notepad++, we see the data below (this is a sample of it):

Line1 -

<Proponix>
  <Header>
    <DestinationID>ABC</DestinationID>
    <SenderID>PRO</SenderID>
    <OperationOrganizationID>LANZ3073</OperationOrganizationID>
    <MessageType>PRFADAT</MessageType>
    <DateSent>20150618</DateSent>
    <TimeSent>0000</TimeSent>
    <MessageID>159649037</MessageID>
  </Header>
  <SubHeader>
    <InstrumentID></InstrumentID>
    <ActivityType></ActivityType>
    <ActivitySequenceNumber></ActivitySequenceNumber>
    <DocumentType></DocumentType>
    <WorkItemNumber></WorkItemNumber>
    <DocumentName>CURRENT, DCF_ABC_AB18062015</DocumentName>
    <FaxPrintorData>D</FaxPrintorData>
    <ReceiverFax></ReceiverFax>
    <FaxSubject></FaxSubject>
    <FaxComments></FaxComments>
    <OnlineorTemporal>B</OnlineorTemporal>
    <DistributeIndicator></DistributeIndicator>
    <FinalorDraftorCopy>BOR</FinalorDraftorCopy>
    <FileType>CSV</FileType>
    <UserID>DCF-TOD-S-LANZ3073-HD</UserID>
    <Product></Product>
    <ProductType></ProductType>
    <NumberofPages></NumberofPages>
    <NumberofPrints></NumberofPrints>
  </SubHeader>
  <UserHeader>
    <UserHeader1></UserHeader1>
    <UserHeader2></UserHeader2>
    <UserHeader3></UserHeader3>
    <UserHeader4></UserHeader4>
    <UserHeader5></UserHeader5>
  </UserHeader>
  <Body>
    <CoreInformation>
Line2 - Blank, i.e. no data

Line3 - Column names (separated by commas)

Line4 - Blank, i.e. no data

Line5 - Data separated by commas

Line6 - Blank, i.e. no data

Line7 - Data separated by commas

...and so on.

How can we get DataStage to read this file?

thanks

Posted: Wed Sep 16, 2015 9:38 pm
by chulett
Get your hands on the XSD. Whoever provided you the file should have it. I'm guessing you've never done any work with XML before, yes?

And open the file in something XML-aware rather than Notepad: Internet Explorer, for example, or an editor that supports XML, such as UltraEdit.
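
For a quick sanity check without an XML-aware editor, a few lines of Python can tell you whether a file is well-formed. This is only a sketch using the standard library's xml.etree.ElementTree; the helper name check_well_formed and the shortened sample strings are made up for illustration and are not part of DataStage.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, root tag) if the text parses, else (False, parser message)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, root.tag


# Shortened stand-in for the posted sample, which stops inside <CoreInformation>.
truncated = (
    "<Proponix><Header><SenderID>PRO</SenderID></Header>"
    "<Body><CoreInformation>"
)
# The same text with the open elements closed (assumed, not from the real file).
complete = truncated + "</CoreInformation></Body></Proponix>"

print(check_well_formed(truncated))  # first item is False: elements left open
print(check_well_formed(complete))   # (True, 'Proponix')
```

If the real file fails a check like this, sort that out with whoever supplied it before building any job around it.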

Posted: Fri Sep 18, 2015 4:36 am
by eostic
This is interesting... you have what looks like a .csv file "embedded" inside of XML. Very doable, but you will need an XSD for this. Alternatively, if there is no XSD (or even if there is, but this XML is relatively small, say less than 100 MB), you can just use the xmlInput Stage and the XML metadata importer (Import > Table Definitions > XML Table Definitions) to figure out what elements you need. Ultimately you will want "Body". You can then send that to a sequential file target and read it on its own, and/or, once you get to know the file really well, use the Column Import stage to parse it on the fly.

To get started, use that XML metadata importer and open up the ENTIRE tree when you get there. Click only the innermost boxes that you see. You didn't give us the whole document, but it is very likely that Body repeats, with this other header stuff above it. The whole thing should end with "</Proponix>", assuming this is valid XML. As Craig notes, definitely open this in IE or another browser that is XML-aware so that you can fully appreciate it. After you check "Body", you will probably make Body a key, and then save the table definition to be loaded onto the output link of your xmlInput Stage.

Search this forum for more ideas on using the xmlInput Stage, and go to my blog at www.dsrealtime.com, then the table of contents in the upper right corner. Look for a post on reading XML; it describes how you should be sending data into this Stage. For your initial testing, you may need to put this document into your Project directory after you follow the technique listed in that blog entry.
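
To see the two-step idea (pull out "Body", then parse the comma-separated text inside it) outside of DataStage, here is a rough Python sketch. It assumes the CSV text sits directly inside CoreInformation, as the description of lines 2 to 7 suggests. The column names and values in SAMPLE are invented, and embedded_csv_rows is a hypothetical helper; only the element names come from the posted sample.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature document: column names and data values are invented.
SAMPLE = """<Proponix>
  <Header><MessageType>PRFADAT</MessageType></Header>
  <Body>
    <CoreInformation>

ColA,ColB,ColC

1,alpha,x

2,beta,y

</CoreInformation>
  </Body>
</Proponix>"""


def embedded_csv_rows(xml_text, path="Body/CoreInformation"):
    """Collect the CSV rows embedded as text under each element matching path."""
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.findall(path):
        text = node.text or ""
        # Drop the blank lines between records and trim surrounding whitespace.
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        if not lines:
            continue
        # The first remaining line is treated as the column-name line.
        reader = csv.DictReader(io.StringIO("\n".join(lines)))
        rows.extend(dict(r) for r in reader)
    return rows


rows = embedded_csv_rows(SAMPLE)
for row in rows:
    print(row)
```

In a job, the xmlInput Stage would play the role of the XML parse and the Column Import stage (or a sequential file read) would play the role of the CSV reader.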

You have lots more to do, but this should get you started.

Ernie

Posted: Fri Sep 18, 2015 9:01 am
by chulett
FYI - I cleaned up the posted XML for you.

Read up on the xmlInput stage and the XML Metadata Import utility as a starter.

Posted: Fri Sep 18, 2015 11:32 am
by abhinavagarwal
A very simple one.

To avoid rewriting everything here, I'm giving you pointers. Use the following links:

http://pr3systems.com/blog/information- ... -xml-pack/

http://etlimpact.blogspot.com/2012/08/r ... using.html

http://it.toolbox.com/wiki/index.php/Lo ... rallel_job

The DS documentation is also very comprehensive on XML; you may want to try that as well.

Best regards