Reading .xml file In DataStage

Oritech
Premium Member
Posts: 140
Joined: Thu May 07, 2009 9:32 pm

Post by Oritech »

How can we read an .xml file in DataStage?

When we open the .xml file in Notepad++, we see the data below (only a sample is shown):

Line 1 -

<Proponix>
  <Header>
    <DestinationID>ABC</DestinationID>
    <SenderID>PRO</SenderID>
    <OperationOrganizationID>LANZ3073</OperationOrganizationID>
    <MessageType>PRFADAT</MessageType>
    <DateSent>20150618</DateSent>
    <TimeSent>0000</TimeSent>
    <MessageID>159649037</MessageID>
  </Header>
  <SubHeader>
    <InstrumentID></InstrumentID>
    <ActivityType></ActivityType>
    <ActivitySequenceNumber></ActivitySequenceNumber>
    <DocumentType></DocumentType>
    <WorkItemNumber></WorkItemNumber>
    <DocumentName>CURRENT, DCF_ABC_AB18062015</DocumentName>
    <FaxPrintorData>D</FaxPrintorData>
    <ReceiverFax></ReceiverFax>
    <FaxSubject></FaxSubject>
    <FaxComments></FaxComments>
    <OnlineorTemporal>B</OnlineorTemporal>
    <DistributeIndicator></DistributeIndicator>
    <FinalorDraftorCopy>BOR</FinalorDraftorCopy>
    <FileType>CSV</FileType>
    <UserID>DCF-TOD-S-LANZ3073-HD</UserID>
    <Product></Product>
    <ProductType></ProductType>
    <NumberofPages></NumberofPages>
    <NumberofPrints></NumberofPrints>
  </SubHeader>
  <UserHeader>
    <UserHeader1></UserHeader1>
    <UserHeader2></UserHeader2>
    <UserHeader3></UserHeader3>
    <UserHeader4></UserHeader4>
    <UserHeader5></UserHeader5>
  </UserHeader>
  <Body>
    <CoreInformation>

Line 2 - blank, i.e. no data

Line 3 - column names (separated by commas)

Line 4 - blank, i.e. no data

Line 5 - data separated by commas

Line 6 - blank, i.e. no data

Line 7 - data separated by commas

-
-
-

and so on.

How can we get DataStage to read this file?

thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Get your hands on the XSD. Whoever provided you the file should have it. I'm guessing you've never done any work with XML before, yes?

And open the file in something XML-aware, something other than Notepad: Internet Explorer, for example, or an editor that supports XML, such as UltraEdit.
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

This is interesting: you have what looks like a .csv file "embedded" inside of XML. Very doable, but you will need an XSD for this. Alternatively, if there is no XSD (or even if there is, but this XML is relatively small, say less than 100 MB), you can just use the xmlInput Stage and the XML metadata importer (Import > Table Definitions > XML Table Definitions) to figure out what elements you need. Ultimately you will want "Body". You can send that to a sequential file target and then read it on its own, and/or, once you get to know the file really well, use the Column Import stage to parse it on the fly.
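To see the "CSV embedded inside XML" layout outside DataStage, a rough Python sketch like the one below may help. It is not a DataStage job. It assumes the comma-separated lines sit directly inside CoreInformation, which the truncated sample does not confirm, and the column names and values in SAMPLE are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: the element names come from the posted fragment,
# but the column names and values inside CoreInformation are invented.
SAMPLE = """<Proponix>
  <Header><MessageID>159649037</MessageID></Header>
  <Body>
    <CoreInformation>

ColA,ColB,ColC

1,alpha,x

2,beta,y
</CoreInformation>
  </Body>
</Proponix>"""


def embedded_csv_rows(xml_text, path="Body/CoreInformation"):
    """Return the CSV rows embedded in the element at `path` as dicts."""
    root = ET.fromstring(xml_text)
    node = root.find(path)
    if node is None or node.text is None:
        return []
    # Drop the blank lines that separate the header line and the data lines.
    lines = [ln.strip() for ln in node.text.splitlines() if ln.strip()]
    # The first remaining line is treated as the column-name line.
    return list(csv.DictReader(io.StringIO("\n".join(lines))))


rows = embedded_csv_rows(SAMPLE)
for row in rows:
    print(row)
```

This mirrors the two-step idea above: pull out the element's text first, then parse that text as delimited data on its own.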

To get started, use that XML metadata importer and open up the ENTIRE tree when you get there. Check only the innermost boxes that you see. You didn't give us the whole document, but it is very likely that Body repeats, with this other header stuff above it. The whole thing should end with "</Proponix>", assuming this is valid XML. As Craig notes, definitely open this in IE or another XML-aware browser so that you can fully appreciate it. After you check "Body", you will probably make Body a key, and then save the table definition to be loaded onto the output link of your xmlInput Stage.
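If no XML-aware browser is handy, a small script can do a similar sanity check: confirm the document is well-formed (closing tags present, ending with "</Proponix>") and list the innermost elements. This is only a sketch using Python's standard library. The function name leaf_paths and the "complete" sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def leaf_paths(xml_text):
    """Return (well_formed, sorted unique paths of the innermost elements)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Unclosed or mismatched tags end up here.
        return False, []

    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        children = list(node)
        if not children:
            # No child elements: this is an innermost element.
            paths.add(path)
        for child in children:
            walk(child, path)

    walk(root, "")
    return True, sorted(paths)


# A truncated fragment like the one posted: no closing tags, so not well-formed.
truncated = "<Proponix><Body><CoreInformation>"

# Hypothetical complete document, invented for illustration.
complete = (
    "<Proponix><Header><SenderID>PRO</SenderID></Header>"
    "<Body><CoreInformation>a,b</CoreInformation></Body></Proponix>"
)

print(leaf_paths(truncated))
print(leaf_paths(complete))
```

The innermost paths it prints correspond roughly to the boxes you would check in the importer's tree view.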

Search this forum for more ideas on using the xmlInput Stage, and go to my blog at www.dsrealtime.com and use the table of contents in the upper right corner. Look for a post on reading XML; it describes how you should be sending data into this Stage. For your initial testing, you may need to put this document into your Project directory after you follow the technique listed in that blog entry.

You have lots more to do, but this should get you started.

Ernie
Ernie Ostic

chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

FYI - I cleaned up the posted XML for you.

Read up on the xmlInput stage and the XML Metadata Import utility as a starter.
-craig

"You can never have too many knives" -- Logan Nine Fingers
abhinavagarwal
Participant
Posts: 26
Joined: Thu Jun 19, 2008 12:39 am
Location: Atlanta

Post by abhinavagarwal »

This is a very simple one.

To avoid rewriting everything here, I'm giving you pointers. Use the following links:

http://pr3systems.com/blog/information- ... -xml-pack/

http://etlimpact.blogspot.com/2012/08/r ... using.html

http://it.toolbox.com/wiki/index.php/Lo ... rallel_job

The DataStage documentation is also very comprehensive on XML; you may want to try that as well.

Best regards
- Thanks and Regards,
Abhinav Agarwal