Problem reading XML in XML-Input

rroy2
Participant
Posts: 21
Joined: Thu Jan 03, 2008 4:16 pm

Problem reading XML in XML-Input

Post by rroy2 »

Hi,

The problem I am facing goes like this.

I created a parallel job to read an XML file using the External Source stage and the XML Input stage. I can read a file whose tags are uniform (that is, every tag with the same name has the same number of attributes and child tags), e.g.

<tag1>
<tag2 property="1">
<tag3 property="1">abcd</tag3>
</tag2>
<tag2 property="2">
<tag3 property="2">efgh</tag3>
</tag2>
</tag1>

But if the file has imbalanced tags (i.e. tags at, say, level 2 have 3, 4 or 5 subtags), then DataStage fails to import its metadata through the plugin. Worse, it fails to read the file even if I somehow load the metadata.

My file looks like this:

<?xml version="1.0"?>
<!DOCTYPE doctype SYSTEM "http://someurl.com/abcd.dtd">
<fileRply createDateTime="2007-09-03T02:04:03" coName="Mycomp" rqstType="Default">
<Rply prodCode="abc" procCode="default">
<elemProcessed>
<elemName>mass_1</elemName>
<keyItems>
<keyItem keyName="type">Org</keyItem>
<keyItem KeyName="procInstrCode">Default</keyItem>
<keyItem KeyName="acctNo">11111</keyItem>
<keyItem KeyName="delimit">|</keyItem>
<keyItem KeyName="prodCode">Gateway</keyItem>
</keyItems>
<success>
</success>
</elemProcessed>
<elemProcessed>
<elemName>record</elemName>
<keyItems>
<keyItem keyName="record">|12345|A|007|11111|000000||</keyItem>
</keyItems>
<error>
<rplyItem severity="Elementrejected">
<rplyCode>77777</rplyCode>
<rplyMsg>Transaction Exception encountered</rplyMsg>
<rplyText>java.net.ConnectionException: Connection timed out</rplyText>
</rplyItem>
</error>
</elemProcessed>
</Rply>
</fileRply>

By imbalance I mean, for example, that 'mass_1' has more keyItem tags than 'record'.

ISSUE 1) When I try to import the metadata, I get tags only up to the <success> tag. Nothing after it appears in the metadata tree. Why?

ISSUE 2) I removed that tag and got the metadata, but when I executed the job it ran and returned 0 records. Does DataStage require an IDEAL XML file like the sample above, or can it read any XML file?

Please reply if more info is needed.

Thanks
roy
rroy2
Participant
Posts: 21
Joined: Thu Jan 03, 2008 4:16 pm

Post by rroy2 »

Hi,

I dug more into it. It seems the statement

<!DOCTYPE doctype SYSTEM "http://someurl.com/abcd.dtd">

is causing an issue. (I can't post the actual DTD URL for security reasons.) Whatever DTD URL I specify, the XML metadata plugin keeps 'processing' it and then reports 'operation timeout'. Only after I removed this statement and put a space ' ' between the <success> and </success> tags was I able to see the metadata tree and parse the file.
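If the DOCTYPE line has to go before the job can read the file, one option is to strip it in a pre-processing step rather than by hand. What follows is only a minimal sketch in Python, run outside DataStage. The helper name strip_doctype and the cut-down sample are made up for illustration, and the pattern assumes the declaration has no internal subset (no [...] part).

```python
import re
import xml.etree.ElementTree as ET

# Matches a DOCTYPE declaration that has no internal subset (no "[...]" part),
# plus any whitespace that follows it.
DOCTYPE_RE = re.compile(r"<!DOCTYPE[^>\[]*>\s*")


def strip_doctype(xml_text: str) -> str:
    """Return xml_text with the first simple DOCTYPE declaration removed."""
    return DOCTYPE_RE.sub("", xml_text, count=1)


# Cut-down version of the posted file (illustrative only).
sample = """<?xml version="1.0"?>
<!DOCTYPE doctype SYSTEM "http://someurl.com/abcd.dtd">
<fileRply coName="Mycomp">
<Rply prodCode="abc"><elemProcessed><success> </success></elemProcessed></Rply>
</fileRply>"""

cleaned = strip_doctype(sample)

# Sanity check: the cleaned text is still well-formed XML.
root = ET.fromstring(cleaned)
print(root.tag)
```

Keep in mind that dropping the DOCTYPE also drops anything the DTD would have supplied, such as entity definitions or default attribute values.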

Still working on it. Suggestions are welcome.

Thanks!
VCInDSX
Premium Member
Posts: 223
Joined: Fri Apr 13, 2007 10:02 am
Location: US

Post by VCInDSX »

Hi Roy,

I think this should ideally be in the Parallel forum.

Anyway, check out
viewtopic.php?t=116649
viewtopic.php?t=114855

You are probably hitting the namespace-related issue in version 8.0 that Ernie has mentioned several times in various threads, and you might need a patch.

Search a bit more on "namespace" and check with your provider on patches for XML issues.

Good luck,
-V
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

My first guess would be Vincent's also.... although I wonder, because it's a different symptom than I've seen. If I get a chance I'll try your document in a 7.5 Server just to see if it's alright.

As for your zero records, that's most likely an XPath issue. Carefully check your spellings and the way you have laid out the columns. Success is an empty element, and "appears" to occur only once per keyItems unit, but in theory it could have its own deeper repeating group, which would require its own path...

Ernie
Ernie Ostic

VCInDSX
Premium Member
Posts: 223
Joined: Fri Apr 13, 2007 10:02 am
Location: US

Post by VCInDSX »

Roy,
How was the metadata (XML Table Definition) created? Did you use an XSD to create your table definition, or did you use the XML itself?

Like Ernie said, your XPath expressions, if not laid out correctly, might have issues when parsing the input file and emitting the "desired" output.

Would it be possible for you to post your XPath expression list?

Thanks
-V
rroy2
Participant
Posts: 21
Joined: Thu Jan 03, 2008 4:16 pm

Post by rroy2 »

Hi,

Sorry, I was away.

I added the XPaths through the XML Metadata Importer facility. I read a sample XML file (containing the tags that I require for parsing). The importer was adding 'ns1', which I removed as it was causing the parsing to fail.

The XPaths are as follows:

/fileRply/@createDateTime
/fileRply/Rply/elemProcessed/keyItems/keyItem/@keyName
/fileRply/Rply/elemProcessed/keyItems/keyItem/text()
/fileRply/Rply/elemProcessed/error/rplyItem/@severity
/fileRply/Rply/elemProcessed/error/rplyItem/rplyCode/text()
/fileRply/Rply/elemProcessed/error/rplyItem/rplyMsg/text()
/fileRply/Rply/elemProcessed/error/rplyItem/rplyText/text()

I doubt it's misspellings, since without the DOCTYPE statement the file is parsed with the same XPaths. So it seems that a blank <success> tag and the DOCTYPE each cause zero records to be parsed when either is included.
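A quick way to double-check paths like these outside DataStage is to count how many nodes each one matches in a sample file. The sketch below uses Python's standard library and is only a rough check: count_matches is a hypothetical helper that handles simple absolute paths ending in an element, /@attr or /text(), and the sample is a cut-down copy of the posted file with the attribute spelling kept exactly as posted.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the posted file (DOCTYPE left out, attribute spelling as posted).
SAMPLE = """<fileRply createDateTime="2007-09-03T02:04:03" coName="Mycomp" rqstType="Default">
<Rply prodCode="abc" procCode="default">
<elemProcessed>
<elemName>mass_1</elemName>
<keyItems>
<keyItem keyName="type">Org</keyItem>
<keyItem KeyName="procInstrCode">Default</keyItem>
</keyItems>
<success> </success>
</elemProcessed>
<elemProcessed>
<elemName>record</elemName>
<keyItems>
<keyItem keyName="record">|12345|A|007|11111|000000||</keyItem>
</keyItems>
<error>
<rplyItem severity="Elementrejected">
<rplyCode>77777</rplyCode>
</rplyItem>
</error>
</elemProcessed>
</Rply>
</fileRply>"""

XPATHS = [
    "/fileRply/@createDateTime",
    "/fileRply/Rply/elemProcessed/keyItems/keyItem/@keyName",
    "/fileRply/Rply/elemProcessed/keyItems/keyItem/text()",
    "/fileRply/Rply/elemProcessed/error/rplyItem/@severity",
    "/fileRply/Rply/elemProcessed/error/rplyItem/rplyCode/text()",
]


def count_matches(root, xpath):
    """Count nodes matched by a simple absolute path (hypothetical helper)."""
    steps = xpath.strip("/").split("/")
    if steps[0] != root.tag:
        return 0
    last = steps[-1]
    if last.startswith("@") or last == "text()":
        elem_steps = steps[1:-1]
    else:
        elem_steps, last = steps[1:], None
    # ElementTree paths are relative to the element they are called on.
    elems = root.findall("./" + "/".join(elem_steps)) if elem_steps else [root]
    if last is None:
        return len(elems)
    if last == "text()":
        return sum(1 for e in elems if e.text and e.text.strip())
    # XML attribute names are case-sensitive: "keyName" does not match "KeyName".
    return sum(1 for e in elems if last[1:] in e.attrib)


root = ET.fromstring(SAMPLE)
counts = {xp: count_matches(root, xp) for xp in XPATHS}
for xp, n in counts.items():
    print(n, xp)
```

With this sample, the @keyName path matches fewer keyItem elements than the text() path. In the file as posted, some keyItem elements spell the attribute KeyName rather than keyName, and XML names are case-sensitive. That may just be a retyping slip in the post, but it is the kind of difference a count like this brings out.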

Thanks
roy
VCInDSX
Premium Member
Posts: 223
Joined: Fri Apr 13, 2007 10:02 am
Location: US

Post by VCInDSX »

Ernie,
Correct me if I am wrong. The OP has the element <elemProcessed> repeating within the parent structure "/fileRply/Rply/".

Also, <keyItem> (singular) repeats under <keyItems>:
<keyItems>
<keyItem>....</keyItem>
....
<keyItem>....</keyItem>
</keyItems>

But the XPaths from the OP do not seem to take that into account. How would the stage behave in these cases? Shouldn't such instances be handled in a branched-out link?

Also, what is your repeating element, Roy? How have you set up the "Repetition element required" property?

How your DTD is defined in terms of min and max occurrences for the various elements will drive your job design.
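To make the question about the repeating element concrete, here is a rough illustration in Python. It is not DataStage and not how the XML Input stage is implemented; it only shows how the choice of repeating element changes the number of rows you would expect from the same document. The function names and the cut-down sample (with a consistent keyName spelling) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Cut-down sample: 'mass_1' has two keyItem children, 'record' has one.
SAMPLE = """<fileRply>
<Rply>
<elemProcessed>
<elemName>mass_1</elemName>
<keyItems>
<keyItem keyName="type">Org</keyItem>
<keyItem keyName="acctNo">11111</keyItem>
</keyItems>
</elemProcessed>
<elemProcessed>
<elemName>record</elemName>
<keyItems>
<keyItem keyName="record">|12345|A|007|11111|000000||</keyItem>
</keyItems>
</elemProcessed>
</Rply>
</fileRply>"""


def rows_by_key_item(root):
    """One row per <keyItem>; the parent's elemName is repeated on each row."""
    rows = []
    for ep in root.findall("./Rply/elemProcessed"):
        elem_name = ep.findtext("elemName")
        for item in ep.findall("./keyItems/keyItem"):
            rows.append((elem_name, item.get("keyName"), item.text))
    return rows


def rows_by_elem_processed(root):
    """One row per <elemProcessed>; the keyItem values are collapsed into a list."""
    return [
        (ep.findtext("elemName"), [k.text for k in ep.findall("./keyItems/keyItem")])
        for ep in root.findall("./Rply/elemProcessed")
    ]


root = ET.fromstring(SAMPLE)
per_item = rows_by_key_item(root)
per_elem = rows_by_elem_processed(root)
print(len(per_item), "rows when keyItem is the repeating element")
print(len(per_elem), "rows when elemProcessed is the repeating element")
```

The same document gives a different row count depending on which level is treated as repeating, which is why the min and max occurrence rules matter when laying out the links.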

Ernie/Craig are the gurus on the subject. Can you post your DTD, or at least the sections that define the min and max occurrence rules?

Thanks,
-V