Problem parsing xml

eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Hmm, that's an interesting problem indeed. From the looks of things, QC and Stability are "sub-elements" of Site. You should be able to just grab the values of those two elements as columns on each Site row.

QC would have "/Ext/Site/QC/text()" and
Stability would have "/Ext/Site/Stability/text()"

The repetition element could be a bit tricky because the two sub-elements appear to be mutually exclusive in this example. I'd probably first try a "Site" field with the xpath "/Ext/Site" and make it the repetition element.

You should get two rows, one with Nulls for Stability and another with Nulls for QC.
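To picture the result outside DataStage, here is a minimal Python sketch using the standard library's ElementTree. The sample document is made up for illustration (two Site elements, each holding only one of the two sub-elements), and the loop only mimics what a repetition element on /Ext/Site should give you; it is not how the XML Input stage works internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: QC and Stability are mutually exclusive per Site.
SAMPLE = """<Ext>
  <Site><QC>qc-value</QC></Site>
  <Site><Stability>stab-value</Stability></Site>
</Ext>"""

def rows_per_site(xml_text):
    """Return one row per Site, with None where a sub-element is absent."""
    root = ET.fromstring(xml_text)  # root is <Ext>
    rows = []
    for site in root.findall("Site"):  # plays the role of the repetition element
        rows.append({
            "QC": site.findtext("QC"),                # like /Ext/Site/QC/text()
            "Stability": site.findtext("Stability"),  # like /Ext/Site/Stability/text()
        })
    return rows

rows = rows_per_site(SAMPLE)
for row in rows:
    print(row)
```

Each Site yields exactly one row, and the missing sub-element comes back as None, which corresponds to the Null column described above.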

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
anusha
Premium Member
Posts: 37
Joined: Wed Nov 28, 2007 1:29 am
Location: pune

Post by anusha »

Ernie,
We have set the Site XPath as the repetition element, but we are not even getting nulls in the Stability data.

Basically, QC and Stability are at the same level.

QC would have "/Ext/Site/@QC" and
Stability would have "/Ext/Site/@Stability"

How do we get the data populated for both tags?

Or is there an alternate design approach we should be trying?
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Well, for starters, QC and Stability are not attributes, so if that's the xpath you have, you will definitely not get anything. At the very least, I would first try these columns on your output link:

SiteXML (check yes for key) /Ext/Site
St /Ext/Site/@St
Cr /Ext/Site/@Cr
QC /Ext/Site/QC
Stability /Ext/Site/Stability

It's not clear from your example whether QC and Stability have further element content or just strings, but the test above will get their whole element, assuming it works.
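The attribute-versus-element difference is easy to check outside DataStage. A minimal sketch with Python's standard ElementTree follows; the Site content here is a made-up stand-in (the Cr and St values are taken from the thread, the QC and Stability text is invented), so treat it only as an illustration of why an "@QC" path comes back empty.

```python
import xml.etree.ElementTree as ET

# Hypothetical Site: Cr and St are attributes, QC and Stability are child elements.
SAMPLE = '<Ext><Site Cr="A1" St="ABCD"><QC>q</QC><Stability>s</Stability></Site></Ext>'

site = ET.fromstring(SAMPLE).find("Site")

attr_qc = site.get("QC")   # attribute lookup, like /Ext/Site/@QC
elem_qc = site.find("QC")  # child element lookup, like /Ext/Site/QC
cr = site.get("Cr")        # a real attribute, like /Ext/Site/@Cr

print("attribute QC:", attr_qc)
print("element QC text:", elem_qc.text)
print("attribute Cr:", cr)
```

The attribute lookup for QC finds nothing because no such attribute exists on Site, while the same name resolves fine as a child element.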

Now, if Stability and QC have deeper definition, or if they repeat within Site, then we have an entirely different problem. You will then want a separate output link for each: one that drills into QC and another that drills into Stability, each with its own deeper repetition element. This is also true if they aren't "deeper" but either one repeats.
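As a rough picture of the "one link per repeating path" idea, here is a Python sketch using the standard ElementTree. The child names Test and Check are hypothetical (the real sub-structure of QC and Stability isn't shown at this point in the thread), and link_rows is just an illustrative helper, not a DataStage function. Each call plays the role of one output link with its own repetition element, carrying the parent Site's Cr attribute along as a key.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: Test and Check are invented repeating children.
SAMPLE = """<Ext>
  <Site Cr="A1" St="ABCD">
    <QC><Test>t1</Test><Test>t2</Test></QC>
    <Stability><Check>c1</Check></Stability>
  </Site>
  <Site Cr="B2" St="ABCD">
    <Stability><Check>c2</Check><Check>c3</Check></Stability>
  </Site>
</Ext>"""

def link_rows(root, branch, leaf):
    """One 'output link': a row per Site/<branch>/<leaf>, keyed by Site's Cr."""
    rows = []
    for site in root.findall("Site"):
        for node in site.findall(f"{branch}/{leaf}"):
            rows.append({"Cr": site.get("Cr"), leaf: node.text})
    return rows

root = ET.fromstring(SAMPLE)
qc_rows = link_rows(root, "QC", "Test")                  # link drilling into QC
stability_rows = link_rows(root, "Stability", "Check")   # link drilling into Stability

print(qc_rows)
print(stability_rows)
```

Keeping the two paths on separate "links" means a Site with no QC simply contributes no QC rows, instead of forcing both branches into one row layout.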

Ernie
anusha
Premium Member
Posts: 37
Joined: Wed Nov 28, 2007 1:29 am
Location: pune

Post by anusha »

Ernie,
What you thought is right.
We do have deeper definitions for Stability and QC. The QC and Stability entities hold their own sub-entities and attributes, and those are hierarchical as well.

It's true that they can be mutually exclusive, so we may end up with different combinations. An XML data file:
1. May hold both QC & Stability entities in the same Site.
2. May hold only the QC entity.
3. May hold only the Stability entity.
4. May hold multiple sets of QC & Stability under different Site entities.

So for points 2, 3 and 4, we are not even able to import the XML files with the correct XPath expressions. We end up importing only the start of the first set of QC or Stability entities, even though the XML file is well-formed and validated.

Please advise.
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

No reason you shouldn't be able to import it. Try to make sure that you have a "fully populated" sample document. If you have the xsd, use any of the hundreds of tools that you can find on the web or elsewhere to generate a sample document that has all elements and attributes populated.

...or preferably, if it's a single-document xsd, import that instead.

When you do the import, drill deep into the QC and into Stability for each table definition. You will need a separate table definition for each "single" path thru the hierarchy where you expect multiple rows.

Ernie
anusha
Premium Member
Posts: 37
Joined: Wed Nov 28, 2007 1:29 am
Location: pune

Post by anusha »

Ernie,

I am listing some of the combinations of the file that are expected.


1.
<?xml version="1.0" encoding="utf-8"?>
<Ext SDt="2009-03-08T00:00:00" EDt="2009-03-09T23:59:59" V="1">
  <Site Cr="A1" St="ABCD">
    <Bat ID="B-090108-0000001" Ty="Production">
      <MRPBatID>MERPS_001</MRPBatID>
       .........
    <Study StyId="STY_00019" StyName="stability strty" StyStatus="A" StyEdDt="N/A">
      <Suite SuitID="STS_00005" SuitDes="For Testing" SuitStat="P"/>
      <Bat ID="B-081126-0000003" Ty="Production">
        <MRPBatID>081126-000000001</MRPBatID>
        .....
   </Site>
  <Site Cr="B2" St="ABCD">
    <Study StyId="STY_00019" StyName="stability strty" StyStatus="A" StyEdDt="N/A">
      <Suite SuitID="STS_00005" SuitDes="For Testing" SuitStat="P"/>
      <Bat ID="B-081126-0000003" Ty="Production">
        <MRPBatID>081126-000000001</MRPBatID>
        .....
   </Site>
</Ext>


2.
<?xml version="1.0" encoding="utf-8"?>
<Ext SDt="2009-03-08T00:00:00" EDt="2009-03-09T23:59:59" V="1">
  <Site Cr="A1" St="ABCD">
    <Study StyId="STY_00019" StyName="stability strty" StyStatus="A" StyEdDt="N/A">
      <Suite SuitID="STS_00005" SuitDes="For Testing" SuitStat="P"/>
      <Bat ID="B-081126-0000003" Ty="Production">
        <MRPBatID>081126-000000001</MRPBatID>
        .....
   </Site>
  <Site Cr="B2" St="ABCD">
     <Bat ID="B-090108-0000001" Ty="Production">
      <MRPBatID>MERPS_001</MRPBatID>
       .........
     <Study StyId="STY_00019" StyName="stability strty" StyStatus="A" StyEdDt="N/A">
         <Suite SuitID="STS_00005" SuitDes="For Testing" SuitStat="P"/>
         <Bat ID="B-081126-0000003" Ty="Production">
          <MRPBatID>081126-000000001</MRPBatID>
        .....
   </Site>
</Ext>
When file 2 above is imported through the XML metadata importer, it does not show the XPath expressions for all the tags that sit at the same level of the hierarchy. For file 2, it lists only the XPath expressions under /Ext/Site/Study.


Anusha
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Use the one that is most complete. The metadata importer, in my experience, is not going to scan beyond the initial instance of each node tree.

That's why you need to use a document that has "all" possible nodes fully populated. The best way to do that is to have the xsd and "generate" a sample doc.

...or, as noted earlier, import from the xsd and not from an instance document. Some xsd's are not supported (those that use "import" and are multi-file xsd's), but they don't come up all the time, and of course, if you have such an xsd, you can still do what I mentioned above and easily create a fully populated sample document.

Otherwise, just piece it together. As you are testing, if you find a missing node or element/attribute that you need, just type it in. Better yet, create a new table def, and then on import to the link, use the selection dialog to choose the new column and have it inserted into the link.
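If you want to check which paths a "first instance only" scan would miss in a given file, a quick script outside DataStage can list them. The sketch below uses Python's standard ElementTree on a trimmed-down version of file 2; collect_paths is an illustrative helper, and treating the importer as "scan the first Site only" is a simplifying assumption based on the behaviour described above, not a statement of how the importer is actually written.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of file 2: the first Site has only Study,
# the second Site has both Bat and Study.
SAMPLE = """<Ext>
  <Site Cr="A1"><Study StyId="STY_00019"><Suite SuitID="STS_00005"/></Study></Site>
  <Site Cr="B2"><Bat ID="B-090108-0000001"><MRPBatID>MERPS_001</MRPBatID></Bat><Study StyId="STY_00019"/></Site>
</Ext>"""

def collect_paths(elem, prefix=""):
    """Return the set of element and attribute paths found under elem."""
    path = f"{prefix}/{elem.tag}"
    paths = {path}
    paths.update(f"{path}/@{name}" for name in elem.attrib)
    for child in elem:
        paths |= collect_paths(child, path)
    return paths

root = ET.fromstring(SAMPLE)
all_paths = collect_paths(root)  # every instance of every node

# Assumption: model the importer as looking at the first Site only.
first_only = {"/Ext"} | collect_paths(root.find("Site"), "/Ext")

missing = sorted(all_paths - first_only)
for p in missing:
    print(p)
```

Anything printed is a path that only shows up in a later Site, so it is a candidate to type in by hand or to cover with a fuller sample document.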

Ernie