Validating XML against dynamic xsd

DSDexter
Participant
Posts: 94
Joined: Wed Jul 11, 2007 9:36 pm
Location: Pune,India

Validating XML against dynamic xsd

Post by DSDexter »

Hello,

I have an XML file as one of the sources for my job, and I am validating it against an XSD file. The XML file has child elements, sub-child elements, and so on, up to 4 levels deep.

On any given day I may or may not receive all of the sub-children. Even the tags for these sub-children will not be present in the XML file. In that case my job should handle the missing tags and pass them as nulls.

Is this possible in DataStage? Server or Parallel?

Any thoughts are really welcome.
Thanks
DSDexter
DSDexter
Participant
Posts: 94
Joined: Wed Jul 11, 2007 9:36 pm
Location: Pune,India

Re: Validating XML against dynamic xsd

Post by DSDexter »

Just to add...

When the child records are missing from the XML file, the associated tags are missing as well, whereas my XSD will still have these tags defined.
Thanks
DSDexter
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Hi DSDexter...

DataStage and the XMLInput stage most certainly can handle this circumstance, but it requires that you carefully review your nested parent-child-grandchild (and so forth) relationships, and then get to know the meaning behind, and behavior of, what is called the "Repetition Element."

This confuses many, although it sounds like you may already have been using it successfully to identify (with key=yes) an element on the lowest-level node for which you want rows. This means that the repeating element at that level is what defines the number of individual rows that go down your output link.

The next part of this equation is understanding the setting for "Repetition Element Required". This setting affects your run-time results. It tells DataStage what to do with an incoming node that does NOT have that particular element, i.e., when that element is entirely "missing" from the node being parsed and reviewed at any given time.

This is an issue for retrieval from hierarchical structures. I cut my teeth years ago on a reporting tool called FOCUS. We had a great concept there called a "short path," describing this exact scenario. What should a retrieval tool do when the columns (in this case, elements) don't even exist? A parent with no children. Is it valid? Should it send out nulls?

If you check "Repetition Element is Required," then you will "lose" the entire path if the element does not exist in a particular node. This means that XMLInput will treat a parent that has no children as an invalid row for this particular link, and it will not go down the link. (Another link that describes only the parent, and has its Repetition Element at another level, will still retrieve it.)

If you un-check "Repetition Element is Required," then you will get nulls for sub-level nodes that are "missing" or non-existent.
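
To make the two behaviors concrete, here is a minimal Python sketch of the idea. It is not DataStage code and not how XMLInput is implemented. It only imitates the checked versus un-checked outcome described above. The sample document, the element names (`parent`, `child`, `name`) and the `flatten` helper with its `repetition_required` flag are all hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second <parent> has no <child> elements at all
# (the "short path" case where the tags are entirely missing).
SAMPLE = """
<root>
  <parent id="1">
    <child><name>a</name></child>
    <child><name>b</name></child>
  </parent>
  <parent id="2"/>
</root>
"""


def flatten(xml_text, repetition_required):
    """Return (parent_id, child_name) rows, one per repeating <child>.

    repetition_required is an assumed stand-in for the stage setting:
    True  -> a parent with no <child> produces no row at all.
    False -> a parent with no <child> produces one row with None (null).
    """
    rows = []
    for parent in ET.fromstring(xml_text).iter("parent"):
        children = parent.findall("child")
        if not children:
            if not repetition_required:
                # Keep the parent and pass the missing level as null.
                rows.append((parent.get("id"), None))
            continue
        for child in children:
            rows.append((parent.get("id"), child.findtext("name")))
    return rows


if __name__ == "__main__":
    print(flatten(SAMPLE, repetition_required=True))
    print(flatten(SAMPLE, repetition_required=False))
```

With the flag set to `True`, parent 2 disappears from the output. With it set to `False`, parent 2 comes through once with `None` in the child column, which is the "pass them as nulls" result asked for in the original question.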

Ernie
Ernie Ostic
