xml schema / xsd with complexType and mixed="true"

rcanaran
Premium Member
Posts: 64
Joined: Wed Jun 14, 2006 3:51 pm
Location: CANADA

Post by rcanaran »

As per the example at http://www.w3.org/TR/2004/REC-xmlschema ... xedContent

Xml importer cannot find the text value of the complex element.

I coded the example in an xsd and imported it via the xml schema importer in 7.5. Autochecking the nodes did not check the value for "letterbody". I think autocheck only selects the lowest (leaf) nodes.

MANUALLY checking "letterbody" imports the xpath as ...../letterbody, which brings in the entire xml chunk, including the child elements. To extract only the text in "letterbody" that is not inside a child element, I must manually append "/text()" to the xpath. That is fine if it's just ONE instance, but this complexType is referenced multiple times in the xsd, so I must manually check about 80 additional nodes (after autocheck) and then, after saving the schema, go to each of those lines and add "/text()".
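To make the difference between the two xpaths concrete, here is a minimal Python sketch using only the standard library. The sample document is a made-up, shortened instance loosely modeled on the W3C mixed-content example, and the function names are hypothetical. It is not how the DataStage importer works internally; it only illustrates what ".../letterbody" versus ".../letterbody/text()" conceptually selects. Note that XPath text() returns separate text nodes; joining them into one string is a simplification here.

```python
import xml.etree.ElementTree as ET

# Illustrative instance only (assumption): a shortened mixed-content element.
SAMPLE = (
    "<letter><letterbody>Dear <name>Robert Smith</name>, your order of "
    "<quantity>1</quantity> item shipped.</letterbody></letter>"
)


def whole_chunk(elem):
    """Roughly what .../letterbody selects: the element with all child markup."""
    return ET.tostring(elem, encoding="unicode")


def direct_text(elem):
    """Roughly what .../letterbody/text() selects: text owned by elem itself.

    ElementTree stores the text before the first child in elem.text and the
    text following each child in that child's .tail attribute.
    """
    parts = [elem.text or ""]
    parts.extend(child.tail or "" for child in elem)
    return "".join(parts)


body = ET.fromstring(SAMPLE).find("letterbody")
print(whole_chunk(body))   # includes <name> and <quantity> markup
print(direct_text(body))   # only the text outside the child elements
```

The child values ("Robert Smith", "1") are deliberately absent from the second result; they would be picked up by their own leaf xpaths.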

I can only find ONE unanswered reference on IBM Developerworks.

As it is, the XML importer doesn't appear to understand the full w3 spec at the link above.
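Since the painful part is finding every node that needs "/text()", a rough sketch like the following could list the element declarations in an xsd whose type is declared with mixed="true". Everything here is an assumption for illustration: the sample schema, the type name BodyType, and the function name are made up, and the type lookup naively strips the namespace prefix instead of resolving it. It also ignores extension/restriction of mixed types.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment: one named mixed type referenced twice.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="BodyType" mixed="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="letterbody" type="BodyType"/>
  <xs:element name="footer" type="BodyType"/>
  <xs:element name="id" type="xs:string"/>
</xs:schema>"""


def is_mixed(complex_type):
    """xs:boolean allows both 'true' and '1'."""
    return complex_type.get("mixed") in ("true", "1")


def mixed_elements(xsd_text):
    """Return names of elements whose (named or inline) type is mixed."""
    root = ET.fromstring(xsd_text)
    mixed_types = {
        ct.get("name")
        for ct in root.iter(XS + "complexType")
        if ct.get("name") and is_mixed(ct)
    }
    names = []
    for el in root.iter(XS + "element"):
        # Naive: drop any prefix such as "tns:" rather than resolving it.
        type_name = (el.get("type") or "").split(":")[-1]
        inline = el.find(XS + "complexType")
        if type_name in mixed_types or (inline is not None and is_mixed(inline)):
            names.append(el.get("name"))
    return names


print(mixed_elements(SAMPLE_XSD))
```

Each name it reports is a candidate for a manually added "/text()" column; the list would still need to be matched against the xpaths the importer generates.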

Is there a patch for 7.5x2?
Does the xml schema importer / schema manager for 8.1 or 8.5 parse this automatically?
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

A couple of thoughts here...

a) the importer prior to 8.5 understands a lot of xsd, but doesn't handle all of the constructs.

b) independent mapping of the individual details of letterbody will likely be necessary...

c) never use auto-check. It doesn't understand the hierarchy, which is very critical when selecting your desired columns.

c) in a scenario like this one, you may be better off just importing an xml instance document instead. If you have that many nodes, and they repeat independently, you are going to have that many independent table definitions anyway. Import an xml instance document that has at least one populated instance of each important node that you anticipate wanting to read (I like to use the generate feature from xmlOxygen or xmlSpy if you have it).

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>

Post by rcanaran »

Thanks Ernie. This is very helpful.

re :
a) does the importer for 8.5 (schema manager?) understand all of the current constructs? I think the "mixed=true" is in the 2004 spec, AFTER the 7.5 importer was released.

b & c) The message is a 2 MB to 2 GB CLOB/xml transaction. The structure contains over 2,300 elements which, translated 1-1, become over 125 relational tables. Until this, I've had no issue with autocheck in 7.5; in 8.1, yes, I've had issues. Is there a better tool/utility (3rd party even) to select all data nodes without also getting all the intermediate xml chunks? Manual selection seems far too error-prone, especially when dealing with thousands of elements.
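To illustrate what "all data nodes without the intermediate chunks" could mean, here is a small Python sketch that walks a sample instance document and emits an xpath for every leaf element, plus a "/text()" xpath for any non-leaf element that carries text of its own (mixed content). The function name and sample are hypothetical, attributes and namespaces are ignored, and the output is only a checklist to compare against what the importer selected, not something DataStage consumes directly.

```python
import xml.etree.ElementTree as ET


def data_xpaths(elem, prefix=""):
    """Collect xpaths of leaf elements and of mixed-content text."""
    path = f"{prefix}/{elem.tag}"
    children = list(elem)
    if not children:
        return {path}  # leaf: the element itself is the data node
    paths = set()
    # Text owned by this element: before the first child and after each child.
    own_text = (elem.text or "") + "".join(c.tail or "" for c in children)
    if own_text.strip():
        paths.add(path + "/text()")  # mixed content needs the explicit text()
    for child in children:
        paths |= data_xpaths(child, path)
    return paths


# Made-up sample with one plain leaf and one mixed-content element.
SAMPLE = (
    "<order><id>7</id>"
    "<letterbody>Dear <name>Bob</name>, thanks.</letterbody></order>"
)

for xpath in sorted(data_xpaths(ET.fromstring(SAMPLE))):
    print(xpath)
```

Intermediate containers such as /order never appear in the result unless they hold text themselves, which is the behaviour I was hoping autocheck would give.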

d) -- I'm assuming the 2nd c) is a d) :D
Thanks again. I will try that, but no single instance has every xpath populated. I'll have to try merging several imports, but it's worth a go.
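Before merging imports, it may help to check whether a set of sample messages jointly covers every xpath. A minimal sketch, assuming the samples are available as strings; the function names and the two tiny documents are made up. It only computes the union of element paths and does not build a merged instance document.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Return the set of element paths present in one instance document."""
    paths = set()

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        paths.add(path)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def merged_coverage(documents):
    """Union of the paths populated across several instance documents."""
    covered = set()
    for text in documents:
        covered |= element_paths(text)
    return covered


# Hypothetical samples: neither one alone populates every path.
DOC_A = "<msg><a>1</a></msg>"
DOC_B = "<msg><b><c>2</c></b></msg>"

merged = merged_coverage([DOC_A, DOC_B])
print(sorted(merged))
print(sorted(merged - element_paths(DOC_A)))  # paths DOC_A is missing
```

Comparing the union against the paths in each document shows which samples contribute something new, so only those need to be imported.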

And no xml spy or oxygen at this client's.

Post by eostic »

a) the new xml stage lives and breathes on xsd's. So yes...but there are many considerations regarding a move to 8.5. This alone wouldn't be a good enough reason, especially if your ultimate goal is just reading.

b) Not sure what you mean here. Are you autochecking only for individual nodes? You still have to have separate links and table defs anyway, for any nodes that repeat....and I don't like surprises. I'd rather take the time to build the table for each link/tabledef than get burnt by the defaults.

c) Get your own copy. It's worth the investment, and you'll be very happy you did once you get to 8.5, where xsd's are not only supported but a required prerequisite. There will be many times when you are given an xml document and no xsd, and then need to generate one on your own.

Ernie

Post by rcanaran »

a) It isn't up to me. Client is upgrading to 8.1 or 8.5 anyway.

b) usually, I just need the lowest/leaf of every branch

c) Yes, it looks like I'll have to.