XML output stage: Problem joining nodes and defining xpath

Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi
Out of the several nodes in my XML, this is one node I'm trying to
design, which I eventually have to attach to the other nodes.

<b1111>
	<a1111>
		<aaaa>1</aaaa>
		<bbbb>B</bbbb>
		<cccc>2012-06-01</cccc>
		<dddd>
			<eeee>1</eeee>
			<ffff>a role</ffff>
		</dddd>
		<gggg>
			<hhhh>100001</hhhh>
			<iiii>100001</iiii>
			<jjjj>Y</jjjj>
			<kkkk>2013-01-01</kkkk>
		</gggg>
		<llll>
			<mmmm>Manager</mmmm>
			<nnnn>2012-08-01</nnnn>
		</llll>
	</a1111>
</b1111>
Here the child elements <gggg> and <llll> are repeating elements, so I separated the design into two files: one covering <a1111> through <gggg>, and another for <llll>. I introduced a key and joined the two files. However, when I created the first file starting from <a1111>, DataStage generated the closing tag </a1111> as well, and the same happened for the <llll> file, so the final output I got is:

	<a1111>
		<aaaa>1</aaaa>
		<bbbb>B</bbbb>
		<cccc>2012-06-01</cccc>
		<dddd>
			<eeee>1</eeee>
			<ffff>a role</ffff>
		</dddd>
		<gggg>
			<hhhh>100001</hhhh>
			<iiii>100001</iiii>
			<jjjj>Y</jjjj>
			<kkkk>2013-01-01</kkkk>
		</gggg>
	</a1111>
	<a1111>
		<llll>
			<mmmm>Manager</mmmm>
			<nnnn>2012-08-01</nnnn>
		</llll>
	</a1111>
Can anyone please tell me how to fix this, so that I can join both files under the same node?

One more important question: if I leave <a1111> out of the description in the XML Output stage and add it at the output of the Lookup stage instead, how do I define the XPath for the first three elements?
<aaaa>1</aaaa>
<bbbb>B</bbbb>
<cccc>2012-06-01</cccc>
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

This looks like a scenario for multiple hjoins, one for each of the "lists" that you have. We don't know the entire input to this Assembly, but perhaps you have one link with all the fixed information on it, along with the key(s); another link with multiple rows for the <gggg> repeats, along with the same key(s); and a third link with multiple rows for the <llll> repeats, again with the same keys. That means two hjoins: the first to get the <gggg> rows mixed in with the initial link, and then another for the <llll> rows.
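
To make the idea concrete outside DataStage, here is a minimal Python sketch of the same hierarchical join. It is not the hjoin step itself, only an illustration of what it produces. The column name `key`, the sample rows, and the function names are assumptions made up for this example; the element names come from the XML posted above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical flat "links": one parent link and one link per repeating list.
# The join column name "key" is an assumption for illustration.
parents = [{"key": "1", "aaaa": "1", "bbbb": "B", "cccc": "2012-06-01"}]
gggg_rows = [
    {"key": "1", "hhhh": "100001", "jjjj": "Y"},
    {"key": "1", "hhhh": "100002", "jjjj": "N"},
]
llll_rows = [{"key": "1", "mmmm": "Manager", "nnnn": "2012-08-01"}]


def group_by_key(rows):
    """Collect the child rows that share a key, preserving their order."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["key"]].append(row)
    return grouped


def build(parents, gggg_rows, llll_rows):
    """Nest both repeating lists under a single <a1111> per parent row."""
    lists = (("gggg", group_by_key(gggg_rows)), ("llll", group_by_key(llll_rows)))
    root = ET.Element("b1111")
    for parent in parents:
        a = ET.SubElement(root, "a1111")
        for name in ("aaaa", "bbbb", "cccc"):
            ET.SubElement(a, name).text = parent[name]
        # One pass per list, like one hjoin per list.
        for list_name, by_key in lists:
            for row in by_key.get(parent["key"], []):
                child = ET.SubElement(a, list_name)
                for col, value in row.items():
                    if col != "key":
                        ET.SubElement(child, col).text = value
    return root


root = build(parents, gggg_rows, llll_rows)
print(ET.tostring(root, encoding="unicode"))
```

The point of the sketch is that the parent element is opened once per key and every list is attached inside it, rather than each list carrying its own copy of the parent.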

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi Ernie..
I'm using the old XML Output stage. Can you please tell me how it can be done
with that one?

thanks
MJ
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

ah. The xmlOutput Stage.....it's the same idea, except now you are doing it "outside" of the Stage and bringing everything back together. It's significantly more difficult because you have to do all the building (and saving/looking up) of the different lists yourself.....it will require multiple xml Stages [this is certainly one of many reasons why the xml stage is more powerful than its predecessor, as it can handle multiple incoming links and perform the hierarchical joining within the stage itself].

The best resource for that is the xml best practices document that was written a long time ago by one of our engineers, and is still floating around.....K. Duke probably still has it at his site....do some searches, there are multiple threads on it in here.

There is a whole section on how to create each of the different repeating nodes in their own xmlOutput Stage, then look them up and bring them back together in a final xmlOutput Stage.
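
As a rough illustration of that pattern (not taken from the best practices document), the following Python sketch builds each repeating node as a pre-rendered XML string per key, then looks the strings up and places them under one parent element. The `key` column, the sample rows, and the helper names `make_chunks` and `assemble` are all hypothetical.

```python
import xml.etree.ElementTree as ET


def make_chunks(rows, element_name):
    """Render every row as an element and concatenate the strings per key.

    This plays the role of one xmlOutput Stage per repeating node.
    """
    chunks = {}
    for row in rows:
        elem = ET.Element(element_name)
        for col, value in row.items():
            if col != "key":
                ET.SubElement(elem, col).text = value
        rendered = ET.tostring(elem, encoding="unicode")
        chunks[row["key"]] = chunks.get(row["key"], "") + rendered
    return chunks


def assemble(parent, *chunk_lookups):
    """Look up each chunk by key and embed it inside a single <a1111>."""
    a = ET.Element("a1111")
    for name in ("aaaa", "bbbb", "cccc"):
        ET.SubElement(a, name).text = parent[name]
    for lookup in chunk_lookups:
        chunk = lookup.get(parent["key"], "")
        # Wrap the chunk so that several sibling elements parse as one document.
        for child in ET.fromstring(f"<wrap>{chunk}</wrap>"):
            a.append(child)
    return ET.tostring(a, encoding="unicode")


gggg_chunks = make_chunks(
    [{"key": "1", "hhhh": "100001"}, {"key": "1", "hhhh": "100002"}], "gggg"
)
llll_chunks = make_chunks([{"key": "1", "mmmm": "Manager"}], "llll")
parent = {"key": "1", "aaaa": "1", "bbbb": "B", "cccc": "2012-06-01"}

result = assemble(parent, gggg_chunks, llll_chunks)
print(result)
```

Note that the chunks contain only the repeating elements themselves; the parent element is written once, in the final assembly step.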

Ernie
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi Ernie
I have been searching for the document but couldn't find it anywhere;
the link to the website is not working. Can you please share the link if you have it?

thanks MJ
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Kim took his site down and I think he said that the files would be hosted here... someday. In the meantime, since I'm the one that gave it to him to host I should still have a copy. Let me see if I can find it when I get home.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi Craig, I would be really grateful if you could share that. I've been tearing my hair out for the past week.

thanks
MJ
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi Craig or anyone, can someone please post that file?

Your help is appreciated, thanks.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I sent you a PM yesterday asking for your email address, please respond to that with it and I will. :wink:

OK... never mind that. I've uploaded it here for anyone that wants to download it before it finds a more permanent home elsewhere.
-craig
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

thanks a lot craig.. appreciated..
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi
Regarding this issue: I have created all the individual nodes/chunks of the XML file. After that I join the data elements using a Lookup and send the result to the XML file. In the final XML document, only the first data element appears for every row and the rest of the data elements are missing for each row. I have tried various permutations and combinations of nodes as the key, but to no avail; I get the same result.

Some info
=======
In the final Lookup stage, which joins all the individual nodes, this is what I have specified as the input to the XML stage:

Column name           Description
Node1                 /CRMPers/text()
Node2                 /CRMPers/text()
Node3                 /CRMPers/text()
Node4                 /CRMPers/text()
First I set all the columns as key, then only the last column, and then only the
first column. All the columns have been set to "XML" in the data element of the XML stage, but no use.

Any ideas?

thanks
MJ
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Hi
After a few more tests:
I changed the lookup failure option to "Fail", and now the lookup is failing.
Maybe that's why the data got past the lookup before. In the Director log
I'm getting warnings for the key column I use to join the nodes. The warning is:
When binding input interface field "ROWID" to field "ROWID": Implicit conversion from source type "ustring[max=20]" to result type "string[max=20]": Possible truncation of variable length ustring when converting to string using codepage UTF-8.
My job flow is:

Sqfile-->Transformer-->XMLOutput-->Copy stage-->Lookup-->final xml

It is like this for all four nodes. ROWID is just a pass-through
column in all the stages. Can this data type conversion coming
from the XML stage be the problem? Can someone please tell me how to fix it?

FYI: the lookup stopped showing that warning after two or three runs, even though
I hadn't changed anything.

thanks
MJ
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

This is one reason why I always evaluate my performance requirements when reading or writing xml and consider Server Jobs in each scenario. There are times when the functionality is critical, but the volume is tiny....in those cases, Server Jobs, which have very little or no restrictions on datatypes or lengths of varchar text, are a better choice. Issues with truncation, datatypes resulting in dynamic conversions and lookup issues or other related datatype complexities, strings too big for a link [all things that are valid and crucial for high volume, high performance but not necessary for small volume string manipulation] just "go away".

In this scenario, you are approaching it correctly. Debug and evaluate the lookups on their own, independent of the xml work.... and then also practice putting the many final strings together at the target --- a good way to do this is to hard code various strings into different columns in an upstream transformer, until you get the xpath just right on the input link of the final xmlOutput Stage. You will still have only one "key" or repetition element that is controlling aggregation...the other incoming chunks with XML Data Elements will behave as "single occurring" elements in the final construction.
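
One way to picture "evaluate the lookups on their own" is a small stand-alone check that every key on the main link finds a match in each chunk lookup, before any XML is assembled. The Python below is only a sketch of that check; the sample rows, the chunk contents, and the helper name `unmatched_keys` are hypothetical, and only the column name ROWID comes from the thread.

```python
def unmatched_keys(main_rows, lookup, key="ROWID"):
    """Return the key values from the main link that have no entry in the lookup."""
    return [row[key] for row in main_rows if row[key] not in lookup]


# Hypothetical data: three rows on the main link and two chunk lookups,
# one of which has no chunk for ROWID "2".
main_rows = [{"ROWID": "1"}, {"ROWID": "2"}, {"ROWID": "3"}]
node_chunks = {
    "Node1": {"1": "<x/>", "2": "<x/>", "3": "<x/>"},
    "Node2": {"1": "<y/>", "3": "<y/>"},
}

# Check each lookup separately so a failing one can be identified by name.
missing = {
    name: unmatched_keys(main_rows, chunks) for name, chunks in node_chunks.items()
}
print(missing)
```

An empty list for every lookup means the keys line up; anything else points at the lookup to investigate first.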

Ernie
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yay for Server jobs! Reports of their death have been greatly exaggerated. :D
-craig
Maximus_Jack
Premium Member
Posts: 139
Joined: Fri Apr 11, 2008 1:02 pm

Post by Maximus_Jack »

Thanks a lot for your input, Craig and Ernie.

In a day or two I will post how I fixed it. Thanks a lot.