XMLOutput 1 header and n content

mgsteiner · Post by **mgsteiner** » Fri Jul 12, 2013 4:33 am

Hi Guys,

I have the following Situation:

I have to create an XML file with 1 header and 1 content group per row from the input file.
Everything works fine as long as the input rows contain different data.
The problem arises when some rows have exactly the same data. The XML Output stage then builds a group from those rows and creates only 1 content group for all of them.

I need 1 content group per input row, even when the rows are identical.

Any suggestions?

Yes, I have built an XML chunk manually and could handle the problem that way, but that solution is not very comfortable for an output file format with many fields. So I wonder how to solve it using the XML Output stage itself.

Thanks a lot in advance

eostic · Post by **eostic** » Fri Jul 12, 2013 10:12 am

I have found it necessary at times to include a dummy element in the node with an artificial key (use a counter in an upstream transformer to create this additional column)....then the rows are ultimately unique and you'll get the groupings you need....

....depending on the downstream application, the extra element may be fine and not affect anything (if they aren't doing validation, you may be able to just leave it there) or else edit it out in a downstream transformer using ereplace() or similar.

Ernie
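
For readers who cannot see the full post, the artificial-key idea can be sketched outside DataStage. This is only an illustration under assumptions: the sample rows are made up, and `awk` stands in for the upstream transformer's counter. It is not the DataStage derivation itself.

```shell
# Hypothetical input: three rows, two of them identical.
rows='Smith,Zurich
Smith,Zurich
Jones,Bern'

# Prepend the line number (NR) as an artificial key column,
# so that no two rows are identical any more.
keyed=$(printf '%s\n' "$rows" | awk -v OFS=',' '{ print NR, $0 }')
printf '%s\n' "$keyed"
```

With the extra key column in place, the two formerly identical rows differ, which is what keeps them from being grouped together.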

mgsteiner · Post by **mgsteiner** » Mon Jul 15, 2013 2:45 am

That sounds like a very good idea.

Please can you make visible the entire message to me, because I'm still not a Premium member.

Thanks a lot in advance

chulett · Post by **chulett** » Mon Jul 15, 2013 7:26 am

mgsteiner wrote:Please can you make visible the entire message to me, because I'm still not a Premium member.

That kind of defeats the whole purpose of setting up a system like that.

MrBlack · Post by **MrBlack** » Mon Jul 15, 2013 12:07 pm

Here's how you do it, and I'm not a premium poster so you should be able to read all of my message

I had the same issue once, so I added a surrogate key as an extra XML tag in my group to make it unique so that DataStage wouldn't roll up my results. (Granted, rolling up is probably the proper thing to do, but poor XML schema design sometimes forces us to get inventive. But that's off topic.)

Then, once I have generated my XML documents, my job splits off 62 files (again, I create a column that is not used in the XML but is still passed to the XML stage to make the files split). I use an After-Job Exec-SH to leverage the power of Linux: a regular expression removes the extra @ID value in the tag that I didn't want in my files, making them valid according to the XSD. Here's the Linux command that I use to do a find/replace in all my files:

Code:

 find /etl/upload/mydata_*.xml -type f -exec sed -i 's/<datagroup id="[0-9]*">/<datagroup>/g' {} \;
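
To try the command safely before pointing it at real output, something like the following could be run against a scratch copy. It is a minimal sketch: the temporary directory and the sample file contents are hypothetical, and it assumes GNU sed (as found on Linux), where `-i` takes no argument.

```shell
# Scratch directory with a made-up sample file; removed again on exit.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/mydata_1.xml" <<'EOF'
<root>
<datagroup id="1"><value>A</value></datagroup>
<datagroup id="2"><value>A</value></datagroup>
</root>
EOF

# Same sed expression as above, applied to the scratch directory only.
find "$workdir" -name 'mydata_*.xml' -type f \
  -exec sed -i 's/<datagroup id="[0-9]*">/<datagroup>/g' {} \;

# Show the result: both groups survive, but without the id attribute.
result=$(cat "$workdir/mydata_1.xml")
printf '%s\n' "$result"
```

Note that the pattern only matches a `datagroup` tag whose sole attribute is a numeric `id`; a tag with other attributes or a non-numeric key would be left untouched.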