XML Chunks or Not

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ggarze
Premium Member
Posts: 78
Joined: Tue Oct 11, 2005 9:37 am


Post by ggarze »

I am having a lot of issues trying to create a correct XML document, and hopefully someone can help me. The XML should look something like this.

<master>
  <lineofbusiness>
    <field_one></field_one>
    <field_two></field_two>
    etc.....
    <perils>
      <field_x></field_x>
      <field_y></field_y>
      etc....
    </perils>
    <perils>
      <field_x></field_x>
      <field_y></field_y>
      etc....
    </perils>
  </lineofbusiness>
</master>

Both the lineofbusiness data and the perils data come in separate files.

Attempt #1 - Because there were two files, I thought I would have to build this with chunks, so I tried following the "XML Pack Best Practices" document. First I read the perils file in and wrote it out as an XML chunk into a hash file. I also created a chunk from the LOB file, then did a lookup on the pass-through (key) columns to join in the perils chunk, and wrote the result to an XML Output stage. In the XML Output stage's 'Inputs' Columns tab I have the two chunks defined as:

Field       Data element  Description
LOB_XML     XML           /MASTER/LINEOFBUSINESS/text()
PERILS_XML  XML           /MASTER/LINEOFBUSINESS/text()

Job Design:
LOB_FILE ---------------------------> TRN ------> XML Output
                                       ^
                                       |
                                       |
PERILS_FILE --> XML Output --> Hash ---+


The result XML I get is:
<master>
  <lineofbusiness>
    <lineofbusiness>
      <field_one></field_one>
      <field_two></field_two>
      etc.....
    </lineofbusiness>
    <perils>
      <field_x></field_x>
      <field_y></field_y>
      etc....
    </perils>
    <perils>
      <field_x></field_x>
      <field_y></field_y>
      etc....
    </perils>
  </lineofbusiness>
</master>

As you can see, I have two issues. 1) The inner lineofbusiness tag closes before the perils are written, although the perils should be inside lineofbusiness. 2) Why do I have an outer lineofbusiness tag? My assumption for the second issue is that the chunk I created up front produced the inner lineofbusiness, and my final XML stage after the hash lookup produced the outer lineofbusiness because of the XPath /MASTER/LINEOFBUSINESS/text(). However, if I change the XPath to /MASTER/text(), the resulting XML will NOT open.


Attempt #2 - Seeing in attempt #1 that my perils were not under my lineofbusiness, I thought maybe I don't need a chunk at all. The perils are supposed to be within lineofbusiness, so if I merge the two files into one file and then build the XML, I should get a better result.

So in my job I read the merged file in and create the XML using the XML Output stage.
In the XML Output stage I have my fields defined as follows:
Field      Description
field_one  /MASTER/LINEOFBUSINESS/field_one/text()
field_two  /MASTER/LINEOFBUSINESS/field_two/text()
field_x    /MASTER/LINEOFBUSINESS/PERILS/field_x/text()
field_y    /MASTER/LINEOFBUSINESS/PERILS/field_y/text()

Job Design:
COMBINED_FILE ----> TRN ---> XML Output

The result XML I get is:
<master>
  <lineofbusiness>
    <field_one></field_one>
    <field_two></field_two>
    etc.....
    <perils>
      <field_x></field_x>
      <field_y></field_y>
      etc....
    </perils>
  </lineofbusiness>
  <lineofbusiness>
    <perils>
      <field_x></field_x>
      <field_y></field_y>
      etc....
    </perils>
  </lineofbusiness>
</master>

As you can see, the perils now come under the lineofbusiness. However, after each peril the lineofbusiness tag closes and a new one opens, and this pattern continues, instead of ALL the perils for a lineofbusiness being listed before the tag closes.
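To make the symptom concrete, here is a rough Python analogue of what seems to be happening in attempt #2. It is only a sketch of the behaviour, not how the XML Output stage is actually implemented; the sample rows and the `lob_id` column are hypothetical. When the parent element is opened once per input row, a merged file with one row per peril yields one lineofbusiness per peril.

```python
import xml.etree.ElementTree as ET

# Hypothetical merged rows: one row per peril, LOB columns repeated on each row.
rows = [
    {"lob_id": "1", "field_one": "a", "field_x": "x1"},
    {"lob_id": "1", "field_one": "a", "field_x": "x2"},
]

def emit_per_row(rows):
    """Open a new <lineofbusiness> for every input row (the unwanted pattern)."""
    master = ET.Element("master")
    for row in rows:
        lob = ET.SubElement(master, "lineofbusiness")
        ET.SubElement(lob, "field_one").text = row["field_one"]
        perils = ET.SubElement(lob, "perils")
        ET.SubElement(perils, "field_x").text = row["field_x"]
    return master

master = emit_per_row(rows)
print(ET.tostring(master, encoding="unicode"))
```

Both rows belong to the same line of business, yet the result contains two lineofbusiness elements with one perils each.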

I know it's a lot but I've been struggling with this for days. Hopefully someone can help.

Thanks,
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Hard to say exactly, but it looks like you don't have two separate chunks. You have two files with a very definite relationship. Do "perils" rows "belong" to (are subordinate to, are the child of) line-of-business?

If so, then "join" the two. Get a set of rows where, for (say) 100 perils, the line-of-business info (whatever the parent might be) is the same on every row. Then create the XML in a single stage, with a perils column marked as the key, and use the "aggregate" option.

This may not be the case, but if so, it shouldn't be that difficult.
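As a conceptual illustration only (plain Python, not DataStage, and not a description of the stage's internals), the join-then-aggregate idea amounts to grouping the child rows under their parent key and opening the parent element once per group. The `lob_id` join column and the sample data are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical inputs; lob_id stands in for whatever key links the two files.
lob_rows = [{"lob_id": "1", "field_one": "a", "field_two": "b"}]
peril_rows = [
    {"lob_id": "1", "field_x": "x1", "field_y": "y1"},
    {"lob_id": "1", "field_x": "x2", "field_y": "y2"},
]

def build_master(lob_rows, peril_rows):
    """Nest every peril under its parent line of business."""
    # Group child rows by the parent key (the "join").
    perils_by_lob = defaultdict(list)
    for peril in peril_rows:
        perils_by_lob[peril["lob_id"]].append(peril)

    master = ET.Element("master")
    for lob in lob_rows:
        # The parent element is opened once per line of business...
        lob_el = ET.SubElement(master, "lineofbusiness")
        for name in ("field_one", "field_two"):
            ET.SubElement(lob_el, name).text = lob[name]
        # ...and the repeating element is emitted once per child row.
        for peril in perils_by_lob[lob["lob_id"]]:
            peril_el = ET.SubElement(lob_el, "perils")
            for name in ("field_x", "field_y"):
                ET.SubElement(peril_el, name).text = peril[name]
    return master

master = build_master(lob_rows, peril_rows)
print(ET.tostring(master, encoding="unicode"))
```

Here perils is the element that repeats per row, while lineofbusiness is shared by all rows with the same key.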

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
ggarze

Post by ggarze »

Hey Ernie,

Yeah, you are right, the perils belong under the LOB, and I really don't need a chunk here. The problem I was having was that after merging the two (LOB and Perils), when I created the XML I did not have the repeating element in the Perils columns marked as the key. I had the key on the first column of the LOB record, so for every record it was opening and closing the LOB tag.

Even though I don't need to create a chunk, when I did I had that issue of a lineofbusiness tag within a lineofbusiness tag. That was because of my job design. For the chunks, I originally said my perils design was like this:
PERIL_FILE ---> XML Output(chunk) ---> Hash File.

Which was correct. However, my LOB design was actually like this:
LOB_FILE ---> XML Output(chunk) ---> Transformer(join Peril chunk) ---> XML Output(final).

The problem was that I was creating the LOB chunk when I didn't have to. I should have just gone:
LOB_FILE ---> Transformer(join Peril chunk) ---> XML Output(final).

This way it worked, and I only got the one outer lineofbusiness tag.
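For anyone who finds this later, here is a rough Python analogue of the chunk design that worked. It is only a sketch under assumptions (hypothetical `lob_id` key and sample rows, a dict standing in for the hash file), not the actual stage behaviour: the perils are pre-rendered as XML text per key, looked up for each LOB row, and the lineofbusiness wrapper is written exactly once, by the final step.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical inputs; lob_id stands in for the pass-through key columns.
lob_rows = [{"lob_id": "1", "field_one": "a"}]
peril_rows = [
    {"lob_id": "1", "field_x": "x1"},
    {"lob_id": "1", "field_x": "x2"},
]

def build_peril_chunks(peril_rows):
    """Render <perils> fragments per key (a dict plays the hash file)."""
    chunks = defaultdict(str)
    for peril in peril_rows:
        el = ET.Element("perils")
        ET.SubElement(el, "field_x").text = peril["field_x"]
        chunks[peril["lob_id"]] += ET.tostring(el, encoding="unicode")
    return chunks

def build_document(lob_rows, chunks):
    """Look up each LOB's chunk and wrap it in <lineofbusiness> only once."""
    parts = ["<master>"]
    for lob in lob_rows:
        field = ET.Element("field_one")
        field.text = lob["field_one"]
        parts.append("<lineofbusiness>")
        parts.append(ET.tostring(field, encoding="unicode"))
        # The LOB columns are not pre-wrapped as their own chunk, so no
        # second <lineofbusiness> ends up nested inside this one.
        parts.append(chunks.get(lob["lob_id"], ""))
        parts.append("</lineofbusiness>")
    parts.append("</master>")
    return "".join(parts)

document = build_document(lob_rows, build_peril_chunks(peril_rows))
print(document)
```

If the LOB columns were also rendered as a chunk that already carried its own lineofbusiness wrapper, the final step would add a second wrapper around it, which matches the doubled tag from attempt #1.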
eostic

Post by eostic »

:) perfect!
Ernie Ostic
