Multiple rows getting generated.....

rkumar28
Participant
Posts: 43
Joined: Tue Mar 30, 2004 9:39 am


Post by rkumar28 »

Hi,

I am using the XML Input stage to parse an XML file. A Folder stage reads the XML document and passes its output to the XML Input stage, whose output goes to a Sequential File stage. The result is a pipe-delimited flat file.

However, the flat file contains several rows for each row in the XML. I have tried using recordCount, email, and perms as the key (one at a time), but I never get exactly one row in the flat file per row in the XML; I always get several lines of data for each XML row.

I think I am missing something minor. I have spent some time on this but could not figure it out. I would really appreciate it if anyone could suggest a way to resolve this problem.
Below is my XML (it contains two rows):

<?xml version="1.0" ?>
<batch recordCount="100" source="MAIL" creDate="2004-10-23">
<email transDate="2004-10-23" delv="Y">
<eaddr>xyz@xyz.net</eaddr>
<newaddr>xyz1@xyz.net</newaddr>
<btn>1112223333444</btn>
<busres>B</busres>
<tn>1112223333</tn>
<fname>xyz</fname>
<lname>asdd</lname>
<perms>
<ebill value="Y"/>
<econf value="Y"/>
</perms>
</email>

<email transDate="2004-10-24" delv="Y">
<eaddr>xyz@xyz.net</eaddr>
<tn>1112223333</tn>
</email>

</batch>
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

The key field on your XML Input stage controls how many rows are written out. If you choose recordCount as the key you get only one row, because it appears only once, in the XML <batch> tag. If you choose eaddr you get only one row because both rows in the sample have the same value.

Suitable candidates for a key field are transDate, which is different for each record in the sample, or newaddr. Some email records may not have a newaddr value, so under "transformation settings" turn "Repetition element required" off. This ensures that an email entry without a newaddr value, such as the second row, still gets written out.

I'm not sure what you do if you don't have a suitable key field within the XML; perhaps someone else can comment.
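
To make the expected result concrete, here is a minimal sketch outside DataStage, assuming the goal is one pipe-delimited row per <email> element with the batch-level recordCount repeated on each row. It uses Python's standard xml.etree.ElementTree; the function name email_rows and the column order in FIELDS are hypothetical choices for illustration, not anything the XML Input stage defines.

```python
import xml.etree.ElementTree as ET

# The two-record sample from the original post.
SAMPLE = """<?xml version="1.0" ?>
<batch recordCount="100" source="MAIL" creDate="2004-10-23">
<email transDate="2004-10-23" delv="Y">
<eaddr>xyz@xyz.net</eaddr>
<newaddr>xyz1@xyz.net</newaddr>
<btn>1112223333444</btn>
<busres>B</busres>
<tn>1112223333</tn>
<fname>xyz</fname>
<lname>asdd</lname>
<perms>
<ebill value="Y"/>
<econf value="Y"/>
</perms>
</email>
<email transDate="2004-10-24" delv="Y">
<eaddr>xyz@xyz.net</eaddr>
<tn>1112223333</tn>
</email>
</batch>"""

# Hypothetical column order for the flat file.
FIELDS = ["eaddr", "newaddr", "btn", "busres", "tn", "fname", "lname"]


def email_rows(xml_text):
    """Return one pipe-delimited row per <email> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for email in root.findall("email"):
        # Batch-level and email-level attributes come first.
        values = [root.get("recordCount", ""), email.get("transDate", "")]
        # Optional child elements fall back to an empty string when absent.
        values += [email.findtext(name, default="") for name in FIELDS]
        rows.append("|".join(values))
    return rows


for row in email_rows(SAMPLE):
    print(row)
```

The loop is driven by the repeating <email> element, so the row count follows the number of email entries even when optional children such as newaddr are missing.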
rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Hi Kumar,

Check the metadata of the XML which you imported. Try re-importing and changing the definitions. Improperly imported metadata can cause weird results.

Thanks
Rasi
rkumar28
Participant
Posts: 43
Joined: Tue Mar 30, 2004 9:39 am

Post by rkumar28 »

Hi,
Thanks for the help and suggestions; they finally helped. I re-imported the metadata and added a wildcard in the Folder stage (under Properties), something like *xyz_filename.xml*. This tells the Folder stage to read only the files whose names match that pattern.

I was getting repeated rows because there was more than one XML file saved in my folder on disk. It looks like the Folder stage reads all the files in the target folder by default. Putting the wildcard in the Folder stage resolved the issue.
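
As a rough illustration of how such a wildcard narrows the set of files, here is a small sketch using Python's standard fnmatch module. It assumes shell-style wildcard semantics, which may not match the Folder stage exactly; the function select_files and the extra file names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_files(filenames, pattern):
    """Return the names that match a shell-style wildcard pattern."""
    # fnmatchcase is case-sensitive on every platform.
    return sorted(name for name in filenames if fnmatchcase(name, pattern))


# Hypothetical folder contents: only one file is the intended input.
folder = ["xyz_filename.xml", "old_batch.xml", "test_copy.xml"]

print(select_files(folder, "*"))                   # every file is picked up
print(select_files(folder, "*xyz_filename.xml*"))  # only the intended file
```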

rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Yes, without a wildcard the Folder stage will pick up all the files in the path. Also make sure you have some mechanism that clears out the files, or moves them to a different path, after the job has finished. This ensures that currently running jobs do not pick up files from earlier dates.
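
One possible cleanup step, sketched here as a standalone Python script rather than anything built into DataStage, is to move the processed files to an archive folder once the job ends. The function name archive_processed, the directory arguments, and the default pattern are all hypothetical.

```python
import shutil
from pathlib import Path


def archive_processed(source_dir, archive_dir, pattern="*.xml"):
    """Move files matching pattern from source_dir to archive_dir."""
    source = Path(source_dir)
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    # Build the full list first so moving files does not disturb iteration.
    for path in sorted(source.glob(pattern)):
        if path.is_file():
            shutil.move(str(path), str(archive / path.name))
            moved.append(path.name)
    return moved
```

Such a script could be called after the job completes, for example archive_processed("/data/in", "/data/done"), where both paths are placeholders.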

Good luck.

Rasi