DataStage strange behaviors

Gokul · Post by **Gokul** » Fri Dec 17, 2010 7:59 am

Hi ,

I have been observing some weird behaviors. Can someone explain these for me?

1> I have 2 jobs, a and b. In job a, 2 datasets are created, partitioned on key x and sorted on key x and key y. (The relation between key x and key y is one to many, and between y and x is also one to many, e.g. branch and account.)
In job b, when I join the 2 created datasets on keys x, y, keeping the partitioning as Same, the join does not give the proper count.

When I do an explicit hash partition on key x in the join stage for both links, the count comes out properly.
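To make the symptom in 1> concrete, here is a minimal Python sketch (not DataStage code) of a partition-wise join. It is only an illustration of the general principle that a parallel join matches rows within the same partition number, so both inputs have to be partitioned the same way on the key. The function names, the sample rows, and the choice of round robin as the "wrong" layout are all assumptions made for the illustration; it does not claim to show what actually happened to the datasets in job b.

```python
from collections import defaultdict


def hash_partition(rows, key, n):
    """Place each row in partition hash(row[key]) % n."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)
    return parts


def round_robin_partition(rows, n):
    """Deal rows out to partitions in turn, ignoring key values."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def partitionwise_join_count(left_parts, right_parts, keys):
    """Count inner-join matches, joining partition i only with partition i."""
    total = 0
    for left, right in zip(left_parts, right_parts):
        index = defaultdict(int)
        for r in right:
            index[tuple(r[k] for k in keys)] += 1
        for l in left:
            total += index.get(tuple(l[k] for k in keys), 0)
    return total


# Hypothetical sample data: 4 branches (x) with 3 accounts (y) each.
left = [{"x": b, "y": a} for b in range(4) for a in range(3)]
right = [{"x": b, "y": a} for b in range(4) for a in range(3)]

# Both inputs hash-partitioned on x: every (x, y) pair meets its match.
co_partitioned = partitionwise_join_count(
    hash_partition(left, "x", 2), hash_partition(right, "x", 2), ["x", "y"]
)

# One input laid out differently: some matching rows sit in different partitions.
mismatched = partitionwise_join_count(
    hash_partition(left, "x", 2), round_robin_partition(right, 2), ["x", "y"]
)

print(co_partitioned, mismatched)
```

In this sketch, hashing both inputs on x alone is enough for a join on (x, y), because rows with equal (x, y) always have equal x. The count only drops when the two inputs do not share the same layout, which is one possible reason an explicit hash on both links would restore the count.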

2> In a job, I have source --> sort(a,b) --> remove duplicates(key a,b) --> target.
There is a sort stage on key (a,b) followed by a remove duplicates stage on key (a,b). On execution I was getting the warning "downstream operator does not fulfill the requirement".

As a solution, I deleted the link between the sort and remove duplicates stages and replaced it with a new one. The warning went away.

3> In another job, I have more than 2 inputs to the join stage, and the count of records is getting garbled.
When I replace the single join stage with multiple join stages of only 2 inputs each, the count comes out properly.
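For 3>, one way to decide which count is the proper one is to work out the expected inner-join row count outside the tool. The sketch below is plain Python, not DataStage, and assumes ordinary relational inner-join semantics on a common key; the function name and sample rows are made up for the illustration.

```python
from collections import Counter
from math import prod


def expected_inner_join_count(inputs, keys):
    """Expected rows from an inner join of all inputs on the same key columns.

    For each key value present in every input, the number of output rows is
    the product of that key's row counts across the inputs.
    """
    counts = [Counter(tuple(r[k] for k in keys) for r in rows) for rows in inputs]
    common = set(counts[0]).intersection(*counts[1:])
    return sum(prod(c[key] for c in counts) for key in common)


# Hypothetical inputs with duplicate key values.
a = [{"k": 1}, {"k": 1}, {"k": 2}]
b = [{"k": 1}, {"k": 2}, {"k": 2}]
c = [{"k": 1}, {"k": 3}]

two_way = expected_inner_join_count([a, b], ["k"])
three_way = expected_inner_join_count([a, b, c], ["k"])
print(two_way, three_way)
```

Under these assumptions, a single multi-input join and a chain of 2-input joins on the same key should give the same count, so a difference between them points at partitioning or sorting of the inputs rather than at the join logic itself.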

DSguru2B · Post by **DSguru2B** » Fri Dec 17, 2010 9:06 am

What exactly are you trying to accomplish here? Get a count of duplicates?

Gokul · Post by **Gokul** » Thu Dec 23, 2010 6:24 am

I am not trying to prove or establish anything here (I have been working with DataStage for 4+ years and I love it). But these are some of the weird behaviours I came across. I just need to know whether the implementation or the tool settings can be changed.