Result Not Stable

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
mdbatra
Premium Member
Premium Member
Posts: 175
Joined: Wed Oct 22, 2008 10:01 am
Location: City of London

Result Not Stable

Post by mdbatra »

Hi All,
in my job design, i have a single input dataset & 3 reference data sets. With each reference data set, i have a separate join based on same 3 keys. Prior to join with input data set, all reference data set go through a remove duplicate stage,which are hash partitioned on same keys as used in join. The output of input data set & first refernce is fetched to second join with 2nd reference data set & so on.

in first join, i used below partioning:
input: Hash(on 3 keys)
1st RD stage Ouput: Same

2nd join
Output from 1st join : Same
2nd RD stage Ouput: Same

3rd join
Output from 2nd join : Same
3rd RD stage Ouput: Same

The issue i am facing is:
In my output, sometimes i get the match from the reference data sets & sometimes not for the same input. Tried disabling "Automatic Partioning " in admin but not fruitful.
What could be cause for this.
Any help would be appreciated.
Rgds,
MB
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Have you hashed on all 3 keys in all of the links, i.e in the second and third joins?
mdbatra
Premium Member
Premium Member
Posts: 175
Joined: Wed Oct 22, 2008 10:01 am
Location: City of London

Post by mdbatra »

yes,i did.
Rgds,
MB
mdbatra
Premium Member
Premium Member
Posts: 175
Joined: Wed Oct 22, 2008 10:01 am
Location: City of London

Post by mdbatra »

& 1 more observation:
even if it is getting the match, all the extracted columns from the refernce data sets are not coming(some coming as BLANK) while input is having a valid value in those field.
Rgds,
MB
mdbatra
Premium Member
Premium Member
Posts: 175
Joined: Wed Oct 22, 2008 10:01 am
Location: City of London

Post by mdbatra »

Sorting at Join proved fruitful.
So, Does it mean that Join tries to match incoming record only with the first record(from reference data set) it founds within the same partition ?
Rgds,
MB
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Join data must be sorted on the join key(s). If the data isn't sorted, DataStage will put in a tsort operator unless you explicitly turn this off using an environment parameter such as $APT_NO_SORT_INSERTION.
mdbatra
Premium Member
Premium Member
Posts: 175
Joined: Wed Oct 22, 2008 10:01 am
Location: City of London

Post by mdbatra »

Will keep in mind. Thanks for your prompt replies ArndW.
Rgds,
MB
Post Reply