i have 2 sequential files

dssubhani
Participant
Posts: 38
Joined: Wed Jul 14, 2010 11:12 pm

Post by dssubhani »

I have two sequential files.
The first file has firstname, lastname.

The second file has firstname, e-mail address.

I want the output to be
firstname, lastname, e-mail address. How can I do this?

I used a Join stage with the join type set to inner, but it doesn't work.
subhani
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

You tell us. You've given us practically nothing to go on. Unless you can get most or all of the last name out of the email address, you won't have enough for a reliable match.
One thing I do know is that an inner join is not the way to go about it.
abhijitg
Premium Member
Posts: 10
Joined: Sun Aug 22, 2010 9:54 am
Location: Charlotte, NC

Post by abhijitg »

Hi,

Make sure you sort the incoming records on the key prior to the join and it should work. Your problem might be the selection of a bad key column rather than the join itself. The first name might not uniquely identify a row, which can result in cross joins between all rows that share a first name. Secondly, by selecting an inner join you might be dropping records (not everyone has an email address).
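
To make the two risks concrete, here is a minimal sketch in plain Python (not DataStage). The sample names, the addresses, and the helper `join_on_firstname` are all made up for illustration. It only mimics the row counts an inner join and a left outer join would give when firstname is not unique.

```python
# Illustrative data only; the names and addresses are invented.
names = [("John", "Smith"), ("John", "Doe"), ("Mary", "Jones")]
emails = [("John", "john@example.com"), ("John", "jd@example.com")]


def join_on_firstname(left, right, keep_unmatched=False):
    """Join (firstname, lastname) rows to (firstname, email) rows.

    keep_unmatched=False behaves like an inner join;
    keep_unmatched=True behaves like a left outer join.
    """
    # Group the right-hand rows by key so duplicates are preserved.
    by_key = {}
    for firstname, email in right:
        by_key.setdefault(firstname, []).append(email)

    out = []
    for firstname, lastname in left:
        matches = by_key.get(firstname)
        if matches:
            # Every left row pairs with every right row sharing the key.
            out.extend((firstname, lastname, e) for e in matches)
        elif keep_unmatched:
            out.append((firstname, lastname, None))
    return out


inner_rows = join_on_firstname(names, emails)
left_rows = join_on_firstname(names, emails, keep_unmatched=True)

print(len(inner_rows))  # 4: two Johns times two addresses, Mary is dropped
print(len(left_rows))   # 5: the same four rows plus Mary with no address
```

Three input names turn into four inner-join rows, none of which can be trusted, and Mary disappears entirely. That is why the key matters more than the stage.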

Thanks
Abhijit
abhijitg
Premium Member
Posts: 10
Joined: Sun Aug 22, 2010 9:54 am
Location: Charlotte, NC

Post by abhijitg »

Hi,

Using a Lookup stage might be more appropriate, with the fname, lname file as your input link and the fname, email file as your reference link. The concern about the poor choice of key still applies.
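
As a rough picture of the idea, here is a plain Python sketch (not DataStage). The function `lookup_email` and the sample rows are hypothetical, and keeping only the first reference row per key is an assumption made for the example, not a statement about how the Lookup stage is configured.

```python
def lookup_email(stream_rows, reference_rows):
    """Return one output row per stream row, with the email if one is found."""
    reference = {}
    for firstname, email in reference_rows:
        # Assumption for this sketch: keep the first reference row per key.
        reference.setdefault(firstname, email)

    # Stream rows with no match are kept, with None in place of the email.
    return [
        (firstname, lastname, reference.get(firstname))
        for firstname, lastname in stream_rows
    ]


# Illustrative data only.
stream = [("John", "Smith"), ("Mary", "Jones")]
reference_file = [("John", "john@example.com"), ("John", "jd@example.com")]

result = lookup_email(stream, reference_file)
for row in result:
    print(row)
```

Every input row comes out exactly once, but when firstname is not unique the address attached to it is still only a guess.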

Thanks
Abhijit
asyed
Participant
Posts: 16
Joined: Sun Dec 12, 2010 10:24 pm
Location: Hyderabad, India

Post by asyed »

abhijitg wrote:
Hi,

Make sure you sort the incoming records on the key prior to the join and it should work. Your problem might be the selection of a bad key column rather than the join itself. The first name might not uniquely identify a row, which can result in cross joins between all rows that share a first name. Secondly, by selecting an inner join you might be dropping records (not everyone has an email address).

Thanks
Abhijit
In addition to the above:

a) Hash partition both inputs on the join key.
b) Keep the key fields consistent across both inputs (i.e. trim whitespace and use the same upper/lower case).
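
The two points can be sketched in plain Python as follows. This is only an illustration: `normalize_key`, `partition_of` and `hash_partition` are hypothetical helpers, and `zlib.crc32` is a stand-in for whatever hash function the engine actually uses.

```python
import zlib


def normalize_key(value):
    """Trim surrounding whitespace and upper-case the key."""
    return value.strip().upper()


def partition_of(key, num_partitions):
    """Pick a partition from the normalized key.

    crc32 is used because it is stable across runs, unlike Python's
    built-in hash() for strings.
    """
    return zlib.crc32(normalize_key(key).encode("utf-8")) % num_partitions


def hash_partition(rows, num_partitions):
    """Distribute rows into partitions by their first field (the key)."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[partition_of(row[0], num_partitions)].append(row)
    return parts


# Illustrative data only: the same person keyed with different spacing and case.
names = [(" john", "Smith")]
emails = [("JOHN ", "john@example.com")]

name_parts = hash_partition(names, 4)
email_parts = hash_partition(emails, 4)
print(partition_of(" john", 4) == partition_of("JOHN ", 4))
```

Without the normalization step the two rows could land in different partitions and would never meet in the join, even though they refer to the same key.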