Inner Join ..Out put ???

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Developer9
Premium Member
Premium Member
Posts: 187
Joined: Thu Apr 14, 2011 5:10 pm

Inner Join ..Out put ???

Post by Developer9 »

Hi ,

I have source data:

CC (country code)|SC (state code)
USA|IL
USA|IL
USA|NY
USA|NY
USA|TX
USA|TX

Reference data:

SC|DESCRIPTION
IL|ILLINOIS
TX|TEXAS
NY|NEWYORK
NY|NEW-YORK

Code: Select all

reference (sequential_file)
                                     >>> Inner join (on key SC)>>>>output(sequential_file)

source(sequential_file)

Out put :

CC|SC|DESCRIPTION

Inner join:

Code: Select all

Inner transfers records from input data sets whose key columns contain equal values to the output data set. Records whose key columns do not contain equal values are dropped
In my out put I am getting "zero records"..Please give me your comments on my out put

Thank you
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Re: Inner Join ..Out put ???

Post by stuartjvnorton »

Developer9 wrote:Please give me your comments on my out put
You don't seem to have any... :wink:

A couple of things to look at:
- Are you sure both the source and reference data are getting read in properly?
- Is the state code on the source side trimmed? If you've got trailing spaces, then nothing will get joined.
- Have you checked your partitioning? Less likely, seeing as you have no output at all.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

As per stuartjvnorton and...
Is either source or reference key field defined as a char on one side and varchar on the other and char length <> 2 or both defined as char with different lengths?

As an aside, you are happy with output (when you get it) having more rows than input (assuming that is real data)?
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

Kryt0n wrote:As an aside, you are happy with output (when you get it) having more rows than input (assuming that is real data)?
Good spot. OP should clean up their reference data if that's indicative.
Developer9
Premium Member
Premium Member
Posts: 187
Joined: Thu Apr 14, 2011 5:10 pm

Post by Developer9 »

@stuartjvnorton,Kryt0n

Thanks for the Input

Is the state code on the source side trimmed? If you've got trailing spaces, then nothing will get joined?

I am using a text file ..Not trimmed ..trimmed it

Are you sure both the source and reference data are getting read in properly?

checked it to read it properly..
Is either source or reference key field defined as a char on one side and varchar on the other and char length <> 2 or both defined as char with different lengths?

Fixed it ..different lengths (a factor on out put )..learned a new point here !!

Here expected out put :

CC|SC|DESCRIPTION
USA|IL|ILLINOIS
USA|IL|ILLINOIS
USA|NY|NEWYORK
USA|NY|NEW-YORK
USA|NY|NEWYORK
USA|NY|NEW-YORK
USA|TX|TEXAS
USA|TX|TEXAS

Is above expected out put correct ??How it dealt with duplicate record in join operation ??

Example:look up on IL is present in reference data

Code: Select all

source           Look up on IL     reference
USA|IL >>>>>>>>>>>>>>>>>>>>>>USA|IL   
USA|IL 
Thank you
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Its your responsibility to avoid the duplicates in the reference data for getting proper result .
pandeeswaran
Developer9
Premium Member
Premium Member
Posts: 187
Joined: Thu Apr 14, 2011 5:10 pm

Post by Developer9 »

pandeesh wrote:Its your responsibility to avoid the duplicates in the reference data for getting proper result .
@pandeesh,
FOR example Here,

Code: Select all

Inner join SC as key


Source : for 10,000 records
CC -USA is and SC| is distributed as below:

Code: Select all

1000 Records:IL 
1000                      :NY
1000                      :PA
7000                      :TX..............(.Total  10,000)

Code: Select all

Reference:5 records 

SC|DESP

IL|ILLINOIS
PA| PENNSYLVANIA
TX|TEXAS
NY|NEWYORK
NY|NEW-YORK
Out put:

Code: Select all

CC|SC|DESP

what can we expected in the output ???


Thank you
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

The result may vary depending on how the records get read .
Why not you can try this yourself?
pandeeswaran
Developer9
Premium Member
Premium Member
Posts: 187
Joined: Thu Apr 14, 2011 5:10 pm

Post by Developer9 »

sure ..thanks !!!!!!!!
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Can you please post your Findings ?
pandeeswaran
Post Reply