Hi,
My job contains:
1. Source file
2. Lookup file
3. Lookup Stage
4. Target file
I want to run my job in a 2x2 configuration.
When I do that, I am not getting the total output (in rows; i.e., for 10 rows I am getting only 6).
If I run the same job in
a. the default configuration, or
b. a 2x2 configuration with partition type SAME,
I get all the records.
Hi Roy,
There are 2 jobs. I am having a problem in the 2nd job.
Job 1:
Taking the A1 file as source, doing a lookup on the C file, and creating a dataset, A1C.
Key: CustId
Partition type: Hash, based on CustId
Configuration: 2x2
Job 2:
Taking the A1C file (output from Job 1) as source, doing a lookup on the A2 file, and creating a dataset, A1A2C.
Key: acctid
Partition type: Hash, based on acctid
Configuration: 2x2
The result is that I am not getting all the records.
The 2nd job works fine under either of the conditions mentioned earlier (default configuration, or SAME partitioning).
Dev, I am not using the Sort or Unique options.
Fine. What is happening when the same job is run in a 2-node configuration with the partitioning type set to SAME?
When you use the partitioning type SAME, the hash partitioning performed in the first job is preserved,
i.e. the dataset used in the second job still has the same hash partitioning on custid.
But when you hash-partition on acctid in the second job, the records are dropped. Can you check whether there are any records with duplicate acctids?
Also, on the reference link to the Lookup stage in the second job, have you specified any particular partitioning property?
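To make the dropped-rows symptom concrete, here is a minimal Python sketch (not DataStage code; all data, key values, and the partitioning function are invented for illustration) of a 2-node partitioned lookup. When the source stream is hash-partitioned on one key while the reference data is laid out by a different key, a row and its match can land on different nodes, and the lookup silently fails for that row:

```python
# Illustration only: simulate a 2-node partitioned lookup. With both sides
# hash-partitioned on the same key, every row finds its match; with the
# source partitioned on custid and the reference on acctid, matching rows
# can end up on different nodes and are dropped.
NODES = 2

def part_key(value):
    # Deterministic stand-in for a hash partitioner.
    return sum(ord(ch) for ch in str(value)) % NODES

def hash_partition(rows, key):
    parts = [[] for _ in range(NODES)]
    for row in rows:
        parts[part_key(row[key])].append(row)
    return parts

# Invented sample data: source rows (like A1C) and reference rows (like A2).
source = [{"custid": c, "acctid": a}
          for c, a in [(1, "X"), (2, "Y"), (3, "Z"), (4, "W")]]
reference = [{"acctid": a, "balance": b}
             for a, b in [("X", 10), ("Y", 20), ("Z", 30), ("W", 40)]]

def partitioned_lookup(src_parts, ref_parts):
    # Each node builds a lookup table only from its own reference partition;
    # a source row with no match on its node is dropped (lookup failure).
    out = []
    for node in range(NODES):
        table = {r["acctid"]: r for r in ref_parts[node]}
        for row in src_parts[node]:
            if row["acctid"] in table:
                out.append({**row, **table[row["acctid"]]})
    return out

# Both sides hashed on acctid: all 4 source rows survive.
ok = partitioned_lookup(hash_partition(source, "acctid"),
                        hash_partition(reference, "acctid"))

# Source hashed on custid, reference laid out by acctid: matches are split
# across nodes, so fewer rows come out, just like the 10-in/6-out symptom.
bad = partitioned_lookup(hash_partition(source, "custid"),
                         hash_partition(reference, "acctid"))

print(len(ok), len(bad))
```

The sketch only models the partition mismatch itself; whether SAME fixes it in a given job also depends on how the reference link of the Lookup stage is partitioned, which is why the question above about the reference link's partitioning property matters.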