Design DataStage job

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

epall
Participant
Posts: 1
Joined: Wed Aug 08, 2007 1:39 am
Location: MAL


Post by epall »

Hi,


What is the baseline for choosing a partitioning method in a DataStage job, e.g. one that processes more than 2 million records?

Thanks.
Maveric
Participant
Posts: 388
Joined: Tue Mar 13, 2007 1:28 am

Post by Maveric »

Functionality. With stages like Join, Lookup etc., the reference data and the main link data for a given key should be on the same node to get the required output. Consider:
1,asd
1,asd
2,erf
3,saw
1,asd

Now if you use the Remove Duplicates stage and the first and last records are on node 1 while the second record is on node 2, you will still get 2 records for "1,asd" in the output. If you hash partition the data on both fields, then the 1st, 2nd and last records will be on the same node and only one of them will appear in the output.
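
To make the example above concrete, here is a rough Python simulation. It is not DataStage code. The function names (dedupe_per_node, placement_from_post, hash_partition), the two-node setup and the use of CRC32 as the hash are assumptions made only for illustration. It just shows that removing duplicates node by node leaves a duplicate when identical rows land on different nodes, and that hashing on both fields keeps them together.

```python
import zlib
from collections import defaultdict

# The five sample rows from the post, as (key, value) pairs.
rows = [("1", "asd"), ("1", "asd"), ("2", "erf"), ("3", "saw"), ("1", "asd")]

NUM_NODES = 2  # assumed two-node configuration


def dedupe_per_node(rows, assign_node):
    """Remove duplicates independently on each node, then collect all output."""
    nodes = defaultdict(list)
    for index, row in enumerate(rows):
        nodes[assign_node(index, row)].append(row)

    output = []
    for node in sorted(nodes):
        seen = set()  # each node only knows about its own rows
        for row in nodes[node]:
            if row not in seen:
                seen.add(row)
                output.append(row)
    return output


def placement_from_post(index, row):
    # Second record on one node, everything else (including the first
    # and last records) on the other, as described in the post.
    return 1 if index == 1 else 0


def hash_partition(index, row):
    # Hash on both fields, so identical rows always share a node.
    return zlib.crc32(",".join(row).encode()) % NUM_NODES


split = dedupe_per_node(rows, placement_from_post)
hashed = dedupe_per_node(rows, hash_partition)

print(split.count(("1", "asd")))   # duplicate survives across nodes
print(hashed.count(("1", "asd")))  # only one copy remains
```

The point is the same as in the post: the partitioning method is chosen so that rows which must be compared end up on the same node, whatever the record count.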
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

epall, you've marked the Job Type as TX in your post. If indeed this is a TX question, I'd suggest you post in the actual TX forum where those experts hang out.
-craig

"You can never have too many knives" -- Logan Nine Fingers