use of lookup stage in a particular job

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
suryadev
Premium Member
Premium Member
Posts: 211
Joined: Sun Jul 11, 2010 7:39 pm

use of lookup stage in a particular job

Post by suryadev »

designed a job to load data into a table which has 6 fields
user_id,app_id,sch_id.....these 3 together form primary key
and other three are join_date,rene_date,end_dt

first time when the job runs around 5000 records are loaded in the table.
and the end_dt has to be null for the first load.
the job is to be scheduled every week and suppose next week 5200 records come from the source.....My requirement is to find out only the extra 200 records and load them in the table.

can I use lookup stage there to do lookup between the records from source and the table which is previously loaded.
so that the output and reject links can be used...as the 5000 records which are earlier loaded come from the output and for all these records the end_dt has to be the current_dt and the end_dt for these 200 records has to be null which mean the new records are the ones which have their end_dt as null.
so the output link from lookup has 5000 records changing the nd_dt as current_dt
the reject link has 200 records changing the end_dt as null and load them.

same procedure every time the source sends the data...


Is this the right way to design the job according to the requirement?

please help me with any changes required.

final phase would be something like

source(primary)------>lookup1<__________(lookuptable)table1

lookup1(output link)------------>table1
lookup1(rejectlink)--------------->table1
Thanks,
Surya
divesh
Participant
Posts: 7
Joined: Wed Dec 20, 2006 5:18 pm

Re: use of lookup stage in a particular job

Post by divesh »

In order to aviod dealing with record commit, best design would be

source(primary)------>lookup1<__________(lookuptable)table1

lookup1(output link)------------>Dataset for insert
lookup1(rejectlink)--------------->Dataset for update

in the next job, use the Datasets to insert and update the table. In the update dataset just inlcude the primary key.
suryadev
Premium Member
Premium Member
Posts: 211
Joined: Sun Jul 11, 2010 7:39 pm

Post by suryadev »

Thanks,

So in the sequence we need to use an other job which has datasets and table.

earlier I never user dataset,Is it something to store the data and then read when ever needed?

one more doubt which I have is while doing look up between the source and table1 what will the keys be....I assume the keys in the lookup will be all the three i.e
user_id,app_id,sch_id
so that....it does a look up with the table which has all the three values same.
Is that right?
Thanks,
Surya
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Assuming you set up your lookup rule to be "=", you are correct.

The Lookup stage also supports range lookups.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply