Page 1 of 1

use of lookup stage in a particular job

Posted: Thu Nov 18, 2010 9:45 am
by suryadev
designed a job to load data into a table which has 6 fields
user_id,app_id,sch_id.....these 3 together form primary key
and other three are join_date,rene_date,end_dt

first time when the job runs around 5000 records are loaded in the table.
and the end_dt has to be null for the first load.
the job is to be scheduled every week and suppose next week 5200 records come from the source.....My requirement is to find out only the extra 200 records and load them in the table.

can I use lookup stage there to do lookup between the records from source and the table which is previously loaded.
so that the output and reject links can be used...as the 5000 records which are earlier loaded come from the output and for all these records the end_dt has to be the current_dt and the end_dt for these 200 records has to be null which mean the new records are the ones which have their end_dt as null.
so the output link from lookup has 5000 records changing the nd_dt as current_dt
the reject link has 200 records changing the end_dt as null and load them.

same procedure every time the source sends the data...


Is this the right way to design the job according to the requirement?

please help me with any changes required.

final phase would be something like

source(primary)------>lookup1<__________(lookuptable)table1

lookup1(output link)------------>table1
lookup1(rejectlink)--------------->table1

Re: use of lookup stage in a particular job

Posted: Thu Nov 18, 2010 11:26 am
by divesh
In order to aviod dealing with record commit, best design would be

source(primary)------>lookup1<__________(lookuptable)table1

lookup1(output link)------------>Dataset for insert
lookup1(rejectlink)--------------->Dataset for update

in the next job, use the Datasets to insert and update the table. In the update dataset just inlcude the primary key.

Posted: Thu Nov 18, 2010 11:35 am
by suryadev
Thanks,

So in the sequence we need to use an other job which has datasets and table.

earlier I never user dataset,Is it something to store the data and then read when ever needed?

one more doubt which I have is while doing look up between the source and table1 what will the keys be....I assume the keys in the lookup will be all the three i.e
user_id,app_id,sch_id
so that....it does a look up with the table which has all the three values same.
Is that right?

Posted: Thu Nov 18, 2010 4:52 pm
by ray.wurlod
Assuming you set up your lookup rule to be "=", you are correct.

The Lookup stage also supports range lookups.