use of lookup stage in a particular job
Posted: Thu Nov 18, 2010 9:45 am
designed a job to load data into a table which has 6 fields
user_id,app_id,sch_id.....these 3 together form primary key
and other three are join_date,rene_date,end_dt
first time when the job runs around 5000 records are loaded in the table.
and the end_dt has to be null for the first load.
the job is to be scheduled every week and suppose next week 5200 records come from the source.....My requirement is to find out only the extra 200 records and load them in the table.
can I use lookup stage there to do lookup between the records from source and the table which is previously loaded.
so that the output and reject links can be used...as the 5000 records which are earlier loaded come from the output and for all these records the end_dt has to be the current_dt and the end_dt for these 200 records has to be null which mean the new records are the ones which have their end_dt as null.
so the output link from lookup has 5000 records changing the nd_dt as current_dt
the reject link has 200 records changing the end_dt as null and load them.
same procedure every time the source sends the data...
Is this the right way to design the job according to the requirement?
please help me with any changes required.
final phase would be something like
source(primary)------>lookup1<__________(lookuptable)table1
lookup1(output link)------------>table1
lookup1(rejectlink)--------------->table1
user_id,app_id,sch_id.....these 3 together form primary key
and other three are join_date,rene_date,end_dt
first time when the job runs around 5000 records are loaded in the table.
and the end_dt has to be null for the first load.
the job is to be scheduled every week and suppose next week 5200 records come from the source.....My requirement is to find out only the extra 200 records and load them in the table.
can I use lookup stage there to do lookup between the records from source and the table which is previously loaded.
so that the output and reject links can be used...as the 5000 records which are earlier loaded come from the output and for all these records the end_dt has to be the current_dt and the end_dt for these 200 records has to be null which mean the new records are the ones which have their end_dt as null.
so the output link from lookup has 5000 records changing the nd_dt as current_dt
the reject link has 200 records changing the end_dt as null and load them.
same procedure every time the source sends the data...
Is this the right way to design the job according to the requirement?
please help me with any changes required.
final phase would be something like
source(primary)------>lookup1<__________(lookuptable)table1
lookup1(output link)------------>table1
lookup1(rejectlink)--------------->table1