Hi Performance job design
Posted: Thu Jul 09, 2009 10:30 am
Hi
I have a requirement, I need to create jobs to populate table (insert). here is the table detail.
The source table has nearly 50 columns. and the target table is something like this, col1_pk, col2, col3_fk, col4_fk.... col59. I have indicated the primary key and foreign key with PK and FK. The target has only one primary key and two foreign keys. So I designed a parallel job that extracts data from the source(ODBC Stage) then use a look up stage to look up against the foreign key tables(target side) and then used a merge stage to merge them and then again use a look up to make a target look up to check if the rows exist or not. then generate keys and load the target (insert only). But my manager said this is a poor design and i need to remove or use very less look up stages (ultimately reduce the number of stages), and then design the job. Also the source table dosent have a creationdate or updatedate to extract the delta load, so in my design i used a full extract (assuming the source database are updated). I have one more thought, to remove the look up stage i can write a sql query that gets the foreign keys. the data is going to be huge (in millions). can anybody suggest me a design method or some points where i can make some improvements. Any help would be appreciated.
I used auto partition everywhere.
[/code]
I have a requirement, I need to create jobs to populate table (insert). here is the table detail.
The source table has nearly 50 columns. and the target table is something like this, col1_pk, col2, col3_fk, col4_fk.... col59. I have indicated the primary key and foreign key with PK and FK. The target has only one primary key and two foreign keys. So I designed a parallel job that extracts data from the source(ODBC Stage) then use a look up stage to look up against the foreign key tables(target side) and then used a merge stage to merge them and then again use a look up to make a target look up to check if the rows exist or not. then generate keys and load the target (insert only). But my manager said this is a poor design and i need to remove or use very less look up stages (ultimately reduce the number of stages), and then design the job. Also the source table dosent have a creationdate or updatedate to extract the delta load, so in my design i used a full extract (assuming the source database are updated). I have one more thought, to remove the look up stage i can write a sql query that gets the foreign keys. the data is going to be huge (in millions). can anybody suggest me a design method or some points where i can make some improvements. Any help would be appreciated.
I used auto partition everywhere.
[/code]