
How to process a large sequential file?

Posted: Tue Mar 10, 2009 6:36 am
by balu536
Hi everybody,

I have a large sequential file containing 15 million records. We need to insert this data into an Oracle table after performing some transformations; essentially our jobs apply ETL transformations to load a fact table. Our approach is to first load the sequential file data into a staging table in one job. In a second job we compare the staging table data with the fact table data and perform an insert (if no match is found) or an update (if a match is found).

So in the second job we read both the staging table and the fact table through Oracle stages. We then use a Join stage, performing a left outer join on the unique key columns and carrying a dummy column, to determine which records require an insert and which require an update, as in the sketch below.
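Conceptually, the dummy-column left outer join works like this (a minimal Python sketch; the dictionaries, key, and column names are illustrative stand-ins for the staging and fact data that the real job reads through Oracle stages):

[code]
# Minimal sketch of the dummy-column left-outer-join technique.
# Staging rows are compared against fact rows keyed on the unique key;
# a missing dummy marker on the fact side means "no match" -> insert.

staging = [
    {"cust_id": 1, "amount": 100},
    {"cust_id": 2, "amount": 250},
]

# Fact rows keyed by the unique key column, with a dummy marker column.
fact = {
    1: {"cust_id": 1, "amount": 90, "dummy": "X"},
}

inserts, updates = [], []

for row in staging:
    match = fact.get(row["cust_id"])          # left outer join on the key
    if match is None or match.get("dummy") is None:
        inserts.append(row)                   # no match found -> insert
    else:
        updates.append(row)                   # match found -> update

print("insert:", inserts)
print("update:", updates)
[/code]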

After the Join stage we send the data to a Transformer to apply the business rules, and then perform the insert or update operations as those rules require.

We would like some clarification on whether this approach is suitable for comparing the 15 million records coming from the staging table against the 15 million plus records in the fact table.

Is the Join stage appropriate for comparing such a large volume of data, or would another stage be better suited?

Posted: Tue Mar 10, 2009 3:56 pm
by ray.wurlod
Welcome aboard.

You may find it preferable to use the Slowly Changing Dimension stage.

Or you may like to investigate any of the three change detection stages (Difference, Compare, Change Capture).