Oracle append method without duplicates entry

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
premupdate
Participant
Posts: 47
Joined: Thu Oct 04, 2007 3:37 am
Location: chennai

Oracle append method without duplicates entry

Post by premupdate »

Hello Folks,

My Job design

Job 1: Teradata Table -> TFM -> DataSet
Job 2: DataSet ->TFM -> Oracle

Requirement: To load all data from TD in to oracle without any transformation or conditions

The TD table is temporary table which will get loaded in a daily basis with a truncate and load option.
The Oracle table load method is append method where business needs all data to be stored.No key columns defined in the table.The table has extraction date as a column which will be populated from the scheduler.
In normal scenario, the above jobs will run smoothly.But if there are failures in the job ,during rerun the same records might get inserted again as its an append method.
The expected volume of data is around 1400 records with 4 MB size with 56 columns in each row.
My option:
Using Change capture stage with All Keys,Explicit values where all the 55 columns defined as key columns(except the date column) for the stage to compare and then identify the insert record for oracle table.
At the maximum there is a possibility of 1200 MB size in the before dataset(Oracle table) of Change capture stage.
Please let me know if this is the option which can be tried or is there any other option to avoid duplicates loads during a job failures
Cheers,
prem
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If every load is uniquely identified (perhaps with a run ID, perhaps with a created timestamp) you can use these identifiers to roll back to a known good point in the event that you need to re-load.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
premupdate
Participant
Posts: 47
Joined: Thu Oct 04, 2007 3:37 am
Location: chennai

Post by premupdate »

Thanks Ray,

But the table dont have an option like timestamp or row id...is there anyway to achieve this.The only column is Extract date where we should have manual interference incase of any issues which is not allowed as per process.
Cheers,
prem
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

To answer your first question, "no".

Why have Extract Date if it's not permitted to be used? Resist stupid requirements!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
premupdate
Participant
Posts: 47
Joined: Thu Oct 04, 2007 3:37 am
Location: chennai

Post by premupdate »

Hello All,

Found a solution of using the strength of Before SQL in Oracle Stage where it will have a delete statement with extract date in it.So whenever a rerun happens with partial records in table,the before SQL will delete the records with date matching before being inserted.

Thanks,
Prem
Cheers,
prem
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I thought you weren't allowed to use the Extract Date ?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply