Hi All,
We are trying to build a new data warehouse, i am currently doing POC, we will have multiple tables in warehouse, we will build CDC job for every table to capture the data changes. In every CDC job i am planning to capture inserts into a dataset & edits into another data set. Next i planned to build couple of jobs, one for loading inserts into Oracle table & another for updating Oracle table.
Now my question, instead of inserting/updating data in separate jobs, can i do these tasks in CDC job itself. If i perform insert/update in CDC job, i hope it improve jobs performance because i don't need to run(build) insert/update jobs for every table. Please let me know if this method is recommended & also let me know what kind of issues i may face in long run.
Some additional information, in CDC jobs we are doing CDC, joining few reference tables & find key max value. Each table may have rows between 1 million to 2 millions.
Greatly appreciate any help, thanks in advance.
Insert/Update Oracle table in CDC job?
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 36
- Joined: Mon Jul 16, 2007 3:37 am
- Location: USA
-
- Participant
- Posts: 36
- Joined: Mon Jul 16, 2007 3:37 am
- Location: USA
Thanks for your reply chulett.
Sure, i will test in both the ways.If performance is same in both the ways, then i would like to build insert/update actions in CDC job. I am seeing following advantages in this way.
1. If suppose i have 100 tables, then if i need to build separate jobs for inserts/updates then i need to build 200 jobs for inserts & updates. If i keep insert/update actions in CDC jobs, then we can save development time.
2. These jobs runs daily, running 200 insert/update jobs daily may take longer time, like every job will need time for startup.
My main concern in developing insert/update actions in CDC job is, what happens if job fails, does Oracle tables get locked? do i start face any Oracle related issues in long run?
Sure, i will test in both the ways.If performance is same in both the ways, then i would like to build insert/update actions in CDC job. I am seeing following advantages in this way.
1. If suppose i have 100 tables, then if i need to build separate jobs for inserts/updates then i need to build 200 jobs for inserts & updates. If i keep insert/update actions in CDC jobs, then we can save development time.
2. These jobs runs daily, running 200 insert/update jobs daily may take longer time, like every job will need time for startup.
My main concern in developing insert/update actions in CDC job is, what happens if job fails, does Oracle tables get locked? do i start face any Oracle related issues in long run?
VINOD
-
- Participant
- Posts: 36
- Joined: Mon Jul 16, 2007 3:37 am
- Location: USA
To me, that's one of the primary drivers of any design, CDC or otherwise - how does it recover from failure? We always keep "jobs" as atomic as possible, discrete units of work so that failure during one step doesn't mean repeating multiple steps to get back to that point. When dealing with CDC in the past, we ensured that all changes for that "run" were captured before starting any processing of those changes so that we had a static set to work with. Which meant in the event of failure, we knew we could either start over from the point of failure (or the beginning) without fear of a changing source complicating that. Meaning, IMHO, it would be best to not combine the process of capturing the CDC information with processing it. Depending on exactly what "CDC" means here. Just something to keep in mind.reddy.vinod wrote:My main concern... is, what happens if job fails
Now, I also don't see a need to handle inserts separately from updates unless you are in a situation where the inserts FAR outnumber the updates so you'd want to bulk load those before performing the updates. But again, that's just me. You may also need to take into account updates that happen to the inserts, meaning you would not be able to do the updates first. Depends on your knowledge of the data and what kind of changes you'll see to any given record between loads. You can't necessarily assume you get one or the other for a record, or that you will only get one. Make sure you are know for certain what you will be processing.
ps. Not a big fan of polls here.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers