Changed Data Capture questions
Posted: Thu Apr 24, 2008 2:37 pm
Currently our company extracts data from the mainframe each night, FTPs it to a server at a remote location, then transforms and loads it.
The issues are the time it takes to run the cycle and the growing volume of data.
Changed Data Capture (CDC) seemed a viable alternative, so I suggested it. My thought was to determine the Inserts, Updates, and Deletes while the data is still on the mainframe, send only the changed data to the LAN, and then apply the changes via DataStage.
Our DBAs recommended setting up a Staging database on site, dumping all of the mainframe data to the LAN, loading the Staging area with DataStage, and then using DataStage to do the CDC and apply the changes at the remote location.
Their recommendation is based on their experience with data warehouses, but can someone explain what advantage is gained or if this will even work? What about deletes?
Server A would hold today's data; Server B, 500 miles down the road, holds yesterday's data and needs to be updated with the changes. 99% of the data is unchanged. Is DataStage the right tool for the job? If so, please advise on how to set this up.
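To make the question concrete, here is a minimal sketch (in Python, not DataStage) of the comparison I have in mind, wherever it ends up running. It assumes every row has a unique key and that both yesterday's and today's full snapshots are available at the point of comparison; the names `row_digest` and `diff_snapshots` and the sample rows are made up for illustration. Note that deletes can only be found this way because the complete key sets of both days are compared.

```python
import hashlib


def row_digest(row):
    """Return a fixed-size digest of a row's non-key columns.

    Comparing digests instead of full rows keeps the lookup side small,
    which matters when 99% of the rows are unchanged.
    """
    joined = "\x1f".join(str(value) for value in row)  # unit separator between columns
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def diff_snapshots(yesterday, today):
    """Compare two snapshots given as dicts of key -> tuple of non-key columns.

    Returns (inserts, updates, deletes):
      inserts: keys present only in today's snapshot, with their rows
      updates: keys present in both whose digests differ, with today's rows
      deletes: keys present only in yesterday's snapshot
    """
    old_keys = yesterday.keys()
    new_keys = today.keys()

    inserts = {k: today[k] for k in new_keys - old_keys}
    deletes = sorted(old_keys - new_keys)
    updates = {
        k: today[k]
        for k in new_keys & old_keys
        if row_digest(today[k]) != row_digest(yesterday[k])
    }
    return inserts, updates, deletes


# Hypothetical sample data: key -> (name, state)
yesterday = {1: ("Ann", "NY"), 2: ("Bob", "TX"), 3: ("Cy", "CA")}
today = {1: ("Ann", "NY"), 2: ("Bob", "FL"), 4: ("Di", "WA")}

inserts, updates, deletes = diff_snapshots(yesterday, today)
print("inserts:", inserts)
print("updates:", updates)
print("deletes:", deletes)
```

Only the three result sets would need to travel to the remote server; the unchanged row (key 1 here) never leaves the site where the comparison runs.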
Additional information.
The company has just installed DS EE. Up to this point, all of our DataStage projects have been done with the Server Edition. In DS SE I was using some large hashed files for CDC and experiencing performance problems, which is how I ended up determining the changed data on the mainframe. Is EE better suited to the task?
This was previously posted in the wrong forum. Not sure how/why I did that...sorry.
Thanks in advance.