Px design approach question - fishing for better ideas
Moderators: chulett, rschirm, roy
Px design approach question - fishing for better ideas
I have a requirement that has me taking about 400 DB/2 tables and loading them into Oracle tables with the same table names and column names. The datatypes are being changed (different date handling, some trims, nulls modified, etc.) according to rules based on the datatypes and not on the column names.
My current approach is that I am going to write some code (in DataStage BASIC) to auto-generate .dsx files from the DB/2 and Oracle metadata along with the conversion rules - this is easier than hand-coding 400 jobs.
I have just started the coding and the thought came to me that someone here might have a better approach or suggestion for this set of jobs - that I might have fixated upon one solution and be ignoring some other simpler way of doing this (can't see the schema for all the columns or something along those lines).
Any and all suggestions accepted
My current approach is that I am going to write some code (in DataStage BASIC) to auto-generate .dsx files from the DB/2 and Oracle metadata along with the conversion rules - this is easier than hand-coding 400 jobs.
I have just started the coding and the thought came to me that someone here might have a better approach or suggestion for this set of jobs - that I might have fixated upon one solution and be ignoring some other simpler way of doing this (can't see the schema for all the columns or something along those lines).
Any and all suggestions accepted
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
Why can't you put an Oracle stage on the palette and drag the link over from DB2 to Oracle and save everything but the connection information. As long as they do not change the column names then just import the new tables and overwrite the column definitons to change the data types. I would think you could do this very fast.
Changing a stage type in a DSX iis not something I have done. I would think that might be difficult. I have changed the column names in a DSX with BASIC. Send me what you got in a private email and I might be able to modify what I got to meet your needs.
Changing a stage type in a DSX iis not something I have done. I would think that might be difficult. I have changed the column names in a DSX with BASIC. Send me what you got in a private email and I might be able to modify what I got to meet your needs.
Mamu Kim
Borrowing Kim's assumptions that all names remain the same, and adding an assumption that column orders remain intact.
Just create 1 parameterized job and send the table name as a parameter.
1) DB2Read (select * from #table#)
link
2) OracleWrite (insert *)
If you make it a multi-instance job, you can run tables concurrently.
Just create 1 parameterized job and send the table name as a parameter.
1) DB2Read (select * from #table#)
link
2) OracleWrite (insert *)
If you make it a multi-instance job, you can run tables concurrently.
In your DB2 stage doing the read, set the Read Method to Table. If you do not specify a select list or where clause, this simply runs a 'select *' from the table you specify. Then parameterize the tablename. I believe Oracle uses a similar concept of tables as DB2, using a schemaname.tablename format? So create 3 parameters: ds_db2schema, ds_oraschema, and ds_tablename. Then, in your DB2 read, reference the source table as #ds_db2schema#.#ds_tablename#.kduke wrote:bcarlson
I am not sure I follow what you are trying to do.
Link this to your Oracle stage. Now, I have not used Oracle in DS/PX, but I assume it works similarly to DB2's stage. If so, then supply the target table name as #ds_oraschema#.#ds_tablename#. You don't supply a select list so that it will default to all columns.
This method will, effectively, be the same as
insert into myOracleTargetTable
select * from myDB2SourceTable
;
We don't use the Sequencer, so I don't know how you would iterate through all 400 tables. We do everything in Unix, so you can use the dsjob program at the Unix commandline (probably Windows, too) that comes with PX. In Unix, it would be very easy to run through all your jobs. Better yet, split your 400 jobs into 4 sets of 100 and run 4 jobs concurrently.
I misread the original post. These are straight table copies. I was think this was 400 jobs that need to be converted from DB2 jobs to Oracle jobs. This is much easier.
Do one job exactly like you want. Then import all the tables. Export the one job and all the tables and send it to me. I can generate all 400 jobs. I send you the DSX back.
Make your link names like OrderDimIn and OrderDimOut for ORDER_DIM. Do you comments the same. Keep them generic. My job will parse the DSX change ORDER_DIM into new table to process and do string replaces for OrderDim to CustomerDim or whatever the new table is. I have already done this a couple times. Jobs even compile when done. I always have to fine tune it because DB2 stages are not something I have dealt with before. Should save you hours but do it the hard way if you want. The metadata is easily read from DS_METADATA. Ray posted code to show how to do that. Hacking a DSX was a little tougher.
Do one job exactly like you want. Then import all the tables. Export the one job and all the tables and send it to me. I can generate all 400 jobs. I send you the DSX back.
Make your link names like OrderDimIn and OrderDimOut for ORDER_DIM. Do you comments the same. Keep them generic. My job will parse the DSX change ORDER_DIM into new table to process and do string replaces for OrderDim to CustomerDim or whatever the new table is. I have already done this a couple times. Jobs even compile when done. I always have to fine tune it because DB2 stages are not something I have dealt with before. Should save you hours but do it the hard way if you want. The metadata is easily read from DS_METADATA. Ray posted code to show how to do that. Hacking a DSX was a little tougher.
Mamu Kim
Wow - many more responses than I had expected!
I think one part that I didn't really stress is important, otherwise I could have used DS's PX facility of not having to declare DB columns - each column needs to have a transform applied to it. The table & columns are the same, but the datatypes are different. CHAR goes to VARCHAR (needs to be trimmed & CHAR(000) removed), all date, timestamp values need to be converted, a lot of decimal data types have different precisions, etc.
There are 21077 columns in total but I can distill the conversion rules down into a distinct list. Thus I thought of my approach. I actually get all the table & column data from the DS_METADATA already.
Thanks for all the great suggestions so far, even though they might not work for this particular flow they have certainly opened a number of avenues of thought for me, and hopefully for those reading this thread as well.
I think one part that I didn't really stress is important, otherwise I could have used DS's PX facility of not having to declare DB columns - each column needs to have a transform applied to it. The table & columns are the same, but the datatypes are different. CHAR goes to VARCHAR (needs to be trimmed & CHAR(000) removed), all date, timestamp values need to be converted, a lot of decimal data types have different precisions, etc.
There are 21077 columns in total but I can distill the conversion rules down into a distinct list. Thus I thought of my approach. I actually get all the table & column data from the DS_METADATA already.
Thanks for all the great suggestions so far, even though they might not work for this particular flow they have certainly opened a number of avenues of thought for me, and hopefully for those reading this thread as well.
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
Craig,
I'm just starting to pluck apart the PX export file. It's pretty long and complex, but I think that Im on the right track now. With all those columns I have distilled it down to only 8 conversion rules - much less than what I had feared!
Thanks again for all the suggestions and help. It really is a fantastic place to go for both general questions as well as for a final resort with issues.
I'm just starting to pluck apart the PX export file. It's pretty long and complex, but I think that Im on the right track now. With all those columns I have distilled it down to only 8 conversion rules - much less than what I had feared!
Thanks again for all the suggestions and help. It really is a fantastic place to go for both general questions as well as for a final resort with issues.
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>