Automatic Job Generator - does one exist??
Moderators: chulett, rschirm, roy
Sure, you can write directly to DS_METADATA. It is a simple table, and there is very little risk: if you mess it up, just clear the table and reload from an export. I would think Ray's routine which exported this table to an XML file would be the place to start. I expect the XML file would also import back into DataStage, so maybe you can use the command line tool to automate the import as well.
Send me your sample DSX files to copy one table and maybe I will write it just for fun.
Mamu Kim
Thanks, Kim!
Hi Kim,
Do you want me to email it directly to you or just post it here? I'll just pick one of our tables for metadata (a larger one to get a variety of datatypes) and either post it or email it to you as you desire.
As always, I appreciate your expert opinions,
Bestest!
John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services
"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"
I am assuming you are doing straight table copies. Send sample jobs you want to use as a prototype. I should bill you for it but I want to see if it would be a lot of work first. So mail it to
Kim_G_Duke@hotmail.com
I will let you know how much work is involved. I will post my answer if you like. Send all the jobs to copy one table including a sequence if you want. Maybe it is just one job.
Mamu Kim
Kim's offer
Hi Kim,
I'm still in the process of building my prototype jobs with a real table. So far I've only done it with a hand-defined table to map all the datatypes within DataStage. I'll redo it with a small single table.
Please bear in mind, I'm not looking for free work from you guys, nor am I in a position of authority to commit my company to any work requiring cash expenditure. I'm perfectly willing to do the work; I just don't know how to insert the metadata any way other than Import. If I had a budget, I'd be happy to pay for the assistance, but I'm just little old me.
Anywho, I'll forward what I'm doing if you feel like looking at it.
Thanks!
Bestest!
John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services
"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"
Request for samples from versions other than Server
Hi all!
Would it be possible for anyone to send me a sample export file from systems/versions other than V.7.5.1a Server Edition? I would like to compare the generated export files to what I currently have to see what differences I have to allow for with other versions. Maybe a sample from Hawk would be possible as well???
I will probably make it work for just one system first (my own, of course - selfish me). I have set up templates for sections of the export file in my code. To start, the process will be pretty reliant on some prepared files to tell the code what the .dsx file should look like for your system (DSN's, etc.). If anyone has ideas on approaches to smooth the information gathering, I'm open to any input. I figured I would prove it can work at all first, then improve it.
Anywho, samples of other .dsx files would be wonderful. I'm specifically looking for two jobs in the file:
1. SRC --> xfm --> Sequential File
2. Sequential file --> xfm --> TGT
I would like no transformations as this is just to build a straight across pull. My initial assumptions are going to be as follows:
- Source db and tables exist
- Target db and tables exist
- metadata is currently loaded in DataStage (though I want this process to do this for us from the start if possible - but for now....)
- Source and target databases have the same table structure
- DSN's are established (duh)
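To be concrete about the prepared files I mentioned above, I'm picturing simple name=value lists, one per target system; every value here is invented, just to show the shape:

```text
SrcDSN=DSN_SOURCE
TgtDSN=DSN_TARGET
SrcTable=PS_PROD_PRICE
TgtTable=PS_PROD_PRICE
SchemaOwner=SYSADM
WorkDir=/tmp/kgen
```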
I have started the design process on this project now and I'm just looking to fill in the blanks that are outside my limited experience. The email address for these or anything else would be:
DataStage_Growth@hotmail.com.
I appreciate your input as always!
Bestest,
Bestest!
John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services
"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"
Hi, on my current project we are playing with using the propagate facility. It is at an initial stage, but using this we can generate OSH schema files from a file of metadata. Then we create one DataStage job that specifies the OSH schema file as a parameter, as well as all the other parameters for database, table name, etc.
In the job we don't specify any column metadata.
Then we can run the job over and over, just changing the parameters, and the column data is propagated through.
As I said, it is at an early stage, but it seems to work. I have also done a template job with a simple join in this way, by passing the join key as a parameter.
Still playing at this stage, but some of you real experts might have more comments on this.
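To give an idea of what these schema files look like, a minimal OSH schema might be something like this (field names invented, not from our project):

```text
record (
    CUSTOMER_ID: int32;
    CUSTOMER_NAME: nullable string[max=30];
    CREATED_DT: timestamp;
)
```

The job reads the schema file name as a parameter, so the one compiled job can load any table whose schema file exists.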
Regards
Peter
Let me explain how I did this. First you assume a straight table copy. This means source column names equal target column names. Next you assume the metadata stays the same. I save all the table definitions by importing them. These get stored in DS_METADATA. Ray posted a routine which exposed the data structure of this table. Next you have to examine the DSX file to see where the table names, column names and SQL are stored. Sometimes it is easier to replace user-defined SQL and sometimes it is easier to replace generated SQL. Do both and look at the DSX file.
First copy your DSX file to your DataStage server, below the project:

Code:
cd ProjectDir
mkdir KGen
ftp the DSX file into this directory

Next you need to find something unique in the DSX so you can search for it. You need to be pretty good with BASIC to understand this code. FINDSTR will find any attribute in a record; you use it to locate whatever you want to replace.

Next create a routine or batch job to read the DSX, change it to what you want, and then write it back out. Open your KGen directory to read in your DSX file. I called this KGen for Kim's Generator, as a play on Cajun, my favorite food.

Code:
openpath 'KGen' to KGen else
   ErrMsg = "Error: Unable to open DsxFiles "
   gosub ErrRtn
   goto TheEnd
end

Now do your global replaces.

Code:
DEFFUN UpperLowerName(OldName) CALLING "DSU.UpperLowerName"
DEFFUN KgdCreateDdlMetadata(DsMetadataId, OracleStyle) CALLING "DSU.KgdCreateDdlMetadata"
* ------------------------------------------------------------
* Main
* ------------------------------------------------------------
DsxName = 'ExtractPsProdPrice.dsx'
NewDsxName = change(DsxName, 'PsProdPrice', UpperLowerName(SourceTable))
* ------------------------------------------------------------
MyTable = 'PS_PROD_PRICE'
ThingsToChange<-1> = upcase(MyTable)
ThingsToChange<-1> = downcase(MyTable)
ThingsToChange<-1> = UpperLowerName(MyTable)
* ------------------------------------------------------------
ReplaceWith<-1> = upcase(SourceTable)
ReplaceWith<-1> = downcase(SourceTable)
ReplaceWith<-1> = UpperLowerName(SourceTable)
NoOfThings = dcount(ThingsToChange, @FM)
* ------------------------------------------------------------
TargetTableShort = TargetTable
ReplaceWith<-1> = upcase(TargetTableShort)
ReplaceWith<-1> = downcase(TargetTableShort)
ReplaceWith<-1> = UpperLowerName(TargetTableShort)
* ------------------------------------------------------------
gosub ProcessLine
* ------------------------------------------------------------
goto TheEnd
* ------------------------------------------------------------
* Change PsProdPrice into NewDim
* ------------------------------------------------------------
ProcessLine:
for j=1 to NoOfThings
NewDsxLine = change(NewDsxLine, ThingsToChange<j>, ReplaceWith<j>)
next j
return
* ------------------------------------------------------------
ErrRtn:
Call DSLogInfo(ErrMsg , "JobControl")
* Call DSLogFatal(ErrMsg, "JobControl")
* abort
return
* ------------------------------------------------------------
TheEnd:
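The overall flow above (read the template DSX, swap every casing variant of the table name, write it back out) can be sketched in Python for anyone who is not BASIC fluent. Everything below is illustrative only; the sample DSX line and table names are made up, and this is not the actual routine:

```python
def upper_lower(name):
    """Mimic UpperLowerName(): 'PS_PROD_PRICE' -> 'PsProdPrice'."""
    return ''.join(part.capitalize() for part in name.split('_'))

def make_dsx(template_text, old_table, new_table):
    """Clone a template DSX, replacing every casing variant of the table name."""
    pairs = [
        (old_table.upper(), new_table.upper()),
        (old_table.lower(), new_table.lower()),
        (upper_lower(old_table), upper_lower(new_table)),
    ]
    for old, new in pairs:
        template_text = template_text.replace(old, new)
    return template_text

# Hypothetical fragment of a DSX job line:
line = 'Identifier "ExtractPsProdPrice"'
print(make_dsx(line, 'PS_PROD_PRICE', 'NEW_DIM'))  # Identifier "ExtractNewDim"
```

In the real routine the same replaces run over every line of the DSX record read from the KGen directory, which is what the ProcessLine subroutine above does.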
Mamu Kim
Next you need to find the column metadata in your DSX. Once you find where columns start in the DSX file then you need to strip out the old columns and insert the new columns.
This code pulls out all the critical fields in the DSX so you can figure out where in the DSX you are.
Code:
* ------------------------------------------------------------
read DsxRec from KGen, DsxName then
NoOfLines = dcount(DsxRec, @FM)
for i=1 to NoOfLines
* ------------------------------------------------------------
* get OLEType, SectionName ...
* ------------------------------------------------------------
gosub EvaluateLine
* ------------------------------------------------------------
* Change PsProdPrice into TargetTable
* ------------------------------------------------------------
gosub ProcessLine
* ------------------------------------------------------------
* Do more complicated substitutions
* ------------------------------------------------------------
EndOfFirstWord = index(NewDsxLine,FirstWord,1)+len(FirstWord)-1
BeginOfSecondWord = index(NewDsxLine,SecondWord,1)
Begin Case
Case SectionName = 'HEADER' and FirstWord = 'Date'
NewDsxLine = NewDsxLine[1, BeginOfSecondWord-1] : DQuote(Today)
* Case Identifier = 'ROOT' and FirstWord = 'Description'
* NewDsxLine = NewDsxLine[1, BeginOfSecondWord-1] : '"':@LOGNAME:' ':Today:' created."'
* Case OLEType = 'CJobDefn' and FirstWord = 'Category'
* NewDsxLine = NewDsxLine[1, EndOfFirstWord]:' "UserDefined\\KimD\\':UpperLowerName(TargetTable):'"'
* NewDsxLine = NewDsxLine[1, EndOfFirstWord]:' "UserDefined\\KimD"'
Case OLEType = 'CTrxOutput' and FirstWord = 'Columns'
gosub GenTrxOutput
Case OLEType = 'CCustomOutput' and SecondWord = '"CCustomProperty"'
gosub GenCCustomProperty
End Case
Code:
* ------------------------------------------------------------
* get OLEType, SectionName ...
* ------------------------------------------------------------
EvaluateLine:
DsxLine = DsxRec<i>
NewDsxLine = convert(CrLf, '', DsxLine)
RealFirstWord = field(NewDsxLine, ' ', 1)
RealSecondWord = field(NewDsxLine, ' ', 2)
FirstWord = field(trim(NewDsxLine), ' ', 1)
SecondWord = field(trim(NewDsxLine), ' ', 2)
EndOfFirstWord = index(NewDsxLine,FirstWord,1)+len(FirstWord)-1
if RealFirstWord = 'BEGIN' then
SectionName = RealSecondWord
LogMsg = "SectionName : ":SectionName
if DebugSw then Call DSLogInfo(LogMsg, "JobControl")
end else
Begin Case
Case FirstWord = 'BEGIN'
SubSection = SecondWord
LogMsg = "SubSection : ":SubSection
if DebugSw and i<50 then Call DSLogInfo(LogMsg, "JobControl")
Case FirstWord = 'Identifier'
Identifier = convert('"' : "'", '', SecondWord)
LogMsg = "Identifier : ":Identifier
if DebugSw and i<50 then Call DSLogInfo(LogMsg, "JobControl")
Case FirstWord = 'OLEType'
OLEType = convert('"' : "'", '', SecondWord)
LogMsg = "OLEType : ":OLEType
if DebugSw and i<50 then Call DSLogInfo(LogMsg, "JobControl")
End Case
end
LogMsg = i "5R. ":DsxLine[1, 50]
if DebugSw then Call DSLogInfo(LogMsg, "JobControl")
NewDsxLine = DsxLine
return
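For what EvaluateLine is doing, here is a rough Python equivalent: walk the DSX line by line, track the current SectionName, Identifier and OLEType from the first word of each line, and hand that state to the substitution logic. Again this is just a sketch with an invented sample input, not the real code:

```python
def evaluate_lines(dsx_text):
    """Classify each DSX line, carrying forward the section/stage state."""
    state = {'SectionName': '', 'Identifier': '', 'OLEType': ''}
    events = []
    for raw in dsx_text.splitlines():
        words = raw.split()
        first = words[0] if words else ''
        second = words[1] if len(words) > 1 else ''
        if not raw.startswith(' ') and first == 'BEGIN':
            state['SectionName'] = second        # top-level BEGIN only
        elif first == 'Identifier':
            state['Identifier'] = second.strip('"\'')
        elif first == 'OLEType':
            state['OLEType'] = second.strip('"\'')
        events.append(dict(state))               # snapshot per line
    return events

sample = 'BEGIN DSJOB\n   Identifier "CopyTable"\n   OLEType "CJobDefn"'
print(evaluate_lines(sample)[-1])
```

The snapshot per line is what lets the Case statement earlier in the routine say things like "OLEType = 'CTrxOutput' and FirstWord = 'Columns'" to know it has reached the column list of a Transformer output link.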
Mamu Kim
Next you need to read the metadata you need from DS_METADATA and substitute it into the DSX for each column.
Code:
* ------------------------------------------------------------
open 'DS_METADATA' to DsMetadata else
ErrMsg = "Error: Unable to open DS_METADATA."
gosub ErrRtn
goto TheEnd
end
Code:
DsMetadataIdSource = SourceFolder : upcase(SourceTable)
DsMetadataId = DsMetadataIdSource
gosub GetMetadata
if not(Found) then
ErrMsg = LogMsg
gosub ErrRtn
goto TheEnd
end
DsMetadataRecSource = DsMetadataRec
Code:
* ------------------------------------------------------------
GetMetadata:
AttrNo = 11
read DsMetadataRec from DsMetadata, DsMetadataId then
Found = @TRUE
LogMsg = "Read: ":DsMetadataId:" in DS_METADATA."
if DebugSw then Call DSLogInfo(LogMsg, "JobControl")
NoOfCols = Field(DsMetadataRec<AttrNo>, "/", 2, 1)
if DebugSw then
for k=1 to NoOfCols
LogMsg = k "3R. ":DsMetadataRec<AttrNo+1, k>
Call DSLogInfo(LogMsg, "JobControl")
next k
end
end else
Found = @FALSE
LogMsg = "Error: No ":DsMetadataId:" in DS_METADATA."
Call DSLogInfo(LogMsg, "JobControl")
DsMetadataRec = ''
NoOfCols = 0
end
return
Code:
* ------------------------------------------------------------
GetColMetadata:
AttrNo = 11
locate ColName in DsMetadataRec<AttrNo+1> setting MvNo then
Found = @TRUE
LogMsg = "Read: Column: ":ColName:" in ":DsMetadataId
if DebugSw then Call DSLogInfo(LogMsg, "JobControl")
* ColName = DsMetadataRec<AttrNo+1, MvNo>
ColDesc = DsMetadataRec<AttrNo+2, MvNo>
ColDataElement = DsMetadataRec<AttrNo+3, MvNo>
ColSQLDataTypeCode = DsMetadataRec<AttrNo+4, MvNo>
ColPrecision = DsMetadataRec<AttrNo+5, MvNo>
ColScale = DsMetadataRec<AttrNo+6, MvNo>
ColIsNullable = DsMetadataRec<AttrNo+7, MvNo>
ColKeyPosition = DsMetadataRec<AttrNo+8, MvNo>
ColDisplayWidth = DsMetadataRec<AttrNo+9, MvNo>
if ColDisplayWidth > 15 then ColDisplayWidth = 15
end else
Found = @FALSE
LogMsg = "Error: No Column: ":ColName:" in ":DsMetadataId
Call DSLogInfo(LogMsg, "JobControl")
ColDesc = '<none>'
ColDataElement = ''
ColSQLDataTypeCode = '12'
ColPrecision = '10'
ColScale = '0'
ColIsNullable = '1'
ColKeyPosition = '0'
ColDisplayWidth = '10'
end
return
Mamu Kim
Now you have all the pieces to put together a solution. You need to figure out the column types. They differ by OLEType, which represents the stage type. A column type of 1 in ODBC may be an integer, but in an Oracle 8 stage it may be a varchar. You have to figure this out by looking in Designer and seeing what shows up in the DSX. This takes time.
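For the ODBC case, the codes in DS_METADATA appear to be the standard ODBC SQL type codes (code 12 is VARCHAR, which matches the default ColSQLDataTypeCode of '12' in GetColMetadata above). So one way to organize the mapping is a base lookup table plus per-stage overrides. The override entries below are invented placeholders; build the real ones by comparing Designer against the DSX, as described above:

```python
# Standard ODBC SQL type codes (sqlext.h values).
ODBC_SQL_TYPES = {
    1: 'CHAR', 2: 'NUMERIC', 3: 'DECIMAL', 4: 'INTEGER',
    5: 'SMALLINT', 6: 'FLOAT', 7: 'REAL', 8: 'DOUBLE', 12: 'VARCHAR',
}

def type_name(code, ole_type, overrides=None):
    """Resolve a SQL type code, honouring per-stage-type overrides."""
    overrides = overrides or {}
    return overrides.get((ole_type, code), ODBC_SQL_TYPES.get(code, 'UNKNOWN'))

# Hypothetical: suppose inspection showed an Oracle stage (OLEType made up
# here) renders code 1 as VARCHAR2 rather than CHAR:
oracle_overrides = {('COra8Stage', 1): 'VARCHAR2'}
print(type_name(1, 'COra8Stage', oracle_overrides))  # VARCHAR2
```

The point of the override map is exactly the problem described above: the same numeric code can render differently per stage type, so the stage's OLEType has to be part of the lookup key.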
I am posting all this because I doubt I have the time to figure out this solution. It was a lot of work the first time. It was a lot of fun, though.
If you really need this solution, where your company does a lot of data warehouse projects and you have a standard set of jobs, then this can save you weeks of work. Plus, if you redesign, then maybe you can rebuild all of your jobs in one fourth the time.
My goal with all my tools was to reduce the time to delivery for a startup data warehouse, or to reduce the time it takes to re-engineer a data warehouse. This is one of my tricks. Basically you build a really good prototype to load one table and you generate all the other table load jobs from that one prototype. IBM plans on doing this with Rational Architect. So they are listening.
I wish I had more money. I could have had this done years ago. I had crude versions of all this years ago; I just could not put together a polished product soon enough. I used the crude versions with my clients but would never try to sell something that crude. Too complex to explain, too. The documentation stuff is a lot easier to explain and support. The window has closed on this type of product, or is closing. IBM has too much money and resources.
Mamu Kim
Wow!
Thanks for your input on this, Kim.
That's a lot of code. I will dig through it and try to get my brain wrapped around it. I'm not DS Basic fluent yet, so it may take me a bit. I have already been working on getting the datatype mappings based on connection type by setting up an extract job to the different engines with a manually created table that has all datatypes in order for the fields. It has been a bit of a challenge to map, but DS keeps things fairly consistent so I believe I can make it fairly uniform.
One thing for sure, when I finally meet up with you (it'll happen sometime I'm sure), lunch/dinner is on me
Bestest!
John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services
"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"