Validate each and every field in a record

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
poorna_76
Charter Member
Charter Member
Posts: 190
Joined: Thu Jul 08, 2004 10:42 am

Validate each and every field in a record

Post by poorna_76 »

I have a scenario like this:

Field1,Field2,Field3,Field4,Field5,Field6...................Field150.

Here I have to validate each and every field against some Business Logic.

If Field1 is not numeric write out to error table and
if Field2 is not numeric write out to error table and so on.

In worst case scenario, if I have error in all the fields in that record.
I have to write out 150 records in the output table.

For 1 record - I have to write out 150 records.

ErrorField, ErrorDesc
Field1, Invalid ID
Field2, Invalid Name
Field3, Invalid Date
.
.
.
.
.
.
.
.
Field150, InvalidXYZ

We thought of StageVariables and Pivot.

Is it the only way/ Is there a better way to do it?

What is the best way to do it in DataStage?

Any Thoughts/Recommendations are appreciated.

Thanks in Advance.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
I wrote a FAQ example that might be used to concatenate all this info into 1 write operation.

look it up.

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
poorna_76
Charter Member
Charter Member
Posts: 190
Joined: Thu Jul 08, 2004 10:42 am

Post by poorna_76 »

Roy,
Sorry to ask this,
where can i find that.

Thanks & Regards
roy wrote:Hi,
I wrote a FAQ example that might be used to concatenate all this info into 1 write operation.

look it up.

IHTH,
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

I have 2 routines which will work on small sequential files because it reads the whole sequential file into one record. If you have large sequential files then you need to change this to openseq and readseq.

Code: Select all

* -----------------------------------------------------------------
* KgdParseSeqFile(Path, FileName)
* Decription: Parse Sequential File.
* Written by: Kim Duke
* -----------------------------------------------------------------
* Notes:
* -----------------------------------------------------------------
      Ans = ''
* -----------------------------------------------------------------
* initialize standard variables
* -----------------------------------------------------------------
      openpath Path to DirPtr else
         ErrMsg = "Error: Bad Path = ":Path
         goto ErrRtn
      end
      ColumnNames = ''
      read Rec from DirPtr, FileName then
         NoColumns = dcount(Rec<1>, ',')
         NoLines = dcount(Rec, @FM)
         NoLines -= 1
         print Rec<NoLines>
         for j=1 to NoColumns
            ColumnName = field(Rec<1>, ',', j)
            Ans<1, j> = ColumnName
            Value = field(Rec<NoLines>, ',', j)
            TheLen = len(Value)
            if Value[1,1]='"' then TheLen -= 2
            Ans<2, j> = Value
            Ans<3, j> = TheLen
            print j 'R#5':'. ':ColumnName:'(':TheLen:') = ':Value
         next j
      end else
         ErrMsg = 'Error: Bad FileName = ':FileName
         goto ErrRtn
      end
      goto TheEnd
ErrRtn:
      print
      print ErrMsg
      print
TheEnd:
      print Ans
Run this routine in test mode and double click on results.
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

This is the second routine. It is a lot more complicated. If you save your metadata in Designer then it stores it in DS_METADATA. We were having an issue were one source column was wider than the target. The job gave a metadata mismatch warning. This routine will help debug this situation by comparing the column definition saved to the actual data. So what I did was route the data to sequential file even if the original design was an Oracle table. So when the job failed it would compare column length to actual length and show you the actual value on the next to last row.

Code: Select all

* -----------------------------------------------------------------
* KgdDebugMetadata(Path, FileName, DsMetadataId)
* Decription: Debug Metadata.
* Written by: Kim Duke
* Notes:
* -----------------------------------------------------------------
* $INCLUDE DSINCLUDE JOBCONTROL.H
      Ans = ''
      RoutineName = "KgdDebugMetadata"
      Usage = RoutineName:'(Path, FileName, DsMetadataId)'
* -----------------------------------------------------------------
* initialize variables
* -----------------------------------------------------------------
      True = 1
      False = 0
      Other = True
      Forever = True
      Today = oconv(@DATE, "D4-")
      ErrMsg = ''
      NullRec = ''
      CrLf = char(13):char(10)
      DsMetadataSep = '\'
      DebugMode = False
      OracleStyle = 1
* -----------------------------------------------------------------
      HashFile = 'DS_METADATA'
      open HashFile to FilePtr else
         ErrMsg = 'Can not open ':HashFile:' file.'
         goto ErrRtn
      end
      DsMetadataPtr = FilePtr
* -----------------------------------------------------------------
      MdColumnNames = ''
      Precisions = ''
      DataTypes = ''
      DDL = ''
      PKeys = ''
* -----------------------------------------------------------------
      read DsMetadataRec from DsMetadataPtr, DsMetadataId then
         Occurrence = 1
* -----------------------------------------------------------------
FindColumns:
         FindStr "CMetaColumn" In DsMetadataRec, Occurrence Setting AttrNo, MvNo, SvNo Then
* -----------------------------------------------------------------
* number of column definitions in collection
* -----------------------------------------------------------------
            NoOfFields = Field(DsMetadataRec<AttrNo>, "/", 2, 1)
            If NoOfFields = 0 Then
               * Stored procedures use the second columns collection in the record.
               * Go get the next one.
               Occurrence += 1
               GoTo FindColumns
            End
            TableName = field(DsMetadataId, DsMetadataSep, dcount(DsMetadataId, DsMetadataSep))
            DDL<-1> = 'Create Table ':TableName
            DDL<-1> = '('
            Indent = spaces(3*1)
            * NoOfFields = dcount(DsMetadataRec<AttrNo+1>,
            for MvNo=1 to NoOfFields
               ColumnName = DsMetadataRec<AttrNo+1, MvNo>
               MdColumnNames<1, MvNo> = ColumnName
               if DebugMode then print 'ColumnName=':ColumnName
               ColumnDescription = DsMetadataRec<AttrNo+2, MvNo>
               DataElement = DsMetadataRec<AttrNo+3, MvNo>
               if DebugMode then print 'DataElement=':DataElement
               SQLDataTypeCode = DsMetadataRec<AttrNo+4, MvNo>
               if DebugMode then print 'SQLDataTypeCode=':SQLDataTypeCode
               Precision = DsMetadataRec<AttrNo+5, MvNo>
               Scale = DsMetadataRec<AttrNo+6, MvNo>
               IsNullable = DsMetadataRec<AttrNo+7, MvNo>
               if DebugMode then print 'IsNullable=':IsNullable
               KeyPosition = DsMetadataRec<AttrNo+8, MvNo>
               DisplayWidth = DsMetadataRec<AttrNo+9, MvNo>
               Begin case
                  case SQLDataTypeCode = -7
                     SQLType = "BIT"
                     OracleType = "VARCHAR2"
                  case SQLDataTypeCode = -6
                     SQLType = "TINYINT"
                     OracleType = "NUMBER"
                  case SQLDataTypeCode = -5
                     SQLType = "BIGINT"
                     OracleType = "NUMBER"
                  case SQLDataTypeCode = -4
                     SQLType = "LONGVARBINARY"
                     OracleType = "VARCHAR2"
                  case SQLDataTypeCode = -3
                     SQLType = "VARBINARY"
                     OracleType = "VARCHAR2"
                  case SQLDataTypeCode = -2
                     SQLType = "BINARY"
                     OracleType = "VARCHAR2"
                  case SQLDataTypeCode = -1
                     SQLType = "LONGVARCHAR"
                     OracleType = "VARCHAR2"
                  case SQLDataTypeCode = 0
                     SQLType = "VARCHAR"
                     OracleType = "VARCHAR2"
                  case SQLDataTypeCode = 1
                     SQLType = "CHAR"
                     OracleType = "CHAR"
                  case SQLDataTypeCode = 2
                     SQLType = "NUMERIC"
                     OracleType = "NUMBER"
                  case SQLDataTypeCode = 3
                     SQLType = "NUMERIC"
                     OracleType = "NUMBER"
                  case SQLDataTypeCode = 4
                     SQLType = "INTEGER"
                     OracleType = "NUMBER"
                  case SQLDataTypeCode = 5
                     SQLType = "SMALLINT"
                     OracleType = "NUMBER"
                  case SQLDataTypeCode = 6
                     SQLType = "FLOAT"
                     OracleType = "NUMBER"
                  case SQLDataTypeCode = 7
                     SQLType = "REAL"
                     OracleType = "NUMBER"
                  case SQLDataTypeCode = 8
                     SQLType = "DOUBLE"
                     OracleType = "NUMBER"
                  case SQLDataTypeCode = 9
                     SQLType = "DATE"
                     OracleType = "DATE"
                  case SQLDataTypeCode = 10
                     SQLType = "TIME"
                     OracleType = "DATE"
                  case SQLDataTypeCode = 11
                     SQLType = "TIMESTAMP"
                     OracleType = "DATE"
                  case SQLDataTypeCode = 12
                     SQLType = "VARCHAR"
                     OracleType = "VARCHAR2"
                  case True
                     SQLType = "VARCHAR"
                     OracleType = "VARCHAR2"
               end case
               if OracleStyle then
                  If OracleType = "NUMBER" Then
                     If Precision = @NULL Then
                        DataType = OracleType
                     End Else
                        If Scale = @NULL Then
                           DataType = OracleType : "(" : Precision : ")"
                        End Else
                           DataType = OracleType : "(" : Precision : "," : Scale : ")"
                        End
                     End
                  End Else
                     if OracleType = "DATE" then
                        DataType = OracleType
                     end else
                        DataType = OracleType : "(" : Precision : ")"
                     end
                  End
               end else
                  If SQLType = "NUMERIC" Then
                     If Precision = @NULL Then
                        DataType = SQLType
                     End Else
                        If Scale = @NULL Then
                           DataType = SQLType : "(" : Precision : ")"
                        End Else
                           DataType = SQLType : "(" : Precision : "," : Scale : ")"
                        End
                     End
                  End Else
                     if SQLType = "DATE" or SQLType = "TIME" or SQLType = "TIMESTAMP" then
                        DataType = SQLType
                     end else
                        DataType = SQLType : "(" : Precision : ")"
                     end
                  End
               end
               if DebugMode then print 'DataType=':DataType
               DataTypes<1, MvNo> = DataType
               Precisions<1, MvNo> = Precision
               if IsNullable then
                  NullAble = ''
               end else
                  NullAble = ' NOT NULL'
               end
               if DebugMode then print 'NullAble=':NullAble
               if MvNo = 1 then
                  DDL<-1> = Indent:'  ':ColumnName "L#40":' ':DataType:NullAble
               end else
                  DDL<-1> = Indent:', ':ColumnName "L#40":' ':DataType:NullAble
               end
               if DebugMode then print 'KeyPosition=':KeyPosition
               if KeyPosition then
                  PKeys<-1> = ColumnName
               end
            next MvNo
            if not(OracleStyle) then
               Indent = spaces(3*1)
               DDL<-1> = Indent:', Primary Key ('
               NoOfLines = dcount(PKeys, @FM)
               for i=1 to NoOfLines
                  Indent = spaces(3*2)
                  if DebugMode then print PKeys<i>
                  if i =1 then
                     DDL<-1> = Indent:'  ':PKeys<i>
                  end else
                     DDL<-1> = Indent:', ':PKeys<i>
                  end
               next i
               Indent = spaces(3*1)
               DDL<-1> = Indent:')'
            end
            DDL<-1> = ')'
            DDL<-1> = ';'
            if OracleStyle then
               DDL<-1> = ' '
               Indent = spaces(3*1)
               DDL<-1> = 'Alter Table ':TableName:' Add ('
               DDL<-1> = Indent:'Primary Key ('
               NoOfLines = dcount(PKeys, @FM)
               for i=1 to NoOfLines
                  Indent = spaces(3*2)
                  if DebugMode then print PKeys<i>
                  if i =1 then
                     DDL<-1> = Indent:'  ':PKeys<i>
                  end else
                     DDL<-1> = Indent:', ':PKeys<i>
                  end
               next i
               Indent = spaces(3*1)
               DDL<-1> = Indent:')'
               DDL<-1> = ')'
               DDL<-1> = ';'
            end

            print
            NoOfLines = dcount(DDL, @FM)
            for i=1 to NoOfLines
               print DDL<i>
            next i

            * Ans = DDL
         end else
            ErrMsg = 'Could not find CMetaColumn in ':DsMetadataId
            goto ErrRtn
         end
      end else
         ErrMsg = 'Could not find ':DsMetadataId:' in DS_METADATA.'
         goto ErrRtn
      end

      openpath Path to DirPtr else
         ErrMsg = "Error: Bad Path = ":Path
         goto ErrRtn
      end
      ColumnNames = ''
      read Rec from DirPtr, FileName then
         NoColumns = dcount(Rec<1>, ',')
         NoLines = dcount(Rec, @FM)
         NoLines -= 1
         print Rec<NoLines>
         for j=1 to NoColumns
            ColumnName = field(Rec<1>, ',', j)
            * Ans<1, j> = ColumnName
            Value = field(Rec<NoLines>, ',', j)
            TheLen = len(Value)
            if Value[1,1]='"' then TheLen -= 2
            * Ans<2, j> = Value
            * Ans<3, j> = TheLen
            print j 'R#5':'. ':ColumnName:'(':TheLen:') = ':Value
            convert '"' to '' in ColumnName
            locate ColumnName in MdColumnNames<1> setting MvNo then 
               if TheLen > Precisions<1, MvNo> then
                  print ColumnName:' longer than metadata: ':Precisions<1, MvNo>
                  Ans<-1> = ColumnName:' longer than metadata: ':Precisions<1, MvNo>
               end
            end else 
               print ColumnName:' not in metadata.'
            end
         next j
      end else
         ErrMsg = 'Error: Bad FileName = ':FileName
         goto ErrRtn
      end

      goto TheEnd
* -----------------------------------------------------------------
ErrRtn:
      if DebugMode then print
      if DebugMode then print ErrMsg
      if DebugMode then print
      Ans = ErrMsg
* -----------------------------------------------------------------
TheEnd:
      print
      print
      if DebugMode then print Ans
      print
Mamu Kim
Post Reply