I need a DSX-Cutter

victorbos
Participant
Posts: 24
Joined: Tue Jul 15, 2003 2:05 am

I need a DSX-Cutter

Post by victorbos »

Hi all,

As said in this thread: viewtopic.php?t=85488

...what I need is a DSX cutter: a little program to cut a DSX file into small pieces, one file for each object.
A program in Perl would be great because we can integrate it with our version control tool.

tia,

Victor.
AndrewWebbUK
Participant
Posts: 17
Joined: Sun Sep 14, 2003 6:14 am

Re: I need a DSX-Cutter

Post by AndrewWebbUK »

Why don't you export the DSX in XML format, and then use the XML reader/writer to reorganise it?
Andrew Webb
Principal SE Ascential UK

www.ascential.com
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Victor

If you open the file in an editor you can figure out the layout. If you search this web site, you will see we have discussed this before. There is a header record at the very start of the file, which you need to add to the beginning of each job to have a complete DSX for that job. I wrote one in VB in about an hour.
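A minimal Perl sketch of the idea described above. It assumes the export starts with a `BEGIN HEADER` ... `END HEADER` block, that each job sits between `BEGIN DSJOB` and `END DSJOB` lines, and that the first `Identifier "..."` line inside a job names it. The function name `split_dsx_jobs` is hypothetical, and routines are not handled here.

```perl
use strict;
use warnings;

# Split DSX text into one complete DSX string per job.
# Returns a hash: job name => header block followed by that job's block.
sub split_dsx_jobs {
    my ($dsx_text) = @_;
    my (%jobs, @header, @job, $name);
    my $state = 'outside';

    # split /^/ keeps the line endings on each element
    for my $line (split /^/, $dsx_text) {
        if ($state eq 'outside') {
            if ($line =~ /^BEGIN HEADER/) {
                $state  = 'header';
                @header = ($line);
            }
            elsif ($line =~ /^BEGIN DSJOB/) {
                $state = 'job';
                @job   = ($line);
                $name  = undef;
            }
        }
        elsif ($state eq 'header') {
            push @header, $line;
            $state = 'outside' if $line =~ /^END HEADER/;
        }
        else {    # inside a job block
            push @job, $line;
            # The first Identifier line after BEGIN DSJOB is taken as the job name.
            if (!defined $name && $line =~ /^\s+Identifier "([^"]*)"/) {
                $name = $1;
            }
            if ($line =~ /^END DSJOB/) {
                $jobs{$name} = join('', @header, @job) if defined $name;
                $state = 'outside';
            }
        }
    }
    return %jobs;
}
```

Each value in the returned hash could then be written to its own file, named after the job.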

Kim.
Mamu Kim
scboyce
Participant
Posts: 9
Joined: Mon Nov 03, 2003 10:18 am
Location: Tampa, FL

Re: I need a DSX-Cutter

Post by scboyce »

I created two Perl scripts a while ago to do just this. It's great for source code control since you typically want to do change control at the job/routine level.

ParseDSX.pl will split a DSX file into separate DSX files, one per job/routine in the same folder structure that they appear in DataStage.

CatDSX.pl goes the other way. It combines multiple DSX files into one or more DSX files of roughly equal size, suitable for migration importing.
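One way to spread files evenly, and roughly what the CatDSX.pl listing further down appears to do, is to sort the files by size (largest first) and deal them out round-robin to the target files. A minimal sketch, where the function name `assign_parts` and the hash-of-sizes input are hypothetical and the tie-break by name is an addition to make the result repeatable:

```perl
use strict;
use warnings;

# Assign each file to one of $parts target files.
# %$sizes maps file name => size in bytes.
# Returns a hash: file name => target part number (1-based).
sub assign_parts {
    my ($sizes, $parts) = @_;
    my %part_of;
    my $next = 1;

    # Largest files first; ties broken by name so the result is repeatable.
    for my $file (sort { $sizes->{$b} <=> $sizes->{$a} || $a cmp $b } keys %$sizes) {
        $part_of{$file} = $next;
        $next = $next % $parts + 1;    # wrap back to part 1 after the last part
    }
    return %part_of;
}
```

Round-robin on a size-sorted list does not balance perfectly, but it is simple and keeps the largest files apart.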

I was going to spend some time documenting these before posting them, because the ParseDSX script does a few more things that you may or may not like.

A few things you need to know about ParseDSX:
I wanted to be able to compare checked-in versions of DSX files (jobs or routines) with freshly exported versions from production projects. This is crucial for auditing your source code repository and confirming that your migration procedures are working and that "freelance" edits are not happening in production. Some data elements in the DSX are generated at export time, such as the export date and time and the job's last edit date and time; ParseDSX masks these out with generic values. This lets you do file compares without getting a lot of "false positive" changes.
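The masking step can be sketched on its own as follows. This is a simplified illustration, not the script itself: the function name `mask_volatile` is hypothetical, the placeholder values mirror the constants used in the listing below, and only the `DateModified` and `TimeModified` lines are handled.

```perl
use strict;
use warnings;

# Placeholder values; any fixed strings would do as long as they never change.
my $STANDARD_DATE = '2001-01-01';
my $STANDARD_TIME = '01.00.00';

# Return a copy of one DSX line with export-time values replaced by placeholders.
sub mask_volatile {
    my ($line) = @_;
    if ($line =~ /^\s*DateModified /) {
        $line =~ s/"[^"]*"/"$STANDARD_DATE"/;    # replace the first quoted value
    }
    elsif ($line =~ /^\s*TimeModified /) {
        $line =~ s/"[^"]*"/"$STANDARD_TIME"/;
    }
    return $line;
}
```

Running every line of two exports through the same filter before comparing them means differences in those two fields no longer show up as changes.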

Also, default parameter values are stripped out of every job whose name does not contain "PROTOTYPE". This is because these scripts were written to dovetail with Ken Bland's job control.

I really need to do a more thorough job of documenting these and will at some point. But for now, I would suggest reviewing the code and making modifications where you like.

These were developed against DataStage 5.1 DSX export files.

Here they are:

ParseDSX.pl

Code: Select all

#!/usr/bin/perl
##############################################################################
#
# Program:     ParseDSX.pl
#
# Description: See ShowBlurb function below for details
#
# === Modification History ===================================================
# Date       Author           Comments
# ---------- --------------- -------------------------------------------------
# 07-18-2002 Steve Boyce     Created.
# 08-21-2002 Steve Boyce     Exporting routines now includes binary info.
# 08-28-2002 Steve Boyce     Corrected bug relating to jobs and routines
#                            located in the root folder.  They now get
#                            created in the correct location.
# 08-28-2002 Steve Boyce     Changed default output directory to be the name
#                            of the dsx file being parsed without the
#                            extension.
#                            -s option now works.
# 10-04-2002 Steve Boyce     Eliminated -c option.  That is now the default
#                            and only behavior.
#                            Default Parameter metadata is now stripped out
#                            of all jobs except any jobs that have PROTOTYPE
#                            or Batch::UTIL in the name.
#                            Routines are unaffected.
# 11-12-2002 Steve Boyce     Added Version display option.
#                            Corrected source code generation bug.
# 02-27-2003 Steve Boyce     Added -x option to strip out ValidationStatus
# 03-21-2003 Steve Boyce     Stripping out ValidationStatus is now default
#                            behavior.
# 04-29-2003 Steve Boyce     Bumped version number
#
##############################################################################

use Getopt::Std;
use File::Basename;
my $version="2.1.00";

##############################################################################
sub ShowBlurb
{
print <<ENDOFBLURB;
Syntax:      ParseDSX.pl -h -l<ListFile> -o<OutputDir> -s -v -y <DSXFile>

Version:     $version

Description: Extracts individual jobs and routines from a DataStage export
             file.

Parameters:  <DSXFile> Name of DataStage DSX file to parse.  This file is
                       assumed to be generated from the DataStage export
                       process.

Options:     -l  job/routine list file. (future enhancement)
                 This file contains a list of jobs and routines to extract
                 from the <DSXFile>.
             -o  Explicitly specify <OutputDir> directory.
                 Default is the name of the parsed dsx file without the
                 extension in the current directory.
             -s  Extract job "Job Control Code" and routine "source"
                 code into "source files".
                 <job>.src and <routine>.src
                 These will appear in the same directory as the generated
                 dsx files in the <OutputDir> directory.
             -v  Display version information.
             -y  Force a "Yes" answer to overwrite existing <OutputDir>
                 directory prompt.
             -h  This help.

Notes:       Job and routine names are case sensitive in DataStage.  Extracted
             jobs and routines are placed in file names constructed based on
             job or routine names.  Running this utility on the Windows
             platform will ignore case and possibly consider some jobs and
             routines duplicates when the UNIX platform will not.
             It is a good practice to not rely on case as a differentiator
             for file names.
ENDOFBLURB
}

##############################################################################
sub ShowVersion
{
print <<ENDOFBLURB;
ParseDSX.pl Version $version
ENDOFBLURB
}

##############################################################################
sub DieWith
{
   my ($MessageLine) = @_;
   print "$MessageLine\nType ParseDSX.pl -h for help.\n";
   exit 1;
}

##############################################################################
sub OKToOverWriteOutputDir
{
   my ($OutPutDirectory, $opt_y) = @_;
   my $RetVal = 0;

   if ( -e $OutPutDirectory ) {
      if ( $opt_y ) {
         print "*** Warning: <OutputDir> directory ($OutPutDirectory) already exists.  Using anyway.\n";
         $RetVal = 1;
      }
      else  {
         print "*** Warning: <OutputDir> directory ($OutPutDirectory) already exists.\n";
         print "Proceed anyway? [y|n] ";
         $Ans = <STDIN>;
         chomp($Ans) if ($Ans);
         if ( "$Ans" eq "Y" || "$Ans" eq "y" ) {
            $RetVal = 1;
         }
         else  {
            DieWith("Aborting.");
         }
      }
   }
   else  {
      if ( MakeDir($OutPutDirectory, 0777) ) {   #-- mode must be octal
         $RetVal = 1;
      }
      else  {
         DieWith("Error: Could not create ($OutPutDirectory) directory");
      }
   }
   return $RetVal;
}

##############################################################################
sub LoadObjectList
{
   my ($DSXListFile) = @_;
   my %DSXObjectList = ();

   if ( $DSXListFile ) {
      if (open fhDSXListFile, "<".$DSXListFile) {
         while (<fhDSXListFile>) {
            chomp;
            #-- Record the object name as a hash key
            $DSXObjectList{$_} = 1;
         }
         close fhDSXListFile;
      }
      else  {
         DieWith("Error: Can't open $DSXListFile");
      }
      while ( ($key,$value) = each %DSXObjectList ) {
         print "$key=$value\n";
      }
   }
   return %DSXObjectList;
}

##############################################################################
sub MakeDir
{
   my ($FullDirPath, $Mode) = @_;
   my @DirList = ();
   my $PartialDirPath = "";
   my $RetVal = 1;

   $FullDirPath =~ tr/\\/\//;
   @DirList = split(/\//,$FullDirPath);
   foreach $Directory ( @DirList ) {
      $PartialDirPath = $PartialDirPath . $Directory. "/" ;
      if ( ! (length($PartialDirPath) == 3 && substr($PartialDirPath, 1, 2) eq ":/") ) {
         if ( ! -e $PartialDirPath ) {
            if ( ! mkdir($PartialDirPath, $Mode) ) {
               $RetVal = 0;
            }
         }
      }
   }
   return $RetVal;
}

##############################################################################
sub ParseQuotedString
{
   my ($InputLine) = @_;
   my $FirstQuotePos = 0;
   my $SecondQuotePos = 0;
   my $Length = 0;

   $FirstQuotePos = index($InputLine, '"');
   $SecondQuotePos = index($InputLine, '"', $FirstQuotePos+1);
   $Length = $SecondQuotePos - $FirstQuotePos;

   return substr($InputLine, $FirstQuotePos + 1, $Length - 1);
}

##############################################################################
sub MakeDuplicateName
{
   my ($OriginalName) = @_;
   my $NewName = "";
   my $DupSuffix = 1;

   $NewName = $OriginalName . "_dup" . "$DupSuffix";
   while ( -e $NewName ) {
      if ( $DupSuffix > 99 ) {
         DieWith("Error: There seem to be more than 99 duplicate jobs or routines.");
      }
      $DupSuffix += 1;
      $NewName = $OriginalName . "_dup" . "$DupSuffix";
   }
   return $NewName;
}

##############################################################################
sub WriteDSXHeader
{
   my ($fhOutputFile) = @_;

   print $fhOutputFile "BEGIN HEADER\n";
   print $fhOutputFile "   CharacterSet \"ENGLISH\"\n";
   print $fhOutputFile "   ExportingTool \"Ardent DataStage Export\"\n";
   print $fhOutputFile "   ToolVersion \"3\"\n";
   print $fhOutputFile "   ServerName \"$cStandardServerName\"\n";
   print $fhOutputFile "   ToolInstanceID \"$cStandardToolInstanceID\"\n";
   print $fhOutputFile "   MDISVersion \"1.0\"\n";
   print $fhOutputFile "   Date \"$cStandardDate\"\n";
   print $fhOutputFile "   Time \"$cStandardTime\"\n";
   print $fhOutputFile "END HEADER\n";
}

##############################################################################
sub WriteDSXObjectFile
{
   my ($ObjectType, $tmpDSXObjectHolder, $OutPutDirectory, $DSXObjectName, $DSXCategoryName) = @_;
   my $x = 0;
   my $TranslatedDSXObjectName = "";
   my $TranslatedCategoryName = "";
   my $OutputFileName = "";
   my $OutputLine = "";
   my $WriteLine = 1;

   $TranslatedDSXObjectName = $DSXObjectName;
   $TranslatedDSXObjectName =~ tr/:/_/;
   $TranslatedDSXObjectName =~ tr/ /_/;

   if ($ObjectType eq "JOB") {
      $OutPutDirectory = $OutPutDirectory . "/jobs";
      if ( ! -e $OutPutDirectory ) {
         if ( ! MakeDir($OutPutDirectory, 0777) ) {
            DieWith("Error: Could not create directory: $OutPutDirectory");
         }
      }
   }
   else  {
      $OutPutDirectory = $OutPutDirectory . "/routines";
      if ( ! -e $OutPutDirectory ) {
         if ( ! MakeDir($OutPutDirectory, 0777) ) {
            DieWith("Error: Could not create directory: $OutPutDirectory");
         }
      }
   }

   if ($DSXCategoryName) {
      $TranslatedCategoryName = $DSXCategoryName;
      $TranslatedCategoryName =~ tr/ /_/;

      $TranslatedCategoryName =~ tr/\\/\//s;

      $OutPutDirectory = $OutPutDirectory . "/" . $TranslatedCategoryName;
      if ( ! -e $OutPutDirectory ) {
         if ( ! MakeDir($OutPutDirectory, 0777) ) {
            DieWith("Error: Could not create directory: $OutPutDirectory");
         }
      }
   }
   $OutputFileName = $OutPutDirectory . "/" . $TranslatedDSXObjectName . ".dsx";

   print "Writing File: $OutputFileName...\n";

   if ( -e $OutputFileName ) {
      print "*** WARNING: Job/Routine output DSX file ($OutputFileName) already exists.  Creating duplicate.\n";
      $OutputFileName = MakeDuplicateName($OutputFileName);
   }

   if (open (fhOutputFile, ">$OutputFileName")) {
      WriteDSXHeader(\*fhOutputFile);
      if ($ObjectType eq "ROUTINE") {
         print fhOutputFile "BEGIN DSROUTINES\n";
      }
      while ( $$tmpDSXObjectHolder[$x] ) {
         $OutputLine = $$tmpDSXObjectHolder[$x];
         $WriteLine = 1;
         #-- Filter ValidationStatus metadata out
         #-- This metadata seems to be intermittent with no value added.
         if ($OutputLine =~ /^.*ValidationStatus /) {
            $WriteLine = 0;
         }
         if ($WriteLine) {
            #-- Normalize dates and times
            if ($OutputLine =~ /^ {3,6}DateModified /) {
               $OutputLine =~ s/\".{10}\"/\"$cStandardDate\"/;
            }
            else  {
               if ($OutputLine =~ /^ {3,6}TimeModified /) {
                  $OutputLine =~ s/\".{8}\"/\"$cStandardTime\"/;
               }
            }
            #-- Send the line to the output file
            print fhOutputFile "$OutputLine";
         }
         $x = $x + 1;
      }
      if ($ObjectType eq "ROUTINE") {
         print fhOutputFile "END DSROUTINES\n";
      }
      close fhOutputFile;
   }
   else  {
      DieWith("Error: Could not open ($OutputFileName) for writing");
   }
}

##############################################################################
sub WriteDSXSourceFile
{
   my ($ObjectType, $tmpDSXObjectSourceHolder, $OutPutDirectory, $DSXObjectName, $DSXCategoryName) = @_;
   my $x = 0;
   my $TranslatedDSXSourceName = "";
   my $TranslatedCategoryName = "";
   my $OutputFileName = "";
   my $OutputLine = "";

   $TranslatedDSXSourceName = $DSXObjectName;
   $TranslatedDSXSourceName =~ tr/:/_/;
   $TranslatedDSXSourceName =~ tr/ /_/;

   if ($ObjectType eq "JOB") {
      $OutPutDirectory = $OutPutDirectory . "/jobs";
      $SourceKeyword = "JobControlCode";
   }
   else  {
      $OutPutDirectory = $OutPutDirectory . "/routines";
      $SourceKeyword = "Source";
   }

   if ($DSXCategoryName) {
      $TranslatedCategoryName = $DSXCategoryName;
      $TranslatedCategoryName =~ tr/ /_/;
      $TranslatedCategoryName =~ tr/\\/\//s;

      $OutPutDirectory = $OutPutDirectory . "/" . $TranslatedCategoryName;
   }
   $OutputFileName = $OutPutDirectory . "/" . $TranslatedDSXSourceName . ".src";

   if ( -e $OutputFileName ) {
      $OutputFileName = MakeDuplicateName($OutputFileName);
   }

   #-- Convert single line encoded source code to properly formatted code
   #-- Chop off trailing 6 spaces after every CR-LF (really leading 6 spaces)

   #-- Convert "symbolic" CR-LF to real CR-LF
   $tmpDSXObjectSourceHolder =~ s/\\\(D\)\\\(A\)/\n/g;

   #-- Chop off leading keyword - either Source " or JobControlCode "
   $tmpDSXObjectSourceHolder =~ s/^ *$SourceKeyword "//;

   #-- Chop off trailing quote
   $tmpDSXObjectSourceHolder =~ s/\" *$//;

   #-- Replace all \" with "
   $tmpDSXObjectSourceHolder =~ s/\\\"/\"/g;

   #-- replace all \\ with \
   $tmpDSXObjectSourceHolder =~ s/\\\\/\\/g;

   if (open (fhOutputFile, ">$OutputFileName")) {
      print fhOutputFile $tmpDSXObjectSourceHolder;
      close fhOutputFile;
   }
   else  {
      DieWith("Error: Could not open ($OutputFileName) for writing");
   }
}

##############################################################################
sub OKToStripDefaultValue
{
   my ($DSXObjectName, $DSParameterName) = @_;
   my $RetVal = 0;

   if (! ($DSXObjectName =~ /Batch::UTIL/) ) {
      if (! ($DSXObjectName =~ /PROTOTYPE/) ) {
         if (! ($DSParameterName eq "JobName" or $DSParameterName eq "PartitionNumber" or $DSParameterName eq "PartitionCount" ) ) {
            $RetVal = 1;
         }
      }
   }
   return $RetVal;
}

##############################################################################
sub ParseDSXObjects
{
   my ($DSXFileName, $Greppize, $OutPutDirectory, $DSXObjectList) = @_;
   my @tmpDSXObjectHolder = ();
   my $tmpDSXObjectSourceHolder = "";
   my $DSXObjectName = "";
   my $DSXCategoryName = "";
   my $InDSJobBlock = 0;
   my $InDSRoutineBlock = 0;
   my $InDSRecordBlock = 0;
   my $InDSSubRecordBlock = 0;
   my $InDSUBinaryBlock = 0;
   my $DSParameterName = "";

   if (open fhDSXFileName, "<".$DSXFileName) {
      while (<fhDSXFileName>) {
         if ($InDSJobBlock) {
            push(@tmpDSXObjectHolder, $_);
            if ($_ =~ /^END DSJOB/) {
               $InDSJobBlock = 0;
               WriteDSXObjectFile("JOB", \@tmpDSXObjectHolder, $OutPutDirectory,
                                  $DSXObjectName, $DSXCategoryName);
               if ( $tmpDSXObjectSourceHolder ) {
                  WriteDSXSourceFile("JOB", $tmpDSXObjectSourceHolder, $OutPutDirectory,
                                     $DSXObjectName, $DSXCategoryName);
               }
            }
            else  {
               if ($InDSRecordBlock) {
                  if ($_ =~ /^   END DSRECORD/) {
                     $InDSRecordBlock = 0;
                  }
                  else {
                     if ($InDSSubRecordBlock) {
                        if ($_ =~ /^      END DSSUBRECORD/) {
                           $InDSSubRecordBlock = 0;
                        }
                        else {
                           if ($_ =~ /^         Name/) {
                              $DSParameterName = ParseQuotedString($_);
                           }
                           if ($_ =~ /^         Default/) {
                              if (OKToStripDefaultValue($DSXObjectName, $DSParameterName)) {
                                 pop(@tmpDSXObjectHolder);
                              }
                           }
                        }
                     }
                     else {
                        if ($_ =~ /^      BEGIN DSSUBRECORD/) {
                           $InDSSubRecordBlock = 1;
                        }
                        else  {
                           if ($_ =~ /^      Category /) {
                              $DSXCategoryName = ParseQuotedString($_);
                           }
                           if ($Greppize) {
                              if ($_ =~ /^      JobControlCode /) {
                                 $tmpDSXObjectSourceHolder = $_;
                              }
                           }
                        }
                     }
                  }
               }
               else  {
                  if ($_ =~ /^   BEGIN DSRECORD/) {
                     $InDSRecordBlock = 1;
                  }
                  else  {
                     if ($_ =~ /^   Identifier /) {
                        $DSXObjectName = ParseQuotedString($_);
                     }
                  }
               }
            }
         }
         else {
            if ($InDSRoutineBlock) {
               if ($_ =~ /^END DSROUTINES/) {
                  $InDSRoutineBlock = 0;
               }
               else  {
                  if ($InDSRecordBlock) {
                     push(@tmpDSXObjectHolder, $_);
                     if ($_ =~ /^   END DSRECORD/) {
                        $InDSRecordBlock = 0;
                     }
                     else  {
                        if ($_ =~ /^      Identifier /) {
                           $DSXObjectName = ParseQuotedString($_);
                        }
                        else  {
                           if ($_ =~ /^      Category /) {
                              $DSXCategoryName = ParseQuotedString($_);
                           }
                           if ($Greppize) {
                              if ($_ =~ /^      Source /) {
                                 $tmpDSXObjectSourceHolder = $_;
                              }
                           }
                        }
                     }
                  }
                  else  {
                     if ($InDSUBinaryBlock) {
                        push(@tmpDSXObjectHolder, $_);
                        if ($_ =~ /^   END DSUBINARY/) {
                           $InDSUBinaryBlock = 0;
                           WriteDSXObjectFile("ROUTINE", \@tmpDSXObjectHolder, $OutPutDirectory,
                                              $DSXObjectName, $DSXCategoryName);
                           if ( $tmpDSXObjectSourceHolder ) {
                              WriteDSXSourceFile("ROUTINE", $tmpDSXObjectSourceHolder, $OutPutDirectory,
                                                 $DSXObjectName, $DSXCategoryName);
                           }
                        }
                        else  {
                           if ($_ =~ /^      COMMENT Record is empty/) {
                              print "*** WARNING: Routine ($DSXObjectName) is missing compiled executable.\n";
                           }
                        }
                     }
                     else  {
                        if ($_ =~ /^   BEGIN DSRECORD/) {
                           $InDSRecordBlock = 1;
                           @tmpDSXObjectHolder = ();
                           push(@tmpDSXObjectHolder, $_);
                           $tmpDSXObjectSourceHolder = "";
                           $DSXCategoryName = "";
                        }
                        else  {
                           if ($_ =~ /^   BEGIN DSUBINARY/) {
                              $InDSUBinaryBlock = 1;
                              push(@tmpDSXObjectHolder, $_);
                           }
                        }
                     }
                  }
               }
            }
            else  {
               if ($_ =~ /^BEGIN DSJOB/) {
                  $InDSJobBlock = 1;
                  @tmpDSXObjectHolder = ();
                  push(@tmpDSXObjectHolder, $_);
                  $tmpDSXObjectSourceHolder = "";
                  $DSXCategoryName = "";
               }
               else  {
                  if ($_ =~ /^BEGIN DSROUTINES/) {
                     $InDSRoutineBlock = 1;
                  }
               }
            }
         }
      }
      close (fhDSXFileName);
   }
}

##############################################################################
# Main

#-- Global variables (constants)
$cStandardDate = "2001-01-01";
$cStandardTime = "01.00.00";
$cStandardServerName = "ServerName";
$cStandardToolInstanceID = "ToolInstanceID";

#-- Local variables
my %DSXObjectList = ();
my $NumArgs = 0;
my $DSXFileName = "";
my $OutPutDirectory = "";
my $Ans = "";

if (getopts('hl:o:svy')) {
   if ( $opt_h ) {
      ShowBlurb();
      exit 2;
   }
   if ( $opt_v ) {
      ShowVersion();
      exit 2;
   }
   $NumArgs = scalar(@ARGV);
   if ( $NumArgs == 1 ) {
      $DSXFileName = $ARGV[0];
      if ( -r $DSXFileName ) {
         if ( $opt_o ) {
            $OutPutDirectory = $opt_o;
         }
         else  {
            $OutPutDirectory = basename($DSXFileName, ".dsx");
         }
         if ( OKToOverWriteOutputDir($OutPutDirectory, $opt_y) ) {
            %DSXObjectList = LoadObjectList($opt_l);
            ParseDSXObjects($DSXFileName, $opt_s, $OutPutDirectory, \%DSXObjectList);
         }
      }
      else  {
         DieWith("Error: Unable to read file ($DSXFileName).");
      }
   }
   else  {
      DieWith("Error: Invalid filespec.");
   }
}
else  {
   DieWith("Error: Invalid options.");
}
CatDSX.pl

Code: Select all

#!/usr/bin/perl
##############################################################################
#
# Program:     CatDSX.pl
#
# Description: See ShowBlurb function below for details
#
# Notes:       @gDSXFileList() array format
#              Column0 - Fully Qualified source DSX file name
#              Column1 - Directory Name portion of Column0
#              Column2 - File Name portion of Column0
#              Column3 - Size in bytes of file pointed to by Column0
#              Column4 - Target CombinedDSXFile Number
#              Column5 - Job type (J-Job|R-Routine)
#
#              This script will create one or more target DSX files that
#              are constructed in the same manner that DataStage would have
#              created them.
#
# === Modification History ===================================================
# Date       Author           Comments
# ---------- --------------- -------------------------------------------------
# 03-27-2003 Steve Boyce     Created.
#
##############################################################################

use Cwd;
use Getopt::Std;
use File::Basename;
use File::Find;

##############################################################################
sub ShowBlurb
{
print <<ENDOFBLURB;
Syntax:      CatDSX.pl -r -s<n> -y -h <CombinedDSXFile> [SourceDSXDir]

Description: Combines individual DSX files in SourceDSXDir into one (or more)
             DSX file(s) suitable for importing into DataStage.

Parameters:  <CombinedDSXFile> Name of DataStage DSX file(s) to create.
             [SourceDSXDir]    Path where individual DSXFiles reside.
                               Optional.  Defaults to current directory.

Options:     -r    Recurse subdirectories
             -s<n> Number of CombinedDSXFiles to create (evenly spreads
                   individual DSXFiles across all CombinedDSXFiles by size).
                   Must be greater than 0.
                   Must be less than total number of jobs and routines found
                   in SourceDSXDir and less than 10.
             -y    Force a "Yes" answer to overwrite existing <CombinedDSXFile>
                   file prompt.
             -h    This help.

Notes:       It is assumed that each DSXFile only has one Job or Routine.
ENDOFBLURB
}

##############################################################################
sub Now
{
   my ($InFormat) = @_;
   my $RetVal = "";
   my ($Seconds, $Minutes, $Hours, $Day, $MonthNumber, $YearNumber, $WeekDayNumber, $DayOfYear, $IsDayLightSavings) = localtime(time);
   my $Year = $YearNumber + 1900;
   my $Month = sprintf("%02d", $MonthNumber + 1);
   $Day = sprintf("%02d", $Day);
   $Hours = sprintf("%02d", $Hours);
   $Minutes = sprintf("%02d", $Minutes);
   $Seconds = sprintf("%02d", $Seconds);

   if    ($InFormat eq "YYYYMMDD")          { $RetVal = "$Year$Month$Day"; }
   elsif ($InFormat eq "YYYY-MM-DD")        { $RetVal = "$Year-$Month-$Day"; }
   elsif ($InFormat eq "DDMMYYYY")          { $RetVal = "$Day$Month$Year"; }
   elsif ($InFormat eq "DD-MM-YYYY")        { $RetVal = "$Day-$Month-$Year"; }
   elsif ($InFormat eq "YYYYMMDD.HH24MISS") { $RetVal = "$Year$Month$Day.$Hours$Minutes$Seconds"; }
   else                                     { $RetVal = "$Year-$Month-$Day $Hours:$Minutes:$Seconds"; }
   return $RetVal;
}

##############################################################################
sub ErrorMessage
{
   my ($MessageLine) = @_;
   print Now()." $MessageLine\n";
   print Now()." Type CatDSX.pl -h for help.\n";
}

##############################################################################
sub ValidSplitOption
{
   my ($opt_s) = @_;
   my $RetVal = $cFalse;

   if ( defined $opt_s ) {
      #-- Option specified: must be a single digit from 1 to 9
      if ( $opt_s =~ /^[1-9]$/ ) {
         $RetVal = $cTrue;
         $gNumberOfCombinedDSXFiles = $opt_s;
      }
      else {
         print Now()." Error: NumberOfCombinedDSXFiles option (-s) must be a whole number from 1 to 9.\n";
      }
   }
   else {
      #-- Option not specified, assume 1
      $RetVal = $cTrue;
      $gNumberOfCombinedDSXFiles = 1;
   }
   return $RetVal;
}

##############################################################################
sub BuildControlList
{
   my $RetVal = $cTrue;

   sub wanted
   {
      my $DirectoryName;
      my $FileName;
      my $FileSize;
      $File::Find::prune = !$gRecursive;
      if ( -f $File::Find::name ) {
         $DirectoryName = dirname($File::Find::name);
         $FileName = basename($File::Find::name);
         $FileSize = -s $File::Find::name;
         push(@gDSXFileList, [$File::Find::name, $DirectoryName, $FileName, $FileSize, 1, "X"]);
      }
   }

   #-- Can't determine if find returns anything useful
   find(\&wanted, $gSourceDSXDir);

   $RetVal = $#gDSXFileList + 1;

   #-- Return number of files found
   return $RetVal;
}

##############################################################################
sub AssignJobType
{
   my ($NumberOfFiles) = @_;
   my $RetVal = $cTrue;
   my $x = 0;
   my $JobCounter = 0;
   my $RoutineCounter = 0;

   #-- Spin through list of DSX files
   for ($x = 0; $x < $NumberOfFiles; $x++) {
      $JobCounter = 0;
      $RoutineCounter = 0;
      if ( open fhDSXFile, "<".$gDSXFileList[$x][0] ) {
         while (<fhDSXFile>) {
            chomp;
            if ($_ =~ /^BEGIN DSJOB/) {
               $JobCounter++;
            }
            if ($_ =~ /^BEGIN DSROUTINES/) {
               $RoutineCounter++;
            }
         }
         close fhDSXFile;
         #-- Update DSX file array
         if ( ($JobCounter + $RoutineCounter) == 1 ) {
            #-- This DSX file has only one job or routine
            if ( $JobCounter == 1 ) {
               $gDSXFileList[$x][5] = "J";
            }
            else {
               $gDSXFileList[$x][5] = "R";
            }
         }
         else {
            print Now()." Error: $gDSXFileList[$x][0] has $JobCounter jobs and $RoutineCounter routines.\n";
            $RetVal = $cFalse;
         }
      }
      else {
         print Now()." Error: Can't open $gDSXFileList[$x][0]\n";
         $RetVal = $cFalse;
         last;
      }
   }
   return $RetVal;
}

##############################################################################
sub SortBySize
{
   #-- Bubble sort by size
   my ($NumberOfFiles) = @_;

   my $x = 0;
   my $y = 0;
   my $FQName = "";
   my $DirName = "";
   my $FileName = "";
   my $FileSize = "";
   my $FileNumber = "";
   my $JobType = "";

   for ($x = 0; $x < $NumberOfFiles - 1; $x++) {
      for ($y = $x+1; $y <= $NumberOfFiles - 1 ; $y++) {
         if ( $gDSXFileList[$y][3] > $gDSXFileList[$x][3] ) {
            #-- Swap rows
            $FQName     = $gDSXFileList[$x][0];
            $DirName    = $gDSXFileList[$x][1];
            $FileName   = $gDSXFileList[$x][2];
            $FileSize   = $gDSXFileList[$x][3];
            $FileNumber = $gDSXFileList[$x][4];
            $JobType    = $gDSXFileList[$x][5];

            $gDSXFileList[$x][0] = $gDSXFileList[$y][0];
            $gDSXFileList[$x][1] = $gDSXFileList[$y][1];
            $gDSXFileList[$x][2] = $gDSXFileList[$y][2];
            $gDSXFileList[$x][3] = $gDSXFileList[$y][3];
            $gDSXFileList[$x][4] = $gDSXFileList[$y][4];
            $gDSXFileList[$x][5] = $gDSXFileList[$y][5];

            $gDSXFileList[$y][0] = $FQName;
            $gDSXFileList[$y][1] = $DirName;
            $gDSXFileList[$y][2] = $FileName;
            $gDSXFileList[$y][3] = $FileSize;
            $gDSXFileList[$y][4] = $FileNumber;
            $gDSXFileList[$y][5] = $JobType;
         }
      }
   }
}

##############################################################################
sub AssignTargetDSXFiles
{
   my $RetVal = $cFalse;
   my ($NumberOfFiles) = @_;
   my $TargetFileNumber = 1;
   my $x = 0;

   if ( $NumberOfFiles >= $gNumberOfCombinedDSXFiles ) {
      $RetVal = $cTrue;
      if ( $gNumberOfCombinedDSXFiles > 1 ) {
         for ($x = 0; $x < $NumberOfFiles; $x++) {
            $gDSXFileList[$x][4] = $TargetFileNumber;
            $TargetFileNumber++;
            if ( $TargetFileNumber > $gNumberOfCombinedDSXFiles ) {
               $TargetFileNumber = 1;
            }
         }
      }
   }
   else {
      print Now()." Error: There are fewer DSX Files to process than the NumberOfCombinedDSXFiles option (-s).\n";
   }
   return $RetVal;
}

##############################################################################
sub SortByTargetFile
{
   #-- Sort by TargetFile, JobType, Name (simple exchange sort)
   my ($NumberOfFiles) = @_;

   my $x = 0;
   my $y = 0;

   for ($x = 0; $x < $NumberOfFiles - 1; $x++) {
      for ($y = $x+1; $y <= $NumberOfFiles - 1 ; $y++) {
         if ( ($gDSXFileList[$y][4].$gDSXFileList[$y][5].$gDSXFileList[$y][2]) lt ($gDSXFileList[$x][4].$gDSXFileList[$x][5].$gDSXFileList[$x][2]) ) {
            #-- Swap rows (each row is an array reference, so swap the references)
            @gDSXFileList[$x, $y] = @gDSXFileList[$y, $x];
         }
      }
   }
}

##############################################################################
sub OKToOverWriteOutputFile
{
   my ($FileName, $DirectoryName, $SuffixName, $opt_y) = @_;
   my $RetVal = $cFalse;
   my $FirstOutputFile = "";

   if ( $gNumberOfCombinedDSXFiles > 1 ) {
      $FirstOutputFile = $DirectoryName.$FileName."-Part1".$SuffixName;
   }
   else {
      $FirstOutputFile = $DirectoryName.$FileName.$SuffixName;
   }

   if ( -e $FirstOutputFile ) {
      if ( $opt_y ) {
         print Now()." *** Warning: $FirstOutputFile file already exists.  Overwriting anyway.\n";
         $RetVal = $cTrue;
      }
      else {
         print Now()." *** Warning: $FirstOutputFile file already exists.\n";
         print "Proceed anyway? [y|n] ";
         my $Ans = <STDIN>;
         $Ans = "" unless defined $Ans;   #-- Treat end of input as "no"
         chomp($Ans);
         if ( lc($Ans) eq "y" ) {
            $RetVal = $cTrue;
         }
         else {
            print Now()." Aborting.\n";
         }
      }
   }
   else  {
      $RetVal = $cTrue;
   }
   return $RetVal;
}

##############################################################################
sub WriteDSXHeader
{
   my ($fhOutputFile) = @_;

   print $fhOutputFile "BEGIN HEADER\n";
   print $fhOutputFile "   CharacterSet \"ENGLISH\"\n";
   print $fhOutputFile "   ExportingTool \"Ardent DataStage Export\"\n";
   print $fhOutputFile "   ToolVersion \"3\"\n";
   print $fhOutputFile "   ServerName \"$cStandardServerName\"\n";
   print $fhOutputFile "   ToolInstanceID \"$cStandardToolInstanceID\"\n";
   print $fhOutputFile "   MDISVersion \"1.0\"\n";
   print $fhOutputFile "   Date \"$cStandardDate\"\n";
   print $fhOutputFile "   Time \"$cStandardTime\"\n";
   print $fhOutputFile "END HEADER\n";
}

##############################################################################
sub CreateOutputFile
{
   my ($OutputFile, $TargetFileNumber, $NumberOfFiles) = @_;
   my $RetVal = $cTrue;
   my $x = 0;
   my $IsPastHeader = $cFalse;
   my $IsPastRoutineHeader = $cFalse;
   my $LastJobType = "X";

   #-- Open output file (three-argument open with a lexical filehandle)
   if ( open(my $fhOutputFile, '>', $OutputFile) ) {
      WriteDSXHeader($fhOutputFile);

      #-- Spin through DSX File array
      for ($x = 0; $x < $NumberOfFiles; $x++) {
         #-- See if this DSX File in array is targeted for this output file
         if ( $gDSXFileList[$x][4] == $TargetFileNumber ) {
            #-- This file is targeted to this output file
            #-- See if we are processing the first routine in this output file set
            if ( $gDSXFileList[$x][5] eq "R" && $LastJobType ne "R" ) {
               #-- Must be the first routine
               #-- Write out Routine header
               print $fhOutputFile "BEGIN DSROUTINES\n";
            }
            #-- Open DSX input file
            $IsPastHeader = $cFalse;
            $IsPastRoutineHeader = $cFalse;
            if ( open(my $fhDSXFile, '<', $gDSXFileList[$x][0]) ) {
               #-- Spin through source DSX file
               while (<$fhDSXFile>) {
                  if ( $IsPastHeader ) {
                     #-- Filter out routine headers and footers from source DSX file
                     if ( !(($_ =~ /^BEGIN DSROUTINES/) || ($_ =~ /^END DSROUTINES/)) ) {
                        #-- Not a routine header or footer
                        print $fhOutputFile $_;
                     }
                  }
                  else {
                     #-- Spin past header
                     if ( $_ =~ /^END HEADER/ ) {
                        $IsPastHeader = $cTrue;
                     }
                  }
               }
               close $fhDSXFile;
            }
            else {
               print Now()." Error: Cannot open $gDSXFileList[$x][0]: $!\n";
               $RetVal = $cFalse;
            }
            $LastJobType = $gDSXFileList[$x][5];
         }
      }
      #-- See if the last source DSX file was a routine
      if ( $LastJobType eq "R" ) {
         #-- Must be a routine
         #-- Write out Routine footer
         print $fhOutputFile "END DSROUTINES\n";
      }
      close $fhOutputFile;
   }
   else {
      print Now()." Error: Cannot create $OutputFile: $!\n";
      $RetVal = $cFalse;
   }
   return $RetVal;
}

##############################################################################
sub OutputFileProcess
{
   my ($NumberOfFiles, $opt_y) = @_;
   my $RetVal = $cFalse;
   my ($FileName, $DirectoryName, $SuffixName) = fileparse($gCombinedDSXFile, '\.dsx');
   my $x = 1;
   my $OutputFile = "";

   #-- See if output directory exists
   if ( -d $DirectoryName ) {
      #-- Output directory exists
      if ( OKToOverWriteOutputFile($FileName, $DirectoryName, $SuffixName, $opt_y) ) {
         #-- Write output file(s)
         #-- Spin through output files
         for ($x = 1; $x <= $gNumberOfCombinedDSXFiles; $x++) {
            if ( $gNumberOfCombinedDSXFiles == 1 ) {
               #-- Create one big file
               $OutputFile = $DirectoryName.$FileName.$SuffixName;
            }
            else {
               #-- Split output across multiple files
               $OutputFile = $DirectoryName.$FileName."-Part".$x.$SuffixName;
            }
            print Now()." Creating: $OutputFile...\n";
            if ( CreateOutputFile($OutputFile, $x, $NumberOfFiles) ) {
               $RetVal = $cTrue;
            }
            else {
               last;
            }
         }
      }
   }
   else {
      print Now()." Error: $DirectoryName does not exist.\n";
   }
   return $RetVal;
}

##############################################################################
sub MainProcess
{
   my ($opt_y) = @_;
   my $RetVal = $cFalse;
   my $NumberOfFiles = 0;

   #-- Create Control Array
   print Now()." Gathering list of DSX files...\n";
   $NumberOfFiles = BuildControlList();
   if ( $NumberOfFiles > 0 ) {
      #-- Found some files to process

      #-- Spin through list and determine JobType
      print Now()." Determining job types...\n";
      if ( AssignJobType($NumberOfFiles) ) {

         #-- Sort Array by size only
         print Now()." Sorting DSX file list by Size...\n";
         SortBySize($NumberOfFiles);

         #-- Assign target DSXFiles
         print Now()." Assigning jobs to target output files...\n";
         if ( AssignTargetDSXFiles($NumberOfFiles) ) {

            #-- Sort Array by TargetFile, JobType, Name
            print Now()." Sorting DSX file list by TargetFile, JobType, Name...\n";
            SortByTargetFile($NumberOfFiles);

            #-- Control the process of creating the output files
            if ( OutputFileProcess($NumberOfFiles, $opt_y) ) {
               $RetVal = $cTrue;
            }
            else {
               ErrorMessage("Aborting: Cannot create Output files.");
            }
         }
         else {
            ErrorMessage("Aborting: Cannot properly assign target output files.");
         }
      }
      else {
         ErrorMessage("Aborting: One or more DSX files is invalid.");
      }
   }
   else {
      ErrorMessage("Aborting: No files to process in $gSourceDSXDir.");
   }
   return $RetVal;
}

##############################################################################
#-- Main

#-- Global Constants
$cTrue = 1;
$cFalse = 0;
$cOSSuccess = 0;
$cOSFailure = 1;

$cStandardDate = "2001-01-01";
$cStandardTime = "01.00.00";
$cStandardServerName = "ServerName";
$cStandardToolInstanceID = "ToolInstanceID";

#-- Global variables
$gCombinedDSXFile = "";
$gSourceDSXDir = "";
$gNumberOfCombinedDSXFiles = 1;
$gRecursive = $cFalse;

@gDSXFileList = ();

#-- Local variables
my $NumArgs = 0;
my $OSRetVal = $cOSSuccess;

print Now()." Initialization...\n";
if ( getopts('rs:yh') ) {
   if ( ! $opt_h ) {
      $NumArgs = scalar(@ARGV);
      if ( $NumArgs == 1 || $NumArgs == 2 ) {
         $gCombinedDSXFile = $ARGV[0];
         if ( $NumArgs == 2 ) {
            $gSourceDSXDir = $ARGV[1];
         }
         else {
            $gSourceDSXDir = cwd();
         }
         if ( $opt_r ) {
            $gRecursive = $cTrue;
            print Now()." Recursively combining all DSX files found in $gSourceDSXDir...\n";
         }
         else {
            print Now()." Combining all DSX files found in $gSourceDSXDir...\n";
         }
         if ( ValidSplitOption($opt_s) ) {
            if ( $gNumberOfCombinedDSXFiles > 1 ) {
               print Now()." Splitting SourceDSXFiles across $gNumberOfCombinedDSXFiles CombinedDSXFiles...\n";
            }
            #-- All input gathered
            #-- Do main processing
            if ( MainProcess($opt_y) ) {
               print Now()." Complete.\n";

            }
            else {
               $OSRetVal = $cOSFailure;
            }
         }
         else {
            $OSRetVal = $cOSFailure;
            ErrorMessage("Aborting: Invalid split (-s) option.");
         }
      }
      else {
         $OSRetVal = $cOSFailure;
         ErrorMessage("Aborting: Missing CombinedDSXFile argument or too many arguments.");
      }
   }
   else {
      $OSRetVal = $cOSFailure;
      ShowBlurb();
   }
}
else {
   $OSRetVal = $cOSFailure;
   ErrorMessage("Aborting: Invalid options.");
}
exit $OSRetVal;

Enjoy,
-Steve
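
For anyone reading along: the two steps in MainProcess that spread the jobs across the output files (SortBySize followed by AssignTargetDSXFiles) boil down to "largest first, then deal them out round-robin". Below is a minimal stand-alone sketch of that idea. AssignRoundRobin and the sample names and sizes are made up for illustration and are not part of CatDSX, and the sketch uses Perl's built-in sort rather than the hand-written loops in the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of CatDSX): takes the number of target
# files and a list of [name, size] pairs, and returns a hash mapping each
# name to a 1-based target file number.
sub AssignRoundRobin {
   my ($NumberOfTargets, @Files) = @_;
   my %Target;
   my $TargetFileNumber = 1;

   # Largest first; ties are broken by name so the result is repeatable
   for my $File ( sort { $b->[1] <=> $a->[1] || $a->[0] cmp $b->[0] } @Files ) {
      $Target{ $File->[0] } = $TargetFileNumber;
      # Move on to the next target file, wrapping back to the first
      $TargetFileNumber = 1 if ++$TargetFileNumber > $NumberOfTargets;
   }
   return %Target;
}

# Made-up file names and sizes, purely for illustration
my @Sample = ( ["JobA.dsx", 900], ["JobB.dsx", 400], ["JobC.dsx", 700],
               ["JobD.dsx", 100], ["JobE.dsx", 300] );
my %Assigned = AssignRoundRobin(2, @Sample);
for my $Name ( sort keys %Assigned ) {
   print "$Name -> Part$Assigned{$Name}\n";
}
```

Dealing the files out in size order keeps the combined files roughly the same size without any real bin-packing, which is probably good enough for import purposes.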
victorbos
Participant
Posts: 24
Joined: Tue Jul 15, 2003 2:05 am
Contact:

Post by victorbos »

Thanks a lot Steve, you are a star.
You've earned eternal glory in the Netherlands :D

Victor.
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Post by Teej »

Just want to give this thread a little bump since this is helping me quite a bit. :)

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Steve is the man; check out his picture at http://www.kennethbland.com. His Bag of Tricks is voluminous, and his DSX cutter is top notch. He has a complete PVCS integration suite set up. He has point-and-click control, the weakest link in the chain being the command line import/export with DataStage, because there is no ability to export individual job objects by name. That manual effort is tiresome, so he coded up a process that does a full export, explodes the DSX, then picks out the objects he wants. Too cool. 8)
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Post by Teej »

Really? That would be awesome, since I am exploring a way to automatically pull selected jobs from the repository, group them up, throw them on the designated server, and compile them. Automatic migration without human intervention, increasing the accuracy rates.

Of course, I found a little flaw in the script -- it strips the parameters' defaults. Now, if you have a PX job for 6.0.1 or higher, and you happen to define $APT_CONFIG_FILE... whoops, the job won't compile without a default.

Also, there are a number of times where we make a set of jobs with parameters for files and tables that we do not pull from the command line (rather defining them within the Sequencer, and sometimes not even then). Preserving the default values allows for easier configuration.

Ah well, different strokes for different folks.

Now I'm trying to figure out how to automatically compile a job on the Client side (without going inside DataStage BASIC -- migrating this to all 100+ projects (and counting) within the entire corporation would be... painful. :) Not to mention the inevitable bug fixes...)

Anyone know how?

Heck, is that PVCS suite point/click for sale/share? I'll have to see if it can utilize CVS...

-T.J.

P.S. VERY slick website you got there, Ken! Great job! Was half-expecting a picture of your baby on the banner though... ;-)
Developer of DataStage Parallel Engine (Orchestrate).
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

Teej wrote: Now I'm trying to figure out how to automatically compile a job on the Client side (without going inside DataStage BASIC -- migrating this to all 100+ projects (and counting) within the entire corporation would be... painful. :) Not to mention the inevitable bug fixes...)

Anyone know how?
Install CompileAllPlus or use version control and you're good to go in v6.x; in v7.x, mass compile should be basic functionality in DataStage. This is about as automatic as you can get on the client side for now :wink:

Ogmios
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Post by Teej »

Problem: We have more than just DataStage to archive. We prefer to include everything within a single package. We also love using categories to distinguish jobs.

The script on this one really does the first (and last) step of what we wanted. CompileAllPlus is nice, but I still have to open that thing, find the jobs that were imported, and then compile them. Not fun, not reliable, and definitely not something to trust a production analyst with, requiring another person, and adding yet more dollars to the bottom line.

That's the problem I am facing right now -- finding a low (one-time) cost solution that would minimize the cost (and risks) of migration and archiving.

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

TJ

Several BASIC compile programs have been posted. Why not add that to Perl or VC?
Mamu Kim
dickspaans
Participant
Posts: 5
Joined: Mon Oct 11, 2004 6:39 am

Post by dickspaans »

This dsx-cutter is great. I can use it for version control. Now I'm heading to the second step.
I want to export a given project. However, the dsjob -lprojects command runs on the server side and the dscmdexport command runs on the client side. Does anybody have a tool or (Unix) script that can export a given existing project?

thanks, Dick Spaans
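
One possible way to bridge the two sides is to save the project listing to a file on the server, carry it to the client, and generate one export command per project from it. The Perl sketch below only builds and prints the commands (a dry run). BuildExportCommands and all sample values are hypothetical, the one-project-per-line listing format is an assumption about what dsjob -lprojects prints, and the dscmdexport argument layout (/H=, /U=, /P=, project, output file) is an assumption as well, so check both against the documentation for your release before running anything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turns a project listing (assumed to be one project
# name per line) into one export command per project. Each command is
# returned as an array reference of arguments. The dscmdexport argument
# layout used here is an assumption, not taken from this thread.
sub BuildExportCommands {
   my ($Listing, $Host, $User, $Password, $OutputDir) = @_;
   my @Commands;

   for my $Line ( split /\r?\n/, $Listing ) {
      $Line =~ s/^\s+|\s+$//g;      # trim surrounding whitespace
      next if $Line eq "";          # skip blank lines
      push @Commands, [ "dscmdexport", "/H=$Host", "/U=$User", "/P=$Password",
                        $Line, "$OutputDir/$Line.dsx" ];
   }
   return @Commands;
}

# Dry run with made-up values: print the commands instead of running them
my $SampleListing = "ProjectOne\nProjectTwo\n\n";
for my $Command ( BuildExportCommands($SampleListing, "myhost", "myuser", "secret", "C:/backup") ) {
   print join(" ", @$Command), "\n";
}
```

Printing the commands first makes it easy to eyeball them; once they look right, the print could be swapped for a system(@$Command) call on the client.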
tonystark622
Premium Member
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Similarly, I run DataStage on Unix, but need a way to get a list of projects on the client. I currently have a .BAT file that backs up a project. What I need is a way to get the list of projects, so that I can back up each one... I appreciate your help.

Tony
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Tony, there's a batch file that's posted at ADN that will back up all projects on a server. I'm using it and it is working great for me. In combination with command line WinZip, I can keep all of the exports from each night zipped up and taking up much less space than they would otherwise.

It's called DataStageBackup.zip and it is here on ADN.
-craig

"You can never have too many knives" -- Logan Nine Fingers
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Thanks, Craig! I had forgotten about that. I just downloaded it and I'll look at it in a bit.

Tony