I need a DSX-Cutter
Hi all,
As said in this thread: viewtopic.php?t=85488
...what I need is a DSX cutter: a little program to cut a DSX file into small pieces, one file for each object.
A program in Perl would be great because we can integrate it with our version control tool.
tia,
Victor.
Re: I need a DSX-Cutter
Why don't you export the DSX in XML format, and then use the XML reader/writer to reorganise it?
Re: I need a DSX-Cutter
I created two Perl scripts a while ago to do just this. They're great for source code control, since you typically want to do change control at the job/routine level.
ParseDSX.pl will split a DSX file into separate DSX files, one per job/routine in the same folder structure that they appear in DataStage.
CatDSX goes the other way. It combines multiple DSX files into one or more equally distributed DSX files suitable for migration importing.
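To illustrate what "equally distributed" means here, this is a minimal sketch of the approach CatDSX.pl takes: sort the files by size, largest first, then deal them round-robin across the target files. The helper name `assign_parts` and the sample file names and sizes are made up for the example. The balance is only approximate, since sizes are never summed per part.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: deal files across $parts buckets, largest first.
# @files is a list of [name, size_in_bytes] pairs (made-up sample data below).
sub assign_parts {
    my ($parts, @files) = @_;
    my @sorted = sort { $b->[1] <=> $a->[1] } @files;   # biggest first
    my %part_of;
    my $next = 1;
    for my $file (@sorted) {
        $part_of{ $file->[0] } = $next;
        $next = $next % $parts + 1;                     # 1, 2, ..., $parts, 1, ...
    }
    return %part_of;
}

my %part_of = assign_parts(2,
    [ 'JobA.dsx',     900 ],
    [ 'JobB.dsx',     500 ],
    [ 'JobC.dsx',     400 ],
    [ 'RoutineX.dsx', 100 ],
);
print "$_ -> Part$part_of{$_}\n" for sort keys %part_of;
```

With two parts, the largest and third-largest files land in Part1 and the other two in Part2.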
I was going to spend some time documenting these before posting them because the ParseDSX script does a few more things that you may or may not like.
A few things you need to know about ParseDSX:
I wanted to be able to compare checked-in versions of DSX files (jobs or routines) with freshly exported versions from production projects. This ability is crucial for auditing your source code repository: it confirms that your migration procedures are working and that "freelance" edits are not happening in production. Some data elements in the DSX are generated at export time, such as the export date and time and the job's last-edit date and time. ParseDSX masks these out with generic values, which lets you do file compares without getting a lot of "false positive" changes.
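The masking idea can be sketched in a few lines. This is a simplified illustration, not the code from ParseDSX.pl: the helper name `mask_export_stamps` and the sample lines are hypothetical, and the patterns are looser than the ones in the script (which also rewrites the whole header rather than patching it). The placeholder date and time are the ones the script uses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder values, the same ones ParseDSX.pl writes.
my ($std_date, $std_time) = ('2001-01-01', '01.00.00');

# Hypothetical helper: replace the quoted value on date/time lines,
# leave every other line untouched.
sub mask_export_stamps {
    my ($line) = @_;
    if ($line =~ /^\s*(?:Date|DateModified)\s+"/) {
        $line =~ s/"[^"]*"/"$std_date"/;
    }
    elsif ($line =~ /^\s*(?:Time|TimeModified)\s+"/) {
        $line =~ s/"[^"]*"/"$std_time"/;
    }
    return $line;
}

# Made-up sample lines in the shape of a DSX export.
my @sample = (
    qq{   DateModified "2003-10-14"\n},
    qq{   TimeModified "16.42.07"\n},
    qq{   Identifier "LoadCustomers"\n},
);
print mask_export_stamps($_) for @sample;
```

Two exports of an unchanged job then differ in nothing, so a plain diff against the checked-in copy only reports real edits.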
Also, default parameter values are stripped out of all jobs except those with "PROTOTYPE" (or "Batch::UTIL") in the name. This is because these scripts were written to dovetail with Ken Bland's job control.
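As a compact restatement of that rule, here is a sketch based on the OKToStripDefaultValue function in the script. The helper name `should_strip_default` and the sample job and parameter names are hypothetical. Note that the script checks whether the job name contains PROTOTYPE anywhere, and it always keeps the defaults of three control parameters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical restatement of the rule: keep defaults for PROTOTYPE and
# Batch::UTIL jobs, and for three control parameters; strip everything else.
sub should_strip_default {
    my ($job_name, $param_name) = @_;
    return 0 if $job_name =~ /PROTOTYPE|Batch::UTIL/;
    return 0 if $param_name =~ /^(?:JobName|PartitionNumber|PartitionCount)$/;
    return 1;
}

# Made-up job and parameter names.
for my $case ( [ 'PROTOTYPELoadCustomers', 'SourceDir' ],
               [ 'LoadCustomers',          'SourceDir' ],
               [ 'LoadCustomers',          'JobName'   ] ) {
    my ($job, $param) = @$case;
    printf "%-24s %-10s %s\n", $job, $param,
        should_strip_default($job, $param) ? 'strip' : 'keep';
}
```

If your job control does not follow this convention, this is the function to change before using the script.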
I really need to do a more thorough job of documenting these and will at some point. But for now, I would suggest reviewing the code and making modifications where you like.
These were developed against DataStage 5.1 DSX export files.
Here they are:
ParseDSX.pl
CatDSX.pl
Enjoy,
-Steve
ParseDSX.pl
#!/usr/bin/perl
##############################################################################
#
# Program: ParseDSX.pl
#
# Description: See ShowBlurb function below for details
#
# === Modification History ===================================================
# Date Author Comments
# ---------- --------------- -------------------------------------------------
# 07-18-2002 Steve Boyce Created.
# 08-21-2002 Steve Boyce Exporting routines now includes binary info.
# 08-28-2002 Steve Boyce Corrected bug relating to jobs and routines
# located in the root folder. They now get
# created in the correct location.
# 08-28-2002 Steve Boyce Changed default output directory to be the name
# of the dsx file being parsed without the
# extension.
# -s option now works.
# 10-04-2002 Steve Boyce Eliminated -c option. That is now the default
# and only behavior.
# Default Parameter metadata is now stripped out
# of all jobs except any jobs that have PROTOTYPE
# or Batch::UTIL in the name.
# Routines are unaffected.
# 11-12-2002 Steve Boyce Added Version display option.
# Corrected source code generation bug.
# 02-27-2003 Steve Boyce Added -x option to strip out ValidationStatus
# 03-21-2003 Steve Boyce Stripping out ValidationStatus is now default
# behavior.
# 04-29-2003 Steve Boyce Bumped version number
#
##############################################################################
use Getopt::Std;
use File::Basename;
my $version="2.1.00";
##############################################################################
sub ShowBlurb
{
print <<ENDOFBLURB;
Syntax: ParseDSX.pl -h -l<ListFile> -o<OutputDir> -s -v -y <DSXFile>
Version: $version
Description: Extracts individual jobs and routines from a DataStage export
file.
Parameters: <DSXFile> Name of DataStage DSX file to parse. This file is
assumed to be generated from the DataStage export
process.
Options: -l job/routine list file. (future enhancement)
This file contains a list of jobs and routines to extract
from the <DSXFile>.
-o Explicitly specify <OutputDir> directory.
Default is the name of the parsed dsx file without the
extension in the current directory.
-s Extract job "Job Control Code" and routine "source"
code into "source files".
<job>.src and <routine>.src
These will appear in the same directory as the generated
dsx files in the <OutputDir> directory.
-v Display version information.
-y Force a "Yes" answer to overwrite existing <OutputDir>
directory prompt.
-h This help.
Notes: Job and routine names are case sensitive in DataStage. Extracted
jobs and routines are placed in file names constructed based on
job or routine names. Running this utility on the Windows
platform will ignore case and possibly consider some jobs and
routines duplicates when the UNIX platform will not.
It is a good practice to not rely on case as a differentiator
for file names.
ENDOFBLURB
}
##############################################################################
sub ShowVersion
{
print <<ENDOFBLURB;
ParseDSX.pl Version $version
ENDOFBLURB
}
##############################################################################
sub DieWith
{
my ($MessageLine) = @_;
print "$MessageLine\nType ParseDSX.pl -h for help.\n";
exit 1;
}
##############################################################################
sub OKToOverWriteOutputDir
{
my ($OutPutDirectory, $opt_y) = @_;
my $RetVal = 0;
if ( -e $OutPutDirectory ) {
if ( $opt_y ) {
print "*** Warning: <OutputDir> directory ($OutPutDirectory) already exists. Using anyway.\n";
$RetVal = 1;
}
else {
print "*** Warning: <OutputDir> directory ($OutPutDirectory) already exists.\n";
print "Proceed anyway? [y|n] ";
$Ans = <STDIN>;
chomp($Ans) if ($Ans);
if ( "$Ans" eq "Y" || "$Ans" eq "y" ) {
$RetVal = 1;
}
else {
DieWith("Aborting.");
}
}
}
else {
if ( MakeDir($OutPutDirectory, 0777) ) {   #-- mode must be octal
$RetVal = 1;
}
else {
DieWith("Error: Could not create ($OutPutDirectory) directory");
}
}
return $RetVal;
}
##############################################################################
sub LoadObjectList
{
my ($DSXListFile) = @_;
my %DSXObjectList = ();
if ( $DSXListFile ) {
if (open fhDSXListFile, "<".$DSXListFile) {
while (<fhDSXListFile>) {
chomp;
#-- Push line onto array
$DSXObjectList{$_} = 1;
}
close fhDSXListFile;
}
else {
DieWith("Error: Can't open $DSXListFile");
}
while ( ($key,$value) = each %DSXObjectList ) {
print "$key=$value\n";
}
}
return %DSXObjectList;
}
##############################################################################
sub MakeDir
{
my ($FullDirPath, $Mode) = @_;
my @DirList = ();
my $PartialDirPath = "";
my $RetVal = 1;
$FullDirPath =~ tr/\\/\//;
@DirList = split(/\//,$FullDirPath);
foreach $Directory ( @DirList ) {
$PartialDirPath = $PartialDirPath . $Directory. "/" ;
if ( ! (length($PartialDirPath) == 3 && substr($PartialDirPath, 1, 2) eq ":/") ) {
if ( ! -e $PartialDirPath ) {
if ( ! mkdir($PartialDirPath, $Mode) ) {
$RetVal = 0;
}
}
}
}
return $RetVal;
}
##############################################################################
sub ParseQuotedString
{
my ($InputLine) = @_;
my $FirstQuotePos = 0;
my $SecondQuotePos = 0;
my $Length = 0;
$FirstQuotePos = index($InputLine, '"');
$SecondQuotePos = index($InputLine, '"', $FirstQuotePos+1);
$Length = $SecondQuotePos - $FirstQuotePos;
return substr($InputLine, $FirstQuotePos + 1, $Length - 1);
}
##############################################################################
sub MakeDuplicateName
{
my ($OriginalName) = @_;
my $NewName = "";
my $DupSuffix = 1;
$NewName = $OriginalName . "_dup" . "$DupSuffix";
while ( -e $NewName ) {
if ( $DupSuffix > 99 ) {
DieWith("Error: There seems to be more than 99 duplicate jobs or routines.\n");
}
$DupSuffix += 1;
$NewName = $OriginalName . "_dup" . "$DupSuffix";
}
return $NewName;
}
##############################################################################
sub WriteDSXHeader
{
my ($fhOutputFile) = @_;
print $fhOutputFile "BEGIN HEADER\n";
print $fhOutputFile " CharacterSet \"ENGLISH\"\n";
print $fhOutputFile " ExportingTool \"Ardent DataStage Export\"\n";
print $fhOutputFile " ToolVersion \"3\"\n";
print $fhOutputFile " ServerName \"$cStandardServerName\"\n";
print $fhOutputFile " ToolInstanceID \"$cStandardToolInstanceID\"\n";
print $fhOutputFile " MDISVersion \"1.0\"\n";
print $fhOutputFile " Date \"$cStandardDate\"\n";
print $fhOutputFile " Time \"$cStandardTime\"\n";
print $fhOutputFile "END HEADER\n";
}
##############################################################################
sub WriteDSXObjectFile
{
my ($ObjectType, $tmpDSXObjectHolder, $OutPutDirectory, $DSXObjectName, $DSXCategoryName) = @_;
my $x = 0;
my $TranslatedDSXObjectName = "";
my $TranslatedCategoryName = "";
my $OutputFileName = "";
my $OutputLine = "";
my $WriteLine = 1;
$TranslatedDSXObjectName = $DSXObjectName;
$TranslatedDSXObjectName =~ tr/:/_/;
$TranslatedDSXObjectName =~ tr/ /_/;
if ($ObjectType eq "JOB") {
$OutPutDirectory = $OutPutDirectory . "/jobs";
if ( ! -e $OutPutDirectory ) {
if ( ! MakeDir($OutPutDirectory, 0777) ) {   #-- mode must be octal
DieWith("Error: Could not create directory: $OutPutDirectory");
}
}
}
else {
$OutPutDirectory = $OutPutDirectory . "/routines";
if ( ! -e $OutPutDirectory ) {
if ( ! MakeDir($OutPutDirectory, 0777) ) {   #-- mode must be octal
DieWith("Error: Could not create directory: $OutPutDirectory");
}
}
}
if ($DSXCategoryName) {
$TranslatedCategoryName = $DSXCategoryName;
$TranslatedCategoryName =~ tr/ /_/;
$TranslatedCategoryName =~ tr/\\/\//s;
$OutPutDirectory = $OutPutDirectory . "/" . $TranslatedCategoryName;
if ( ! -e $OutPutDirectory ) {
if ( ! MakeDir($OutPutDirectory, 0777) ) {   #-- mode must be octal
DieWith("Error: Could not create directory: $OutPutDirectory");
}
}
}
$OutputFileName = $OutPutDirectory . "/" . $TranslatedDSXObjectName . ".dsx";
print "Writing File: $OutputFileName...\n";
if ( -e $OutputFileName ) {
print "*** WARNING: Job/Routine output DSX file ($OutputFileName) already exists. Creating duplicate.\n";
$OutputFileName = MakeDuplicateName($OutputFileName);
}
if (open (fhOutputFile, ">$OutputFileName")) {
WriteDSXHeader(\*fhOutputFile);
if ($ObjectType eq "ROUTINE") {
print fhOutputFile "BEGIN DSROUTINES\n";
}
while ( $$tmpDSXObjectHolder[$x] ) {
$OutputLine = $$tmpDSXObjectHolder[$x];
$WriteLine = 1;
#-- Filter ValidationStatus metadata out
#-- This metadata seems to be intermittent with no value added.
if ($OutputLine =~ /^.*ValidationStatus /) {
$WriteLine = 0;
}
if ($WriteLine) {
#-- Normalize dates and times
if ($OutputLine =~ /^ {3,6}DateModified /) {
$OutputLine =~ s/\".{10}\"/\"$cStandardDate\"/;
}
else {
if ($OutputLine =~ /^ {3,6}TimeModified /) {
$OutputLine =~ s/\".{8}\"/\"$cStandardTime\"/;
}
}
#-- Send the line to the output file
print fhOutputFile "$OutputLine";
}
$x = $x + 1;
}
if ($ObjectType eq "ROUTINE") {
print fhOutputFile "END DSROUTINES\n";
}
close fhOutputFile;
}
}
##############################################################################
sub WriteDSXSourceFile
{
my ($ObjectType, $tmpDSXObjectSourceHolder, $OutPutDirectory, $DSXObjectName, $DSXCategoryName) = @_;
my $x = 0;
my $TranslatedDSXSourceName = "";
my $TranslatedCategoryName = "";
my $OutputFileName = "";
my $OutputLine = "";
$TranslatedDSXSourceName = $DSXObjectName;
$TranslatedDSXSourceName =~ tr/:/_/;
$TranslatedDSXSourceName =~ tr/ /_/;
if ($ObjectType eq "JOB") {
$OutPutDirectory = $OutPutDirectory . "/jobs";
$SourceKeyword = "JobControlCode";
}
else {
$OutPutDirectory = $OutPutDirectory . "/routines";
$SourceKeyword = "Source";
}
if ($DSXCategoryName) {
$TranslatedCategoryName = $DSXCategoryName;
$TranslatedCategoryName =~ tr/ /_/;
$TranslatedCategoryName =~ tr/\\/\//s;
$OutPutDirectory = $OutPutDirectory . "/" . $TranslatedCategoryName;
}
$OutputFileName = $OutPutDirectory . "/" . $TranslatedDSXSourceName . ".src";
if ( -e $OutputFileName ) {
$OutputFileName = MakeDuplicateName($OutputFileName);
}
#-- Convert single-line encoded source code to properly formatted code
#-- Chop off trailing 6 spaces after every CR-LF (really leading 6 spaces)
#-- Convert "symbolic" CR-LF to real CR-LF
$tmpDSXObjectSourceHolder =~ s/\\\(D\)\\\(A\)/\n/g;
#-- Chop off leading keyword - either Source " or JobControlCode "
$tmpDSXObjectSourceHolder =~ s/^ *$SourceKeyword "//;
#-- Chop off trailing quote
$tmpDSXObjectSourceHolder =~ s/\" *$//;
#-- Replace all \" with "
$tmpDSXObjectSourceHolder =~ s/\\\"/\"/g;
#-- replace all \\ with \
$tmpDSXObjectSourceHolder =~ s/\\\\/\\/g;
if (open (fhOutputFile, ">$OutputFileName")) {
print fhOutputFile $tmpDSXObjectSourceHolder;
close fhOutputFile;
}
}
##############################################################################
sub OKToStripDefaultValue
{
my ($DSXObjectName, $DSParameterName) = @_;
my $RetVal = 0;
if (! ($DSXObjectName =~ /Batch::UTIL/) ) {
if (! ($DSXObjectName =~ /PROTOTYPE/) ) {
if (! ($DSParameterName eq "JobName" or $DSParameterName eq "PartitionNumber" or $DSParameterName eq "PartitionCount" ) ) {
$RetVal = 1;
}
}
}
return $RetVal;
}
##############################################################################
sub ParseDSXObjects
{
my ($DSXFileName, $Greppize, $OutPutDirectory, $DSXObjectList) = @_;
my @tmpDSXObjectHolder = ();
my $tmpDSXObjectSourceHolder = "";
my $DSXObjectName = "";
my $DSXCategoryName = "";
my $InDSJobBlock = 0;
my $InDSRoutineBlock = 0;
my $InDSRecordBlock = 0;
my $InDSSubRecordBlock = 0;
my $InDSUBinaryBlock = 0;
my $DSParameterName = "";
if (open fhDSXFileName, "<".$DSXFileName) {
while (<fhDSXFileName>) {
if ($InDSJobBlock) {
push(@tmpDSXObjectHolder, $_);
if ($_ =~ /^END DSJOB/) {
$InDSJobBlock = 0;
WriteDSXObjectFile("JOB", \@tmpDSXObjectHolder, $OutPutDirectory,
$DSXObjectName, $DSXCategoryName);
if ( $tmpDSXObjectSourceHolder ) {
WriteDSXSourceFile("JOB", $tmpDSXObjectSourceHolder, $OutPutDirectory,
$DSXObjectName, $DSXCategoryName);
}
}
else {
if ($InDSRecordBlock) {
if ($_ =~ /^ END DSRECORD/) {
$InDSRecordBlock = 0;
}
else {
if ($InDSSubRecordBlock) {
if ($_ =~ /^ END DSSUBRECORD/) {
$InDSSubRecordBlock = 0;
}
else {
if ($_ =~ /^ Name/) {
$DSParameterName = ParseQuotedString($_);
}
if ($_ =~ /^ Default/) {
if (OKToStripDefaultValue($DSXObjectName, $DSParameterName)) {
pop(@tmpDSXObjectHolder);
}
}
}
}
else {
if ($_ =~ /^ BEGIN DSSUBRECORD/) {
$InDSSubRecordBlock = 1;
}
else {
if ($_ =~ /^ Category /) {
$DSXCategoryName = ParseQuotedString($_);
}
if ($Greppize) {
if ($_ =~ /^ JobControlCode /) {
$tmpDSXObjectSourceHolder = $_;
}
}
}
}
}
}
else {
if ($_ =~ /^ BEGIN DSRECORD/) {
$InDSRecordBlock = 1;
}
else {
if ($_ =~ /^ Identifier /) {
$DSXObjectName = ParseQuotedString($_);
}
}
}
}
}
else {
if ($InDSRoutineBlock) {
if ($_ =~ /^END DSROUTINES/) {
$InDSRoutineBlock = 0;
}
else {
if ($InDSRecordBlock) {
push(@tmpDSXObjectHolder, $_);
if ($_ =~ /^ END DSRECORD/) {
$InDSRecordBlock = 0;
}
else {
if ($_ =~ /^ Identifier /) {
$DSXObjectName = ParseQuotedString($_);
}
else {
if ($_ =~ /^ Category /) {
$DSXCategoryName = ParseQuotedString($_);
}
if ($Greppize) {
if ($_ =~ /^ Source /) {
$tmpDSXObjectSourceHolder = $_;
}
}
}
}
}
else {
if ($InDSUBinaryBlock) {
push(@tmpDSXObjectHolder, $_);
if ($_ =~ /^ END DSUBINARY/) {
$InDSUBinaryBlock = 0;
WriteDSXObjectFile("ROUTINE", \@tmpDSXObjectHolder, $OutPutDirectory,
$DSXObjectName, $DSXCategoryName);
if ( $tmpDSXObjectSourceHolder ) {
WriteDSXSourceFile("ROUTINE", $tmpDSXObjectSourceHolder, $OutPutDirectory,
$DSXObjectName, $DSXCategoryName);
}
}
else {
if ($_ =~ /^ COMMENT Record is empty/) {
print "*** WARNING: Routine ($DSXObjectName) is missing compiled executable.\n";
}
}
}
else {
if ($_ =~ /^ BEGIN DSRECORD/) {
$InDSRecordBlock = 1;
@tmpDSXObjectHolder = ();
push(@tmpDSXObjectHolder, $_);
$tmpDSXObjectSourceHolder = "";
$DSXCategoryName = "";
}
else {
if ($_ =~ /^ BEGIN DSUBINARY/) {
$InDSUBinaryBlock = 1;
push(@tmpDSXObjectHolder, $_);
}
}
}
}
}
}
else {
if ($_ =~ /^BEGIN DSJOB/) {
$InDSJobBlock = 1;
@tmpDSXObjectHolder = ();
push(@tmpDSXObjectHolder, $_);
$tmpDSXObjectSourceHolder = "";
$DSXCategoryName = "";
}
else {
if ($_ =~ /^BEGIN DSROUTINES/) {
$InDSRoutineBlock = 1;
}
}
}
}
}
close (fhDSXFileName);
}
}
##############################################################################
# Main
#-- Global variables (constants)
$cStandardDate = "2001-01-01";
$cStandardTime = "01.00.00";
$cStandardServerName = "ServerName";
$cStandardToolInstanceID = "ToolInstanceID";
#-- Local variables
my %DSXObjectList = ();
my $NumArgs = 0;
my $DSXFileName = "";
my $OutPutDirectory = "";
my $Ans = "";
if (getopts('hl:o:svy')) {
if ( $opt_h ) {
ShowBlurb();
exit 2;
}
if ( $opt_v ) {
ShowVersion();
exit 2;
}
$NumArgs = scalar(@ARGV);
if ( $NumArgs == 1 ) {
$DSXFileName = $ARGV[0];
if ( -r $DSXFileName ) {
if ( $opt_o ) {
$OutPutDirectory = $opt_o;
}
else {
$OutPutDirectory = basename($DSXFileName, ".dsx");
}
if ( OKToOverWriteOutputDir($OutPutDirectory, $opt_y) ) {
%DSXObjectList = LoadObjectList($opt_l);
ParseDSXObjects($DSXFileName, $opt_s, $OutPutDirectory, \%DSXObjectList);
}
}
else {
DieWith("Error: Unable to read file ($DSXFileName).");
}
}
else {
DieWith("Error: Invalid filespec.");
}
}
else {
DieWith("Error: Invalid options.");
}
CatDSX.pl
#!/usr/bin/perl
##############################################################################
#
# Program: CatDSX.pl
#
# Description: See ShowBlurb function below for details
#
# Notes: @gDSXFileList() array format
# Column0 - Fully Qualified source DSX file name
# Column1 - Directory Name portion of Column0
# Column2 - File Name portion of Column0
# Column3 - Size in bytes of file pointed to by Column0
# Column4 - Target CombinedDSXFile Number
# Column5 - Job type (J-Job|R-Routine)
#
# This script will create one or more target DSX files that
# are constructed in the same manner that DataStage would have
# created them.
#
# === Modification History ===================================================
# Date Author Comments
# ---------- --------------- -------------------------------------------------
# 03-27-2003 Steve Boyce Created.
#
##############################################################################
use Cwd;
use Getopt::Std;
use File::Basename;
use File::Find;
##############################################################################
sub ShowBlurb
{
print <<ENDOFBLURB;
Syntax: CatDSX.pl -r -s<n> -y -h <CombinedDSXFile> [SourceDSXDir]
Description: Combines individual DSX files in SourceDSXDir into one (or more)
DSX file(s) suitable for importing into DataStage.
Parameters: <CombinedDSXFile> Name of DataStage DSX file(s) to create.
[SourceDSXDir] Path where individual DSXFiles reside.
Optional. Defaults to current directory.
Options: -r Recurse subdirectories
-s<n> Number of CombinedDSXFiles to create (evenly spreads
individual DSXFiles across all CombinedDSXFiles by size).
Must be greater than 0.
Must be less than total number of jobs and routines found
in SourceDSXDir and less than 10.
-y Force a "Yes" answer to overwrite existing <CombinedDSXFile>
file prompt.
-h This help.
Notes: It is assumed that each DSXFile only has one Job or Routine.
ENDOFBLURB
}
##############################################################################
sub Now
{
my ($InFormat) = @_;
my $RetVal = "";
my ($Seconds, $Minutes, $Hours, $Day, $MonthNumber, $YearNumber, $WeekDayNumber, $DayOfYear, $IsDayLightSavings) = localtime(time);
my $Year = $YearNumber + 1900;
my $Month = sprintf("%02d", $MonthNumber + 1);
$Day = sprintf("%02d", $Day);
$Hours = sprintf("%02d", $Hours);
$Minutes = sprintf("%02d", $Minutes);
$Seconds = sprintf("%02d", $Seconds);
if ($InFormat eq "YYYYMMDD") { $RetVal = "$Year$Month$Day"; }
elsif ($InFormat eq "YYYY-MM-DD") { $RetVal = "$Year-$Month-$Day"; }
elsif ($InFormat eq "DDMMYYYY") { $RetVal = "$Day$Month$Year"; }
elsif ($InFormat eq "DD-MM-YYYY") { $RetVal = "$Day-$Month-$Year"; }
elsif ($InFormat eq "YYYYMMDD.HH24MISS") { $RetVal = "$Year$Month$Day.$Hours$Minutes$Seconds"; }
else { $RetVal = "$Year-$Month-$Day $Hours:$Minutes:$Seconds"; }
return $RetVal;
}
##############################################################################
sub ErrorMessage
{
my ($MessageLine) = @_;
print Now()." $MessageLine\n";
print Now()." Type CatDSX.pl -h for help.\n";
}
##############################################################################
sub ValidSplitOption
{
my ($opt_s) = @_;
my $RetVal = $cFalse;
if ( $opt_s ) {
#-- Option specified
if ( $opt_s =~ /^[1-9]$/ ) {
$RetVal = $cTrue;
$gNumberOfCombinedDSXFiles = $opt_s;
}
else {
print Now()." Error: NumberOfCombinedDSXFiles option (-s) must be a whole number from 1 to 9.\n";
}
}
else {
#-- Option not specified, assume 1
$RetVal = $cTrue;
$gNumberOfCombinedDSXFiles = 1;
}
return $RetVal;
}
##############################################################################
sub BuildControlList
{
my $RetVal = $cTrue;
sub wanted
{
my $DirectoryName;
my $FileName;
my $FileSize;
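#-- When not recursing, prune subdirectories only. The top-level directory
#-- arrives here as "." and pruning it would skip every file in it.
if ( -d $_ && $_ ne "." ) {
$File::Find::prune = !$gRecursive;
}
#-- find() has already changed into the directory, so test $_ here.
if ( -f $_ ) {
$DirectoryName = dirname($File::Find::name);
$FileName = basename($File::Find::name);
$FileSize = -s $_;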
push(@gDSXFileList, [$File::Find::name, $DirectoryName, $FileName, $FileSize, 1, "X"]);
}
}
#-- Can't determine if find returns anything useful
find(\&wanted, $gSourceDSXDir);
$RetVal = $#gDSXFileList + 1;
#-- Return number of files found
return $RetVal;
}
##############################################################################
sub AssignJobType
{
my ($NumberOfFiles) = @_;
my $RetVal = $cTrue;
my $x = 0;
my $JobCounter = 0;
my $RoutineCounter = 0;
#-- Spin through list of DSX files
for ($x = 0; $x < $NumberOfFiles; $x++) {
$JobCounter = 0;
$RoutineCounter = 0;
if ( open fhDSXFile, "<".$gDSXFileList[$x][0] ) {
while (<fhDSXFile>) {
chomp;
if ($_ =~ /^BEGIN DSJOB/) {
$JobCounter++;
}
if ($_ =~ /^BEGIN DSROUTINES/) {
$RoutineCounter++;
}
}
close fhDSXFile;
#-- Update DSX file array
if ( ($JobCounter + $RoutineCounter) == 1 ) {
#-- This DSX file has only one job or routine
if ( $JobCounter == 1 ) {
$gDSXFileList[$x][5] = "J";
}
else {
$gDSXFileList[$x][5] = "R";
}
}
else {
print Now()." Error: $gDSXFileList[$x][0] has $JobCounter jobs and $RoutineCounter routines.\n";
$RetVal = $cFalse;
}
}
else {
print Now()." Error: Can't open $gDSXFileList[$x][0]\n";
$RetVal = $cFalse;
last;
}
}
return $RetVal;
}
##############################################################################
sub SortBySize
{
#-- Sort by size, largest first
#-- ($NumberOfFiles is still accepted so existing callers keep working)
my ($NumberOfFiles) = @_;
@gDSXFileList = sort { $b->[3] <=> $a->[3] } @gDSXFileList;
}
##############################################################################
sub AssignTargetDSXFiles
{
my $RetVal = $cFalse;
my ($NumberOfFiles) = @_;
my $TargetFileNumber = 1;
my $x = 0;
if ( $NumberOfFiles >= $gNumberOfCombinedDSXFiles ) {
$RetVal = $cTrue;
if ( $gNumberOfCombinedDSXFiles > 1 ) {
for ($x = 0; $x < $NumberOfFiles; $x++) {
$gDSXFileList[$x][4] = $TargetFileNumber;
$TargetFileNumber++;
if ( $TargetFileNumber > $gNumberOfCombinedDSXFiles ) {
$TargetFileNumber = 1;
}
}
}
}
else {
print Now()." Error: There are fewer DSX Files to process than the NumberOfCombinedDSXFiles option (-s).\n";
}
return $RetVal;
}
##############################################################################
sub SortByTargetFile
{
#-- Sort by TargetFile, JobType, Name
#-- ($NumberOfFiles is still accepted so existing callers keep working)
my ($NumberOfFiles) = @_;
@gDSXFileList = sort {
($a->[4].$a->[5].$a->[2]) cmp ($b->[4].$b->[5].$b->[2])
} @gDSXFileList;
}
##############################################################################
sub OKToOverWriteOutputFile
{
my ($FileName, $DirectoryName, $SuffixName, $opt_y) = @_;
my $RetVal = $cFalse;
my $FirstOutputFile = "";
if ( $gNumberOfCombinedDSXFiles > 1 ) {
$FirstOutputFile = $DirectoryName.$FileName."-Part1".$SuffixName;
}
else {
$FirstOutputFile = $DirectoryName.$FileName.$SuffixName;
}
if ( -e $FirstOutputFile ) {
if ( $opt_y ) {
print Now()." *** Warning: $FirstOutputFile file already exists. Overwriting anyway.\n";
$RetVal = $cTrue;
}
else {
print Now()." *** Warning: $FirstOutputFile file already exists.\n";
print "Proceed anyway? [y|n] ";
$Ans = <STDIN>;
chomp($Ans) if ($Ans);
if ( "$Ans" eq "Y" || "$Ans" eq "y" ) {
$RetVal = $cTrue;
}
else {
print Now()." Aborting.\n";
}
}
}
else {
$RetVal = $cTrue;
}
return $RetVal;
}
##############################################################################
sub WriteDSXHeader
{
my ($fhOutputFile) = @_;
print $fhOutputFile "BEGIN HEADER\n";
print $fhOutputFile " CharacterSet \"ENGLISH\"\n";
print $fhOutputFile " ExportingTool \"Ardent DataStage Export\"\n";
print $fhOutputFile " ToolVersion \"3\"\n";
print $fhOutputFile " ServerName \"$cStandardServerName\"\n";
print $fhOutputFile " ToolInstanceID \"$cStandardToolInstanceID\"\n";
print $fhOutputFile " MDISVersion \"1.0\"\n";
print $fhOutputFile " Date \"$cStandardDate\"\n";
print $fhOutputFile " Time \"$cStandardTime\"\n";
print $fhOutputFile "END HEADER\n";
}
##############################################################################
sub CreateOutputFile
{
my ($OutputFile, $TargetFileNumber, $NumberOfFiles) = @_;
my $RetVal = $cTrue;
my $x = 0;
my $IsPastHeader = $cFalse;
my $IsPastRoutineHeader = $cFalse;
my $LastJobType = "X";
#-- Open output file
if ( open(fhOutputFile, ">$OutputFile" ) ) {
WriteDSXHeader(\*fhOutputFile);
#-- Spin through DSX File array
for ($x = 0; $x < $NumberOfFiles; $x++) {
#-- See if this DSX File in array is targeted for this output file
if ( $gDSXFileList[$x][4] == $TargetFileNumber ) {
#-- This file is targeted to this output file
#-- See if we are processing the first routine in this output file set
if ( $gDSXFileList[$x][5] eq "R" && $LastJobType ne "R" ) {
#-- Must be the first routine
#-- Write out Routine header
print fhOutputFile "BEGIN DSROUTINES\n";
}
#-- Open DSX input file
$IsPastHeader = $cFalse;
$IsPastRoutineHeader = $cFalse;
if ( open fhDSXFile, "<".$gDSXFileList[$x][0] ) {
#-- Spin through source DSX file
while (<fhDSXFile>) {
if ( $IsPastHeader ) {
#-- Filter out routine headers and footers from source DSX file
if ( !(($_ =~ /^BEGIN DSROUTINES/) || ($_ =~ /^END DSROUTINES/)) ) {
#-- Not a routine header or footer
print fhOutputFile $_;
}
}
else {
#-- Spin past header
if ( $_ =~ /^END HEADER/ ) {
$IsPastHeader = $cTrue;
}
}
}
close fhDSXFile;
}
else {
print Now()." Error: Cannot open $gDSXFileList[$x][0].\n";
$RetVal = $cFalse;
}
$LastJobType = $gDSXFileList[$x][5];
}
}
#-- See if the last source DSX file was a routine
if ( $LastJobType eq "R" ) {
#-- Must be a routine
#-- Write out Routine footer
print fhOutputFile "END DSROUTINES\n";
}
close fhOutputFile;
}
else {
print Now()." Error: Cannot create $OutputFile.\n";
$RetVal = $cFalse;
}
return $RetVal;
}
##############################################################################
sub OutputFileProcess
{
my ($NumberOfFiles, $opt_y) = @_;
my $RetVal = $cFalse;
my ($FileName, $DirectoryName, $SuffixName) = fileparse($gCombinedDSXFile, '\.dsx');
my $x = 1;
my $OutputFile = "";
#-- See if output directory exists
if ( -d $DirectoryName ) {
#-- Output directory exists
if ( OKToOverWriteOutputFile($FileName, $DirectoryName, $SuffixName, $opt_y) ) {
#-- Write output file(s)
#-- Spin through output files
for ($x = 1; $x <= $gNumberOfCombinedDSXFiles; $x++) {
if ( $gNumberOfCombinedDSXFiles == 1 ) {
#-- Create one big file
$OutputFile = $DirectoryName.$FileName.$SuffixName;
}
else {
#-- Split output across multiple files
$OutputFile = $DirectoryName.$FileName."-Part".$x.$SuffixName;
}
print Now()." Creating: $OutputFile...\n";
if ( CreateOutputFile($OutputFile, $x, $NumberOfFiles) ) {
$RetVal = $cTrue;
}
else {
last;
}
}
}
}
else {
print Now()." Error: $DirectoryName does not exist.\n";
}
return $RetVal;
}
##############################################################################
sub MainProcess
{
    my ($opt_y) = @_;
    my $RetVal = $cFalse;
    my $NumberOfFiles = 0;
    #-- Create Control Array
    print Now()." Gathering list of DSX files...\n";
    $NumberOfFiles = BuildControlList();
    if ( $NumberOfFiles > 0 ) {
        #-- Found some files to process
        #-- Spin through list and determine JobType
        print Now()." Determining job types...\n";
        if ( AssignJobType($NumberOfFiles) ) {
            #-- Sort Array by size only
            print Now()." Sorting DSX file list by Size...\n";
            SortBySize($NumberOfFiles);
            #-- Assign target DSXFiles
            print Now()." Assigning jobs to target output files...\n";
            if ( AssignTargetDSXFiles($NumberOfFiles) ) {
                #-- Sort Array by TargetFile, JobType, Name
                print Now()." Sorting DSX file list by TargetFile, JobType, Name...\n";
                SortByTargetFile($NumberOfFiles);
                #-- Control the process of creating the output files
                if ( OutputFileProcess($NumberOfFiles, $opt_y) ) {
                    $RetVal = $cTrue;
                }
                else {
                    ErrorMessage("Aborting: Cannot create Output files.");
                }
            }
            else {
                ErrorMessage("Aborting: Cannot properly assign target output files.");
            }
        }
        else {
            ErrorMessage("Aborting: One or more DSX files is invalid.");
        }
    }
    else {
        ErrorMessage("Aborting: No files to process in $gSourceDSXDir.");
    }
    return $RetVal;
}
##############################################################################
#-- Main
#-- Global Constants
$cTrue = 1;
$cFalse = 0;
$cOSSuccess = 0;
$cOSFailure = 1;
$cStandardDate = "2001-01-01";
$cStandardTime = "01.00.00";
$cStandardServerName = "ServerName";
$cStandardToolInstanceID = "ToolInstanceID";
#-- Global variables
$gCombinedDSXFile = "";
$gSourceDSXDir = "";
$gNumberOfCombinedDSXFiles = 1;
$gRecursive = $cFalse;
@gDSXFileList = ();
#-- Local variables
my $NumArgs = 0;
my $OSRetVal = $cOSSuccess;
print Now()." Initialization...\n";
if ( getopts('rs:yh') ) {
    if ( ! $opt_h ) {
        $NumArgs = scalar(@ARGV);
        if ( $NumArgs == 1 || $NumArgs == 2 ) {
            $gCombinedDSXFile = $ARGV[0];
            if ( $NumArgs == 2 ) {
                $gSourceDSXDir = $ARGV[1];
            }
            else {
                $gSourceDSXDir = cwd();
            }
            if ( $opt_r ) {
                $gRecursive = $cTrue;
                print Now()." Recursively combining all DSX files found in $gSourceDSXDir...\n";
            }
            else {
                print Now()." Combining all DSX files found in $gSourceDSXDir...\n";
            }
            if ( ValidSplitOption($opt_s) ) {
                if ( $gNumberOfCombinedDSXFiles > 1 ) {
                    print Now()." Splitting SourceDSXFiles across $gNumberOfCombinedDSXFiles CombinedDSXFiles...\n";
                }
                #-- All input gathered
                #-- Do main processing
                if ( MainProcess($opt_y) ) {
                    print Now()." Complete.\n";
                }
                else {
                    $OSRetVal = $cOSFailure;
                }
            }
            else {
                $OSRetVal = $cOSFailure;
                ErrorMessage("Aborting: Invalid split (-s) option.");
            }
        }
        else {
            $OSRetVal = $cOSFailure;
            ErrorMessage("Aborting: Missing ParameterFile or too many parameters.");
        }
    }
    else {
        $OSRetVal = $cOSFailure;
        ShowBlurb();
    }
}
else {
    $OSRetVal = $cOSFailure;
    ErrorMessage("Aborting: Invalid options.");
}
exit $OSRetVal;
-Steve
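For anyone who only wants the gist of the "sort by size, then assign target files" steps in MainProcess, here is a minimal sketch of one way to spread files evenly across N output files. It is a guess at the general idea, not the actual SortBySize/AssignTargetDSXFiles code (those routines are not shown here): it assumes a greedy largest-first assignment, and the helper name assign_targets and the sample file names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of CatDSX.pl): spread files over $n buckets
# so the total size per bucket is roughly equal.
# Each input item is [ name, size_in_bytes ].
sub assign_targets {
    my ( $n, @files ) = @_;
    my @totals  = (0) x $n;              # running byte count per bucket
    my @buckets = map { [] } 1 .. $n;    # file names per bucket

    # Largest files first; ties broken by name so the result is repeatable.
    for my $f ( sort { $b->[1] <=> $a->[1] or $a->[0] cmp $b->[0] } @files ) {
        # Pick the bucket that currently holds the fewest bytes.
        my $min = 0;
        for my $i ( 1 .. $n - 1 ) {
            $min = $i if $totals[$i] < $totals[$min];
        }
        push @{ $buckets[$min] }, $f->[0];
        $totals[$min] += $f->[1];
    }
    return ( \@buckets, \@totals );
}

# Made-up sample data: four exports split across two combined files.
my @files = (
    [ 'JobA.dsx',     900 ],
    [ 'JobB.dsx',     500 ],
    [ 'JobC.dsx',     400 ],
    [ 'RoutineX.dsx', 100 ],
);
my ( $buckets, $totals ) = assign_targets( 2, @files );
for my $i ( 0 .. $#$buckets ) {
    printf "Part%d (%d bytes): %s\n",
        $i + 1, $totals->[$i], join( ', ', @{ $buckets->[$i] } );
}
```

Largest-first assignment does not always give the best possible split, but it is simple and usually close enough for keeping import files at a similar size.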
Steve is the man; check out his picture at http://www.kennethbland.com. His Bag of Tricks is voluminous, and his DSX cutter is top notch. He has a complete PVCS integration suite set up, with point-and-click control. The weakest link in the chain is the command-line import/export with DataStage, because there is no ability to export job objects singly by name. That manual effort is tiresome, so he coded up a process that does a full export, explodes the DSX, then picks out the objects he wants. Too cool.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
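The "full export, explode, pick out the objects" idea can be sketched in a few lines. This is not Steve's ParseDSX.pl, just a minimal illustration under two assumptions about the DSX layout: each job sits between a BEGIN DSJOB line and an END DSJOB line, and the first Identifier line inside that block carries the job name. The function name split_jobs and the sample text are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed DSX layout: each job sits between "BEGIN DSJOB" and "END DSJOB",
# and the first Identifier line inside the block holds the job name.
# Returns a hash of job name => text of that job block.
sub split_jobs {
    my ($text) = @_;
    my %jobs;
    my ( $name, @buf );
    my $in_job = 0;
    for my $line ( split /^/m, $text ) {    # keeps the line endings
        if ( $line =~ /^\s*BEGIN DSJOB\s*$/ ) {
            $in_job = 1;
            $name   = undef;
            @buf    = ();
        }
        next unless $in_job;                # skip header and anything else
        push @buf, $line;
        if ( !defined $name && $line =~ /^\s*Identifier\s+"([^"]+)"/ ) {
            $name = $1;
        }
        if ( $line =~ /^\s*END DSJOB\s*$/ ) {
            $jobs{$name} = join '', @buf if defined $name;
            $in_job = 0;
        }
    }
    return %jobs;
}

# Made-up sample export with two jobs.
my $dsx = <<'DSX';
BEGIN HEADER
   ServerName "dev01"
END HEADER
BEGIN DSJOB
   Identifier "LoadCustomers"
   DateModified "2003-11-02"
END DSJOB
BEGIN DSJOB
   Identifier "LoadOrders"
END DSJOB
DSX

my %jobs = split_jobs($dsx);
print "$_: ", length( $jobs{$_} ), " bytes\n" for sort keys %jobs;
```

To pick out only the wanted objects, filter the keys of the returned hash against a list of job names before writing each block to its own file. Routines and the export header would need similar handling, which this sketch leaves out.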
Really? That would be awesome, since I am exploring a way to automatically pull selected jobs from the repository, group them up, throw them on the designated server, and compile them. Automatic migration without human intervention would increase accuracy.
Of course, I found a little flaw in the script -- it strips the parameters' defaults. Now, if you have a PX job for 6.0.1 or higher, and you happen to define $APT_CONFIG_FILE... whoops, the job won't compile without a default.
Also, there are a number of times where we make a set of jobs with parameters for files and tables that we do not pull from the command line (rather defining them within the Sequencer, and sometimes not even then). Preserving the default values allows for easier configuration.
Ah well, different strokes for different folks.
Now I'm trying to figure out how to automatically compile a job on the Client side without going inside DataStage BASIC -- migrating this to all 100+ projects (and counting) within the entire corporation would be... painful. Not to mention the inevitable bug fixes...
Anyone know how?
Heck, is that PVCS suite point/click for sale/share? I'll have to see if it can utilize CVS...
-T.J.
P.S. VERY slick website you got there, Ken! Great job! Was half-expecting a picture of your baby on the banner though...
Developer of DataStage Parallel Engine (Orchestrate).
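On the masking question: one way to get comparable files while leaving parameter defaults alone is to mask only the export header. The sketch below is an illustration, not the ParseDSX code. The generic values are the $cStandard... constants from the script above; the BEGIN HEADER / END HEADER layout, the Default line in the sample, and the function name mask_header are assumptions. The real script reportedly also masks things like job last-edit times, which this sketch does not attempt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Generic values taken from the $cStandard... constants in the script above.
my %mask = (
    Date           => '2001-01-01',
    Time           => '01.00.00',
    ServerName     => 'ServerName',
    ToolInstanceID => 'ToolInstanceID',
);

# Replace the quoted value of the listed fields, but only between
# BEGIN HEADER and END HEADER (assumed layout), so that anything inside
# the jobs themselves, such as parameter defaults, is left untouched.
sub mask_header {
    my ($text) = @_;
    my $in_header = 0;
    my @out;
    for my $line ( split /^/m, $text ) {    # keeps the line endings
        $in_header = 1 if $line =~ /^\s*BEGIN HEADER\b/;
        if ( $in_header && $line =~ /^\s*(\w+)\s+"/ && exists $mask{$1} ) {
            my $value = $mask{$1};
            $line =~ s/"[^"]*"/"$value"/;    # swap the first quoted value
        }
        $in_header = 0 if $line =~ /^\s*END HEADER\b/;
        push @out, $line;
    }
    return join '', @out;
}

# Made-up sample: a header plus one parameter default that must survive.
my $dsx = <<'DSX';
BEGIN HEADER
   ServerName "prodbox7"
   ToolInstanceID "FINANCE"
   Date "2004-10-11"
   Time "06.39.12"
END HEADER
BEGIN DSJOB
   Identifier "LoadOrders"
   Name "$APT_CONFIG_FILE"
   Default "/opt/config/default.apt"
END DSJOB
DSX

print mask_header($dsx);
```

Two exports of the same unchanged job then differ only where the job itself differs, and the Default line is carried through as exported.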
Teej wrote: Now I'm trying to figure out how to automatically compile a job on the Client side (without going inside DataStage BASIC -- migrating this to all 100+ projects (and counting) within the entire corporate would be... painful. Not to mention the inevitable bug fixes...)
Anyone know how?
Install CompileAllPlus or use version control and you're good to go in v6.x; in v7.x mass compile should be basic functionality in DataStage. This is about how automatic you can get on the client side for now.
Ogmios
Problem: We have more than just DataStage to archive. We prefer to include everything within a single package. We also love using categories to distinguish jobs.
The script here really does the first (and last) step of what we wanted. CompileAllPlus is nice, but I still have to open that thing, find the jobs that were imported, and then compile them. That is not fun, not reliable, and definitely not something to trust a production analyst with; it requires another person, adding yet more dollars to the bottom line.
That's the problem I am facing right now -- finding a low (one-time) cost solution that would minimize the cost (and risks) of migration and archiving.
-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
-
- Participant
- Posts: 5
- Joined: Mon Oct 11, 2004 6:39 am
This dsx-cutter is great. I can use it for version control. Now I'm heading to the second step.
I want to export a given project. However, the dsjob -lprojects command runs on the server side and the dscmdexport command runs on the client side. Does anybody have a tool or (Unix) script that can export a given, existing project?
thanks, Dick Spaans
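One possible way around the server/client split is to capture the output of dsjob -lprojects on the server, move that text to the client, and generate one dscmdexport command per project there. The sketch below only builds the command lines; it does not run anything. Several things in it are assumptions to check against your own installation: that the project list is one name per line (with a possible "Status code" line to skip), that dscmdexport takes /H=, /U= and /P= followed by the project name and target file, and that the password comes from a made-up DSPASSWORD environment variable. The function name export_commands, the host, user and paths are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn the text printed by `dsjob -lprojects` (assumed: one project name
# per line) into one dscmdexport command line per project.
# The /H= /U= /P= argument form is an assumption; verify it locally.
sub export_commands {
    my ( $project_list, $host, $user, $out_dir ) = @_;
    my @cmds;
    for my $project ( split /\r?\n/, $project_list ) {
        $project =~ s/^\s+|\s+$//g;             # trim whitespace
        next if $project eq '';                 # skip blank lines
        next if $project =~ /^Status code/i;    # skip a trailing status line
        # %%DSPASSWORD%% becomes %DSPASSWORD%, a hypothetical environment
        # variable holding the password on the Windows client.
        push @cmds, sprintf 'dscmdexport /H=%s /U=%s /P=%%DSPASSWORD%% %s "%s\\%s.dsx"',
            $host, $user, $project, $out_dir, $project;
    }
    return @cmds;
}

# Made-up project list, as it might be captured on the server.
my $list = "Finance\nMarketing\n\nStatus code = 0\n";
print "$_\n" for export_commands( $list, 'etlhost', 'dsadm', 'C:\\backup' );
```

Writing the generated lines to a .bat file on the client gives a per-project export that can then be fed to the DSX cutter.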
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA
Tony, there's a batch file that's posted at ADN that will back up all projects on a server. I'm using it and it is working great for me. In combination with command line WinZip, I can keep all of the exports from each night zipped up and taking up much less space than they would otherwise.
It's called DataStageBackup.zip and it is here on ADN.
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 483
- Joined: Thu Jun 12, 2003 4:47 pm
- Location: St. Louis, Missouri USA