Deleting work (Hashed and Seq files) after job execution

Posted: Fri Mar 28, 2008 4:15 am
by clarcombe
We have a sequence calling a number of jobs which create several unique Hashed and text files for each file that is passed to the sequence.

We have been "asked" by production management to follow the McDonalds maxim and "Clean as we go".

From what I have read on this, executing a simple "CLEAR.FILE" at the end of the sequence will not be sufficient as the disk space is retained (even though the DATA.30 and OVER.30 are set to minimal limits).

I have thought about creating a routine at the end of the sequence which executes a series of Universe and Unix commands to delete the hashed files and text files.

Is this the best way to approach this?

Posted: Fri Mar 28, 2008 4:18 am
by ArndW
If you do a CLEAR.FILE and then a "RESIZE {fname} * * *", the OVER.30 will also get set back to its original size. If you have a MINIMUM.MODULO set, your files will not shrink to a couple of Kb either.
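At the engine's TCL prompt that combination might look like this (a sketch; MyHash is a hypothetical account-level hashed file):

```
CLEAR.FILE MyHash
RESIZE MyHash * * *
```

CLEAR.FILE empties the file, and RESIZE with "* * *" rebuilds it at its original sizing parameters, which releases the accumulated overflow space.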

Posted: Fri Mar 28, 2008 4:57 am
by ray.wurlod
An after-job subroutine, or a routine or command (shell script) invoked from a job sequence, are the obvious choices. In either case you can pass a list of names of things to be deleted. Don't forget, in the case of dynamic hashed files, that you'll need a recursive deletion command.
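A shell routine along these lines could take the list of names as arguments (a sketch only; the paths and names are hypothetical, and dynamic "pathed" hashed files are directories, so they need the recursive form):

```shell
# Remove the work files a sequence created, given their paths as arguments.
cleanup_work_files() {
    for target in "$@"; do
        if [ -d "$target" ]; then
            # dynamic (pathed) hashed file: a directory, so delete recursively
            rm -rf "$target"
        else
            # sequential/text file, or a static hashed file (a single file)
            rm -f "$target"
        fi
    done
}

# Example invocation with hypothetical work files:
# cleanup_work_files /work/HashCustomer /work/customer_extract.txt
```

The same argument list could be built from a job parameter so each run of the sequence cleans up only its own files.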

Posted: Fri Mar 28, 2008 5:31 am
by clarcombe
ray.wurlod wrote: Don't forget, in the case of dynamic hashed files, that you'll need a recursive deletion command.
Do you mean to delete the subdirectory and all its contents ?

Posted: Fri Mar 28, 2008 6:44 am
by chulett
Yes, he does... and that only for 'pathed' hashed files.

Re: Deleting work (Hashed and Seq files) after job execution

Posted: Fri Mar 28, 2008 11:14 am
by ganesh.soundar
I hope you are running on a UNIX platform. If so, use the "rm" command with the appropriate options, and place all the commands in a file so that one script removes every file that was created. A Sequence has an activity called Execute Command; use that activity to call the shell script.
Regards,
Raja Soundararajan
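Since the thread says each input file gets its own unique work files, such a script could key the deletions off a per-file tag (a sketch; the directory layout and naming convention here are assumptions, not anything DataStage imposes):

```shell
# Remove every work file whose name starts with a given tag, e.g. the
# name of the input file the sequence just processed.
cleanup_by_tag() {
    work_dir="$1"
    file_tag="$2"
    for f in "$work_dir"/"$file_tag"*; do
        [ -e "$f" ] || continue   # glob matched nothing
        rm -rf "$f"               # -rf handles both files and hashed-file dirs
    done
}

# Hypothetical call from an Execute Command activity:
# cleanup_by_tag /work CUST42
```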

Deleting Hashed Files

Posted: Fri Mar 28, 2008 4:34 pm
by ray.wurlod
If the hashed file was created in a project (account), use DELETE.FILE from within the DataStage environment.

If the hashed file was created as a dynamic hashed file in a directory, use a recursive delete command (rm -rf in UNIX, RMDIR /S /Q in Windows). This deletes the three files DATA.30, OVER.30 and .Type30, as well as the directory that "is" the hashed file.
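On UNIX the layout can be illustrated like this (a demonstration only, with a hypothetical path; it recreates the on-disk shape rather than touching a real hashed file):

```shell
# A pathed dynamic hashed file is a directory holding DATA.30, OVER.30
# and the hidden .Type30 marker, which is why a plain delete fails on it.
mkdir -p /tmp/demo_hash/MyHash
touch /tmp/demo_hash/MyHash/DATA.30 \
      /tmp/demo_hash/MyHash/OVER.30 \
      /tmp/demo_hash/MyHash/.Type30

# Remove the directory that "is" the hashed file, plus its three members.
rm -rf /tmp/demo_hash/MyHash
```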

If the hashed file was created as a static hashed file in a directory you can use a non-recursive delete command, but the recursive one will work just as well.

If you have used SETFILE or any other means to create a VOC pointer to the deleted hashed file you need to delete that pointer. This is a regular query of the form

Code:

DELETE FROM VOC WHERE "TYPE" in ('F','Q') AND "@ID" = 'hashedfilename';
Caution: Never use DELETE on VOC without a WHERE clause! Your project will immediately become unusable. The constraint on "TYPE" above is not strictly necessary - I include it as a safety measure.