Deletion of datasets using rm

clarcombe
Premium Member
Posts: 515
Joined: Wed Jun 08, 2005 9:54 am
Location: Europe

Post by clarcombe »

I have mistakenly deleted some datasets using the rm command.

To clean up the orphaned data files, should I just go to the nodes where they live and delete them too, or is there a cleaner way?

Thanks
Colin Larcombe
-------------------

Certified IBM Infosphere Datastage Developer
Nageshsunkoji
Participant
Posts: 222
Joined: Tue Aug 30, 2005 2:07 am
Location: pune

Post by Nageshsunkoji »

Hi,

You cannot use the UNIX rm command to delete a data set because DataStage represents a single data set with multiple files. Using rm simply removes the descriptor file, leaving the much larger data files behind.

Instead, use Data Set Management in Designer or Manager to delete the entire data set or clean up its data.
NageshSunkoji

If you know anything SHARE it.............
If you Don't know anything LEARN it...............
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The orphaned dataset data files have no link back to their "descriptor", so you will need to delete them manually.
I wrote a program that searches the machine for all descriptor files and links them to their corresponding data files. Any files in the data directories that aren't linked to a descriptor are junk and can be deleted. It wasn't too difficult to do (I'd post it, but the code is proprietary to the customer I was at).

Basically, from UNIX, create your own magic file and do a find to get the descriptors. You can then use orchadmin ll {dataset} to get information on the part files and parse that out. Then compare your complete list of part files with the contents of every data directory listed in the APT configuration files in your system's configuration directory.
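
The comparison step could be sketched roughly like this in Python. This is not the program mentioned above, only a minimal illustration. It assumes you have already captured the orchadmin ll output for every descriptor as text, and that the part files show up in that text as absolute paths; the real output layout isn't shown in this thread, so the regular expression is a guess to adapt. The names extract_paths and find_orphans are made up for the example.

```python
import os
import re

# Assumption: part-file locations appear in the listing as absolute paths.
# The real `orchadmin ll` layout is not shown here, so adjust this pattern.
PATH_PATTERN = re.compile(r"(/\S+)")


def extract_paths(listing_text):
    """Collect every absolute-looking path found in the listing text."""
    return {os.path.normpath(p) for p in PATH_PATTERN.findall(listing_text)}


def find_orphans(data_dirs, referenced_paths):
    """Return files in data_dirs that no descriptor refers to.

    data_dirs: the data directories named in your configuration files.
    referenced_paths: part-file paths gathered from all descriptors.
    """
    referenced = {os.path.normpath(p) for p in referenced_paths}
    orphans = []
    for directory in data_dirs:
        for name in sorted(os.listdir(directory)):
            full = os.path.normpath(os.path.join(directory, name))
            # Only plain files are candidates; subdirectories are skipped.
            if os.path.isfile(full) and full not in referenced:
                orphans.append(full)
    return orphans
```

The sketch only reports candidates and deletes nothing, so you can review the list first. It has to be run on each node against that node's own data directories, and a descriptor that the find missed would make its healthy part files look like orphans.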