Delete all Datasets from folder Datasets

Posted: Wed Jul 01, 2009 10:26 am
by vintipa
Hi Experts,

I want to delete all the existing datasets. All of them were created under the home path, in a folder called $DSHOME/Datasets.

Can I delete all the datasets by running the Unix command rm *.* in the folder $DSHOME/Datasets?

What I actually want is to create another folder in a different path for datasets, put that path in the config file, and have all the datasets created afresh when the jobs run again.

regards,

Posted: Wed Jul 01, 2009 10:33 am
by nagarjuna
You cannot delete datasets with the Unix rm command; they are not normal files. Use either orchadmin or the Data Set Management utility.

Since you want to delete all the datasets, orchadmin is the best option.

Posted: Wed Jul 01, 2009 10:51 am
by chulett
Well... technically you can but you'll need to delete all of the associated *.ds control files as well.

Posted: Wed Jul 01, 2009 11:21 am
by priyadarshikunal
You can. Even if you don't delete the *.ds descriptor files, it will work, but it's better to delete the descriptor files too, to avoid strange errors.

I did delete the binary files without deleting the descriptors and it worked just fine (probably because of overwrite mode).

I haven't tried it, but it may cause problems when using append mode.

Hence it's better to delete the *.ds files as well.

Posted: Wed Jul 01, 2009 2:49 pm
by Sainath.Srinivasan
Even though you can, it will leave a mess behind, especially if you try to read any of them.

So best is to use "orchadmin rm".

A simple

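Code:

# Loop over the descriptor files directly; quoting keeps odd file names intact.
for dsName in *.ds
do
  [ -e "$dsName" ] || continue   # skip the literal pattern if nothing matches
  orchadmin rm "$dsName"
done
will do the trick.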

Posted: Wed Jul 01, 2009 3:50 pm
by chulett
If you do it properly, that is, delete all the dataset files and all the control files as noted, there would be no "mess" left. And there's no magic to deleting the files; all orchadmin does is an "rm" as well. The only thing about orchadmin is that it is smarter than us: it can read the control file and thus knows exactly where all of the dataset files are for each control file.

So use orchadmin for individual control files and their matching dataset files, but for what the original poster stated as their need - "I want to delete all datasets" - you can just take off and nuke them from orbit. It's the only way to be sure.

Posted: Wed Jul 01, 2009 10:40 pm
by vintipa
Hi,

Actually I want to know whether I can delete the whole folder at once, instead of deleting datasets individually, to save time. I need to create a fresh folder for datasets in another path.

Also, I am having a problem sourcing the dsenv file; it fails with the following error.

-------------------------------------------------------------------------------------
# cd DSEngine
# . ./dsenv
ksh: t/datastage/Ascential/DataStage/DSEngine/bin:/dsetlsoft/datastage/Ascential/DataStage/PXEngine/bin:/oracle/app/product/10.1.0/bin: not found.
# pwd
/dsetlsoft/datastage/Ascential/DataStage/DSEngine
#
------------------------------------------------------------------------------------

Posted: Wed Jul 01, 2009 10:43 pm
by chulett
Of course you can.

And please start a new topic for your new, completely unrelated question.

Posted: Wed Jul 01, 2009 11:08 pm
by vintipa
Oh, thanks Craig.

You mean I can delete the folder with the Unix rm command,
and if I have to delete individual datasets then I should go for orchadmin.

Am I right?

regards,

Posted: Wed Jul 01, 2009 11:13 pm
by chulett
You'd have to use "rm -r" to recursively delete the folder and everything in it but yes, you're right.

Posted: Wed Jul 01, 2009 11:31 pm
by ray.wurlod
vintipa wrote: Actually I want to know whether I can delete the whole folder at once, instead of deleting datasets individually, to save time. I need to create a fresh folder for datasets in another path.
NO
Data Sets' data files live in multiple directories.

If you delete all the *.ds files (which is what I am guessing is your intent) you leave many orphan Data Set data files taking up space on your system.

The correct approach is to create a script that cycles through the "*.ds" file names and invokes orchadmin to delete the Data Set for which each is the descriptor file.
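
A minimal sketch of such a script, assuming the environment orchadmin needs is already set up in your shell. Only "orchadmin rm" comes from this thread; the function name and the DRY_RUN and ORCHADMIN variables are made up for illustration. It defaults to a dry run so the list can be checked before anything is removed.

```shell
#!/bin/sh
# Sketch: remove every Data Set whose descriptor (*.ds) sits in one directory.
# Hypothetical names: delete_datasets, DRY_RUN, ORCHADMIN.

delete_datasets() {
    ds_dir=$1
    for ds in "$ds_dir"/*.ds; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$ds" ] || continue
        if [ "${DRY_RUN:-1}" = 1 ]; then
            # Dry run (the default): only show what would be executed.
            echo "would run: ${ORCHADMIN:-orchadmin} rm $ds"
        else
            # Let orchadmin remove the data files and the descriptor.
            "${ORCHADMIN:-orchadmin}" rm "$ds" || echo "failed: $ds" >&2
        fi
    done
}

# Example: preview first, then run for real.
#   DRY_RUN=1 delete_datasets "$DSHOME/Datasets"
#   DRY_RUN=0 delete_datasets "$DSHOME/Datasets"
```

Looping over the glob rather than over the output of ls keeps descriptor names with spaces intact.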

Posted: Thu Jul 02, 2009 1:08 am
by ArndW
I'll chime in as well - don't delete the .ds files with "rm". You can cd to your directory with dataset descriptors and issue

Code:

find . -name '*.ds' -exec orchadmin rm {} \;
if you'd rather not do an explicit loop.

Posted: Thu Jul 02, 2009 3:25 am
by priyadarshikunal
If you look into the descriptor file, the last few lines contain the names and paths of the dataset's binary files.

Orchadmin reads the descriptor, deletes all binary files associated with it and deletes the descriptor itself.
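
If you want to peek at those paths before deleting anything, here is a rough sketch. It assumes the descriptor stores the data file locations as readable absolute paths, which is my observation and not a documented format, so treat the output as a hint only. The function name is made up for illustration.

```shell
# Sketch: print path-like strings embedded in a descriptor file.
# Assumption: data-file paths appear as readable absolute paths in the file.
# Hypothetical name: list_descriptor_paths.
list_descriptor_paths() {
    # Turn every non-printable byte into a newline, then keep only tokens
    # that start with "/" and run up to the next whitespace.
    LC_ALL=C tr -c '[:print:]\n' '\n' < "$1" | grep -o '/[^[:space:]]*' | sort -u
}

# Example:
#   list_descriptor_paths mydata.ds
```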

If you have to delete the whole folder of binary files and also delete all the descriptor files, it's OK to use the rm command.

Although, as Arnd mentioned, orchadmin is easy to use (not easier) and is at least the formal way, you can use rm without any problem in this case (easier).

Posted: Thu Jul 02, 2009 3:34 am
by Sainath.Srinivasan
Priyadarshi,

It is enough to delete the .ds files alone, as the child data files will then become orphans (even though they will still consume space).

The point that Ray, ArndW and I are making concerns the OP's limited knowledge: guiding them to use rm outside the tool may result in some other damage.

Also, it is possible that filesets and lookup tables create objects which may be deleted by rm.

Posted: Thu Jul 02, 2009 5:54 am
by chulett
Jeez, what a bunch of anal little monkeys. :lol:

It's not about deleting just the ".ds" files, but if you actually read the starting post rather than just chime in midstream you'll see it's about deleting all datasets (which are stored in one location) so they can be recreated fresh in a new location next job run. Hence my comments. And no-one here advocated simply deleting just the .ds files.