compress and delete dataset

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

Can I gzip all unwanted datasets and then delete them using the orchadmin command?

orchadmin rm ".*gzip"

If not, how can I delete the gzip files?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

A dataset consists of two components: the descriptor file "abc.ds" that you reference in jobs, and the actual data files, which reside in the locations specified by the configuration file (APT_CONFIG_FILE). For this reason one cannot just zip up datasets the way you envision. The names and number of data files in a dataset are not easy to determine.
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

I have to delete dataset descriptor files that take up 10 GB.
How can I delete them?
It's too tedious to delete them by name, one at a time.

Is there any other option?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Dataset descriptor files (the ones you name in jobs to access the datasets) contain no data and cannot be that large. You are probably talking about the actual data files, and you should not manipulate those individually at all. If you are on Windows, you can mark the directory and its contents as compressed in order to save space.
Alternatively, you can use "orchadmin dump" to dump the contents of a dataset into a sequential file, then delete the dataset using "orchadmin rm", and then gzip that sequential file.
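The dump / remove / gzip sequence described above could be scripted roughly as follows. This is only a sketch: the function name archive_dataset, the ORCHADMIN override variable, and the example paths are illustrative assumptions, and it assumes the parallel engine environment (including APT_CONFIG_FILE) is already set up in the shell. It compresses before removing, so the dataset is only deleted once a compressed copy exists.

```shell
#!/bin/sh
# Sketch: archive a dataset's contents to a gzipped sequential file,
# then remove the dataset with orchadmin.
# ORCHADMIN is an assumed override hook so the function can be dry-run
# against a stub; by default it is the real orchadmin command.
ORCHADMIN="${ORCHADMIN:-orchadmin}"

archive_dataset() {
    ds="$1"    # descriptor file, e.g. /data/abc.ds (hypothetical path)
    out="$2"   # target sequential file, e.g. /archive/abc.txt (hypothetical path)

    # Dump the dataset's records to a flat file; stop if the dump fails.
    "$ORCHADMIN" dump "$ds" > "$out" || return 1

    # Compress the flat file (gzip replaces it with "$out.gz"); stop on failure.
    gzip "$out" || return 1

    # Only now remove the dataset itself through orchadmin,
    # never by deleting the underlying data files by hand.
    "$ORCHADMIN" rm "$ds"
}
```

A call might look like `archive_dataset /data/abc.ds /archive/abc.txt`, which would leave /archive/abc.txt.gz behind. Note that the dump is a text rendering of the records, so reading it back later means going through a sequential file stage rather than a dataset stage.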
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

I am talking about the descriptor files (3-5 MB each). They are taking up space on the resource disk on all 3 nodes, and it needs cleaning up. There are many files, so can I delete a good number of them with a single orchadmin rm command?
jerome_rajan
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

Post by jerome_rajan »

Not sure descriptor files that big are possible. A descriptor file contains the metadata and a copy of the configuration file.
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn

Life is really simple, but we insist on making it complicated.
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

Yes. Is there a way to delete them?
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

I'm referring to the .ds files, which are stored in nodes/datasets on the resource disk.
How can I delete them in bulk?
jerome_rajan
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

Post by jerome_rajan »

orchadmin rm "*.ds" should work
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

That deletes all the .ds files in the folder, right?
What if I want to delete only the .ds files that were created before 9/10/2012?
Last edited by peep on Thu Sep 20, 2012 4:17 am, edited 2 times in total.
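One possible way to limit the removal to older datasets is to let find select the descriptor files by timestamp and pass each one to orchadmin rm. The sketch below is an assumption-laden illustration, not a tested DataStage procedure: remove_old_datasets and the ORCHADMIN override variable are made-up names, the cutoff uses the touch -t format, and it filters on last-modification time because most Unix filesystems do not record a creation time. It defaults to listing the candidates so they can be reviewed before anything is deleted.

```shell
#!/bin/sh
# Sketch: remove datasets whose descriptor (.ds) file was last modified
# at or before a cutoff timestamp.
# ORCHADMIN is an assumed override hook; by default it is the real command.
ORCHADMIN="${ORCHADMIN:-orchadmin}"

remove_old_datasets() {
    dir="$1"            # directory holding the .ds descriptor files
    cutoff="$2"         # touch -t timestamp, e.g. 201209100000 if
                        # 9/10/2012 means 10 September 2012, 00:00
    mode="${3:-list}"   # "list" only prints candidates; "delete" removes them

    # Create a reference file carrying the cutoff as its modification time.
    ref=$(mktemp) || return 1
    touch -t "$cutoff" "$ref" || { rm -f "$ref"; return 1; }

    # "! -newer" keeps files modified at or before the reference file.
    # find also descends into subdirectories of "$dir".
    find "$dir" -type f -name '*.ds' ! -newer "$ref" |
    while IFS= read -r ds; do
        if [ "$mode" = delete ]; then
            # stdin is redirected so the command cannot swallow the file list.
            "$ORCHADMIN" rm "$ds" < /dev/null
        else
            echo "$ds"
        fi
    done

    rm -f "$ref"
}
```

Running `remove_old_datasets /path/to/descriptors 201209100000` first and checking the printed list, then repeating the call with `delete` as the third argument, keeps the wildcard from removing datasets that are still in use.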
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

I'm not sure that using "*.ds" is a good idea unless the OP wishes to delete all of the datasets.
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

Can I run a shell script that compresses all the .ds files created before mm.dd.yyyy into one file or zip archive, and then use orchadmin rm?
Do you think that's possible?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

NO, YOU CANNOT COMPRESS THE FILES and keep the dataset usable. Renaming them corrupts the original dataset.
peep
Premium Member
Posts: 162
Joined: Mon Aug 20, 2012 6:52 pm

Post by peep »

OK. So the only way is to delete all of them with *.ds, provided they are no longer in use?
pnpmarques
Participant
Posts: 35
Joined: Wed Jun 15, 2005 9:27 am

Post by pnpmarques »

If you don't need those datasets, why zip them? Just use the orchadmin command to delete them, but be careful when using *.
If you want to keep that data for later use, it's better to save it to a sequential file, which you can easily zip/unzip/move/read.