Number of Datasets

Posted: Fri Jul 13, 2012 2:31 am
by karthi_gana
Hello,

In my current project, dataset usage is very high. I believe around 2,500 datasets are used in the project.

a) I want to get the exact count of datasets, and the size of the datasets, used in my project.
b) I would like to know which datasets have not been used in the past 3 months.
c) What kind of maintenance activity is needed when a project has a lot of datasets?
d) If maintenance is needed, how often should we do it?
e)

Posted: Fri Jul 13, 2012 3:28 am
by ArndW
The biggest problem with datasets is that users delete the descriptor file (the one specified in the jobs) at the OS-level with a "rm" or "del" command. Unfortunately the actual data is not stored in the descriptor file but in the location(s) specified in the APT_CONFIG file used at runtime. By deleting the descriptor file the actual data files are "orphaned" and will never be accessed or deleted.

I posted a thread on orphans a long time ago on DSXChange about how to locate all dataset files. Then, with "orchadmin -f {DescriptorFile}", one can locate all the dataset files that are in use and thereby detect orphaned files.
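As a rough illustration of the orphan check described above, here is a minimal Python sketch. It assumes you have already collected the list of data files that your descriptor files reference (for example, by gathering the orchadmin output by hand or with your own script); this sketch does not call or parse orchadmin. The function name `find_orphans` and both of its parameters are hypothetical, and the data directories are whatever your configuration file points to.

```python
import os

def find_orphans(data_dirs, referenced_files):
    """Return files found under data_dirs that no descriptor references.

    data_dirs        -- directories holding the actual dataset data files
                        (assumption: taken from your configuration file)
    referenced_files -- paths of data files known to belong to a descriptor
                        (assumption: collected separately, not by this code)
    """
    # Normalize paths so symlinks and relative paths compare equal.
    referenced = {os.path.realpath(p) for p in referenced_files}
    orphans = []
    for data_dir in data_dirs:
        for root, _dirs, files in os.walk(data_dir):
            for name in files:
                path = os.path.realpath(os.path.join(root, name))
                if path not in referenced:
                    orphans.append(path)
    return sorted(orphans)
```

Anything this returns is only a candidate orphan. It is worth checking each file by hand before removing it, since a descriptor you did not scan may still reference it.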

From the OS level you can go to the data directories and list the files sorted by date-time-accessed or date-time-modified in order to detect old and unused files, then find their associated descriptor file and correctly delete the dataset using "orchadmin rm {dataset}".
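The listing step above could be sketched in Python roughly as follows. This is only an illustration: the function name `stale_files` and its parameters are hypothetical, 90 days is used as an approximation of "3 months", and access times are only meaningful if the filesystem actually records them (a mount with noatime, for instance, will not).

```python
import os
import time

def stale_files(data_dir, days=90, now=None, use_atime=True):
    """List files under data_dir not accessed (or modified) for `days` days.

    Results are ordered oldest first. Pass use_atime=False to compare
    modification times instead of access times.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds in a day
    found = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            st = os.stat(path)
            stamp = st.st_atime if use_atime else st.st_mtime
            if stamp < cutoff:
                found.append((stamp, path))
    # Sort by timestamp so the longest-unused files come first.
    return [path for _stamp, path in sorted(found)]
```

The output is a list of data files, not datasets, so each one still has to be traced back to its descriptor file before the dataset is deleted with orchadmin.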

Posted: Fri Jul 13, 2012 7:26 am
by chulett
<breathlessly waiting for e>

:wink: