delete datasets

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

just4u_sharath
Premium Member
Posts: 236
Joined: Sun Apr 01, 2007 7:41 am
Location: Michigan

delete datasets

Post by just4u_sharath »

I have a requirement. In my first job, after extracting and transforming, I create a dataset that holds several million records. In the second job I read the data from that dataset, apply some transformations, and create another dataset. Once the second job has read the data, I have to delete the dataset created by the first job, mainly because of space constraints. I read the user guide, and it says datasets can only be deleted from the Data Set Management utility. If that is the case, how can I delete the datasets?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

From unix "orchadmin rm {dataset}" will do it.
just4u_sharath
Premium Member
Posts: 236
Joined: Sun Apr 01, 2007 7:41 am
Location: Michigan

Post by just4u_sharath »

ArndW wrote:From unix "orchadmin rm {dataset}" will do it.
But the documentation says that the UNIX rm command removes only the dataset's descriptor file, leaving the large data files on disk. Is "orchadmin rm dataset" different from "rm dataset"? If I use orchadmin, does it remove both the data files and the descriptor file?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

That was not the UNIX rm command that Arnd suggested.

It was orchadmin rm controlfile.ds that was suggested, to be executed from a UNIX shell with the appropriate environment variables (such as APT_CONFIG_FILE and APT_ORCHHOME) set to correct values.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
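A minimal sketch of the suggestion above, assuming orchadmin is already on the PATH (it commonly lives under $APT_ORCHHOME/bin) and that APT_CONFIG_FILE points at the configuration file the dataset was written with. The function name delete_dataset is made up for illustration; only "orchadmin rm" comes from the thread.

```shell
#!/bin/sh
# Sketch only: delete a parallel dataset (descriptor and data files) via orchadmin.
# delete_dataset is a hypothetical helper name, not part of DataStage.

delete_dataset() {
    ds="$1"

    # orchadmin reads the parallel configuration file to locate the data files.
    if [ -z "${APT_CONFIG_FILE:-}" ]; then
        echo "APT_CONFIG_FILE is not set" >&2
        return 2
    fi

    # Assumption: orchadmin is already on PATH.
    if ! command -v orchadmin >/dev/null 2>&1; then
        echo "orchadmin not found on PATH" >&2
        return 3
    fi

    # Unlike plain "rm", this is meant to remove the data files as well as the descriptor.
    orchadmin rm "$ds"
}
```

With the dataset from this thread, the call would look like "delete_dataset /etl/store/xxx.ds". The early checks are there so that a missing environment setting shows up as a clear message instead of an orchadmin error.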
just4u_sharath
Premium Member
Posts: 236
Joined: Sun Apr 01, 2007 7:41 am
Location: Michigan

delete datasets

Post by just4u_sharath »

ray.wurlod wrote:That was not the UNIX rm command that Arnd suggested.

It was orchadmin rm controlfile.ds that was suggested, to be executed from a UNIX shell with the appropriate environment variables (such as APT_CONFIG_FILE and APT_ORCHHOME) set to correct values.
Let's suppose the dataset is named xxx.ds and is stored in /etl/store. Where can I find the control file, and what values should I set for APT_CONFIG_FILE and APT_ORCHHOME? I guess I can execute the script from a before/after subroutine of the stage.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The orchadmin command is used to access and manipulate datasets. One of the options is rm which removes the descriptor and component files. Read the documentation to confirm this. Or perhaps you could actually just try it and see if it works.
mdan
Charter Member
Posts: 46
Joined: Mon Apr 28, 2003 4:21 am
Location: Brussels

Post by mdan »

Do not forget to set up the environment variable APT_CONFIG_FILE before running orchadmin.
anandkumarm
Premium Member
Posts: 55
Joined: Tue Feb 24, 2004 8:17 am

Hi

Post by anandkumarm »

mdan wrote: Do not forget to set up the environment variable APT_CONFIG_FILE before running orchadmin.
When you say "set up the environment variable", do you mean that it has to be added at the job level? Can you please be clearer about how to set up the environment variable?
Thanks
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Anand - have you tried it? If you don't have the environment variable set you will get an error message.
just4u_sharath
Premium Member
Posts: 236
Joined: Sun Apr 01, 2007 7:41 am
Location: Michigan

delete datasets

Post by just4u_sharath »

just4u_sharath wrote:
ArndW wrote:From unix "orchadmin rm {dataset}" will do it.
What is this control file? Is it the same as the dataset name? What does setting APT_CONFIG_FILE and APT_ORCHHOME mean, and what values do I have to set them to? Please explain.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

In your case, the control file and xxx.ds are synonymous.
just4u_sharath
Premium Member
Posts: 236
Joined: Sun Apr 01, 2007 7:41 am
Location: Michigan

delete datasets

Post by just4u_sharath »

ArndW wrote:In your case control file and xxx.ds are synonymous
Thank you.
In my case APT_CONFIG_FILE and APT_ORCHHOME already have default pathnames, so I think I just have to add those two parameters to my job and run the UNIX command from a before/after subroutine. Am I right?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The UNIX command inherits the ENV settings of the calling job, so you will already have a setting for the APT_CONFIG_FILE.
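The inheritance described here is ordinary UNIX behaviour and can be checked outside DataStage. The sketch below exports a placeholder value (the path is an assumption, not a real configuration file) and shows that a child shell sees it, which is the same mechanism by which a command launched from a job sees the job's APT_CONFIG_FILE.

```shell
#!/bin/sh
# Sketch: a child process sees variables exported by its parent,
# which is how a shell command launched by a job sees the job's settings.
# The path below is a placeholder, not a real configuration file.
APT_CONFIG_FILE=/path/to/default.apt
export APT_CONFIG_FILE

# The single quotes stop the parent from expanding the variable;
# the child shell expands it from its inherited environment.
child_value=$(sh -c 'printf "%s" "${APT_CONFIG_FILE:-unset}"')
echo "child sees: $child_value"
```

Running a similar echo from the before/after subroutine ahead of "orchadmin rm /etl/store/xxx.ds" is one way to confirm which configuration file the command will actually use.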