Dataset Mgmt.

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

gopskrish
Participant
Posts: 37
Joined: Thu Mar 31, 2005 7:42 am

Dataset Mgmt.

Post by gopskrish »

Hello Everybody,

I want to rename and delete dataset files that are 7 days old.
Can I use Unix shell scripts to do that, or is there a better way of handling it? I heard from a friend that the latest version of DataStage deletes these files automatically on a periodic basis. Is that true? Let me know.
I am very much interested in learning an ETL tool, and I find DataStage the best one for that.
kcshankar
Charter Member
Posts: 91
Joined: Mon Jan 10, 2005 2:06 am

Post by kcshankar »

Hi,
You can do that in DataStage Manager.

The DataStage documentation states that you cannot use the Unix cp or rm commands to copy or delete a dataset, because DataStage represents a single dataset with multiple files. Using rm simply removes the descriptor file, leaving the much larger data files behind.
gopskrish
Participant
Posts: 37
Joined: Thu Mar 31, 2005 7:42 am

Post by gopskrish »

Hi,

Thanks a lot. I would like to know the equivalent commands in DataStage if I don't want to use Unix commands, and how to specify the age threshold (number of days, 7 in my case) for deletion or renaming.

If you have any scripts for this, please share them so that I can see whether I can implement the same at my end.

cheers,
gops
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

kcshankar wrote: Hi,
You can do that in DataStage Manager.

The DataStage documentation states that you cannot use the Unix cp or rm commands to copy or delete a dataset, because DataStage represents a single dataset with multiple files. Using rm simply removes the descriptor file, leaving the much larger data files behind.

What you would do from UNIX is "orchadmin rm {yourfile}"; this removes the dataset's data files as well as the dataset descriptor.
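A minimal sketch of the orchadmin approach, assuming the parallel engine environment is already loaded so that orchadmin resolves on the search path. The function name remove_dataset and the example path are hypothetical; only the "orchadmin rm" call comes from the post above.

```shell
#!/bin/sh
# Remove a parallel dataset (descriptor plus its data files) through orchadmin
# instead of plain rm. Assumes orchadmin is on PATH and APT_CONFIG_FILE is set.
remove_dataset() {
    if [ ! -f "$1" ]; then
        echo "no such descriptor file: $1" >&2
        return 1
    fi
    # orchadmin reads the descriptor and deletes every data file it references.
    orchadmin rm "$1"
}

# Example call (hypothetical path):
# remove_dataset /data/staging/customers.ds
```

Checking that the descriptor exists first keeps a mistyped name from reaching orchadmin at all.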
elavenil
Premium Member
Posts: 467
Joined: Thu Jan 31, 2002 10:20 pm
Location: Singapore

Post by elavenil »

Hi Arnd,

Is "orchadmin rm <the descriptor file>" equivalent to deleting the dataset from Dataset Management?

How would you set up orchadmin on Unix?

Please share with us.

Regards
Saravanan
kcshankar
Charter Member
Posts: 91
Joined: Mon Jan 10, 2005 2:06 am

Post by kcshankar »

Hi,
Sorry gops, I have not done this in shell scripts.

I have a few doubts myself.

1. Is the block size of a file the same in DataStage Manager and in Unix?
For the same dataset file,
Unix shows 16 blocks
DataStage Manager shows 1 block.

2. What happens when we delete the .ds file?
First I deleted the .ds file in a particular directory, then I checked the size of the directory and found that the file had been deleted completely.
But the documentation says that using rm simply removes the descriptor file, leaving the much larger data files behind.

Thanks in advance
kcs
gopskrish
Participant
Posts: 37
Joined: Thu Mar 31, 2005 7:42 am

Post by gopskrish »

Hi Arnd,

Thanks a lot. I assume that I need administrative privileges to execute that command. Correct me if I am wrong.

Best Regards,
Gops
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

elavenil wrote: Hi Arnd,

Is "orchadmin rm <the descriptor file>" equivalent to deleting the dataset from Dataset Management?

How would you set up orchadmin on Unix?

Please share with us.

Regards
Saravanan

No magic to that. If you set up your user's environment with the correct settings for PX, the orchadmin command will be in your search path. Just call orchadmin and see what happens.
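The environment setup mentioned above might look roughly like the following sketch. It assumes the usual parallel-engine variables APT_ORCHHOME and APT_CONFIG_FILE and an optional dsenv file to source; the function name setup_px_env and every path in the example call are hypothetical and vary by installation.

```shell
#!/bin/sh
# setup_px_env ORCHHOME CONFIG_FILE [DSENV]
# Prepare the current shell so that orchadmin can be found and run.
# Returns 0 only if orchadmin resolves on PATH afterwards.
setup_px_env() {
    APT_ORCHHOME="$1"       # parallel engine home (assumed layout: bin/ and lib/)
    APT_CONFIG_FILE="$2"    # parallel configuration file describing the nodes
    export APT_ORCHHOME APT_CONFIG_FILE

    # Source the engine environment file if one was given and is readable.
    if [ -n "${3:-}" ] && [ -r "$3" ]; then
        . "$3"
    fi

    PATH="$APT_ORCHHOME/bin:$PATH"
    LD_LIBRARY_PATH="$APT_ORCHHOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH

    command -v orchadmin >/dev/null 2>&1
}

# Example call (hypothetical paths):
# setup_px_env /opt/ds/PXEngine /opt/ds/Configurations/default.apt /opt/ds/DSEngine/dsenv
```

If the function returns non-zero, orchadmin is still not on the search path and the supplied engine home is probably wrong for that machine.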
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

A persistent DataSet comprises the control file and one or more data files. You can cat the control file to determine the location of the data files. Then you can process this list using find (with -ls option) to determine which ones are old enough. Then use appropriate commands to move or remove all parts of the DataSet.

If all you do is remove the control file (the one with a ".ds" suffix) then all the data files will hang around forever and you won't know to which DataSet each belongs.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
gopskrish
Participant
Posts: 37
Joined: Thu Mar 31, 2005 7:42 am

Post by gopskrish »

Hi Ray,


Can you please help me find the location of the data files using cat? I am very much a beginner in Unix. Please share your thoughts.

Best Regards,
Gops
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

gopskrish,

If you do a cat <mydataset.ds> or more <mydataset.ds> at the UNIX prompt you will see, among quite a bit of other text, one or more UNIX paths and filenames. You can then change to that directory to delete the data files.

Your PX installation will also contain default locations for these dataset files. In the DataSet management tool you can select your dataset and then click on each segment file to see where on the UNIX system the file for that particular node has been placed.
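For anyone who finds the raw cat output hard to read, the rough sketch below pulls out anything that looks like an absolute path from a descriptor file. It is a heuristic that uses only standard Unix tools and assumes nothing about the descriptor layout beyond what is described above (readable paths mixed in with other bytes); the function name list_dataset_paths is hypothetical.

```shell
#!/bin/sh
# list_dataset_paths DESCRIPTOR
# Print slash-led tokens found in a dataset descriptor, one per line.
list_dataset_paths() {
    # Turn every non-printable byte into a newline, then keep the first
    # token on each line that starts with a slash.
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" |
        sed -n 's|^[^/]*\(/[^[:space:]]*\).*$|\1|p' |
        sort -u
}

# Example call (hypothetical file name):
# list_dataset_paths mydataset.ds
```

The output should be checked by eye before anything is deleted, since not every slash-led token is necessarily a data file. Removing the dataset through orchadmin avoids that risk.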
gopskrish
Participant
Posts: 37
Joined: Thu Mar 31, 2005 7:42 am

Post by gopskrish »

Hi ArndW,

Thanks a lot for the immediate response. I will proceed accordingly. Your timely help is very much appreciated.

cheers,
gopskrish
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Not much I can add to Arnd's response. The default location is a directory called Datasets.
gopskrish
Participant
Posts: 37
Joined: Thu Mar 31, 2005 7:42 am

Post by gopskrish »

Hi,

I am able to delete old files with a Unix command based on the number of days passed in, but when I replace it with the orchadmin delete option it does not delete the files. Here is the command I am using for testing. Correct me if it is wrong.

It executes fine if the command is

find /detld1/etl/ascential/ascential/DataStage/Projects/CTI_London/IAM/staging/ -name "*.ds" -mtime +5 -print -exec compress {} \;

But when I use orchadmin in place of compress, it does not do the job:

find /detld1/etl/ascential/ascential/DataStage/Projects/CTI_London/IAM/staging/ -name "*.ds" -mtime +5 -print -exec orchadmin delete {} \;

Looking forward to an early reply.

cheers,
gopskrish
gopskrish
Participant
Posts: 37
Joined: Thu Mar 31, 2005 7:42 am

Post by gopskrish »

Hi ,

I am very sorry for my previous post. The same command executes fine; I had a small typo when I ran it last time.
Sorry once again to all.


cheers,
gopskrish
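The working find command from this thread can be wrapped so that the directory and the age in days are parameters, with a dry run as the default. This is only a sketch: the function name purge_old_datasets and the example directory are hypothetical, and it assumes orchadmin is already on the search path.

```shell
#!/bin/sh
# purge_old_datasets DIR DAYS [run]
# List (default) or delete (third argument "run") dataset descriptors under
# DIR whose modification time is more than DAYS days in the past.
purge_old_datasets() {
    dir="$1"
    days="$2"
    mode="${3:-dry}"

    # Refuse anything that is not a whole number of days.
    case "$days" in
        ''|*[!0-9]*) echo "days must be a whole number" >&2; return 2 ;;
    esac

    find "$dir" -type f -name '*.ds' -mtime +"$days" -print |
    while IFS= read -r ds; do
        if [ "$mode" = run ]; then
            # Same orchadmin subcommand as in the find command above.
            orchadmin delete "$ds" || echo "failed to delete: $ds" >&2
        else
            echo "would remove: $ds"
        fi
    done
}

# Example calls (hypothetical directory):
# purge_old_datasets /data/staging 7        # dry run, only lists candidates
# purge_old_datasets /data/staging 7 run    # actually deletes them
```

Note that find counts age in whole 24-hour periods, so -mtime +7 only matches files at least 8 days old; use +6 if files that are exactly 7 days old should be included.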