hashed file - (project directory or different directory)


Post by mystuff »

Is there any advantage to keeping hashed files in the project directory (i.e. under the account name) over keeping them in a different directory?

Post by ray.wurlod »

Only one. It's easier to delete the project.

Best practice is to keep your hashed files in a separate directory, perhaps a sub-directory in the project directory. But if they're going to grow large, then maybe a separate file system is indicated. Create a script to delete the hashed files.
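
A minimal sketch of such a clean-up script, in Python. It assumes the hashed files live in their own dedicated directory, that a dynamic hashed file appears on disk as a directory, that a static one appears as a single file, and that each has a dictionary stored beside it as `D_<name>`. The function name `delete_hashed_files` and its arguments are made up for illustration. If a VOC pointer was created for a file, it would still need to be removed separately.

```python
import shutil
from pathlib import Path


def delete_hashed_files(hash_dir, names):
    """Delete the named hashed files (and their D_ dictionaries) from hash_dir.

    Returns the list of paths that were actually removed.
    """
    root = Path(hash_dir)
    removed = []
    for name in names:
        # Assumption: the dictionary sits beside the data file as D_<name>.
        for target in (root / name, root / f"D_{name}"):
            if target.is_dir():
                shutil.rmtree(target)  # dynamic hashed file: a directory
            elif target.is_file():
                target.unlink()        # static hashed file or dictionary: one file
            else:
                continue               # nothing on disk under this name
            removed.append(target)
    return removed
```

Passing an explicit list of names, rather than wiping the whole directory, keeps the script from deleting anything it was not told about.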
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by vinbhate »

Hi ArndW

I cannot view your answer in full because it appears to be premium content. As a temporary measure, could you email your answer to vinita.bhate@gmail.com so that I can understand it clearly?

Thanks in advance
Regards and Thanks,
Vinita Bhate

Post by ArndW »

vinbhate, I have now marked the post as non-premium.

Post by saikir »

Hi,

As stated earlier, it is better to store the hashed files in a dedicated directory. That way you can track them easily rather than searching for them in the project home. In our project we have separate folders for all the jobs, where the hashed files are explicitly stored. You can also delete the hashed files on a regular basis, since they are regenerated every time the job runs.

Sai

Post by DatastageSolutions »

Personally, I like to use the project area because then you have all the UniVerse commands available to you. I know you can set a pointer in the VOC to get around this, but if you want to check whether a row exists it's so much easier to just go into Administrator and type in a bit of SQL.

BUT... a lot of people say you should never use the project area, because if you run out of space you'll leave DataStage rolling around on its back kicking its legs in the air. DataStage saves all its internal hashfiles in the project area. It's easy enough for the log files to get huge, and if you're putting all your data in there as well then things can fill up quickly. If you run out of space you'll usually end up corrupting the internal hashfiles and leaving things in a right mess.

So I tend to save small hashfiles for lookups in the project area, but big 64-bit hashfiles always go into an area of their own.
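
One way to guard against the out-of-space scenario described above is to check free space on the project file system before a large load. This is only a rough sketch in Python: the function name `has_enough_free_space` and the 10% default threshold are arbitrary choices for illustration, not anything DataStage provides.

```python
import shutil


def has_enough_free_space(path, min_free_fraction=0.10):
    """Return True if the file system holding `path` has at least
    `min_free_fraction` of its total capacity still free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / usage.total >= min_free_fraction
```

A scheduler or wrapper script could call this with the project directory path and hold back the job when it returns False.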

Hope that helps,
Dave

Post by ArndW »

DataStageSolutions - I tend to take the best of both worlds in this case. I will always place the hashed files (regardless of initial or expected size) in a different directory. But I will then put a remote pointer to the hashed file in the project VOC so that I can use the normal commands. Filling up the project directory is a bad thing that is usually done once per project...

Post by ray.wurlod »

Best practice, of course, is to fill the project directory's file system zero times.

Post by DatastageSolutions »

Arnd - I'm busy running one of your jobs and it stores everything in the project area! :wink: