Hashed File Problem

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Hashed File Problem

Post by danddmrs »

On two occasions hashed files have been 'corrupted' without any change to the program. Instead of having 3 files within the hashed file folder, a separate file is being created for each row of the hashed file. The name of each of these files is a series of numbers with a few value marks inserted, 0u16125u7210u45u3 for example.

The same data is extracted on a different server, where the DATA.30 file is 275,908 KB and the OVER.30 file is 64,026 KB, so I don't believe it hit the 2 GB limit.
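
A quick look at the folder from the operating system side shows the difference. Here is a rough Python sketch of that kind of check (the directory path is only a made-up example; a plain directory listing would tell you the same thing):

Code:

# Rough OS-side check of a path-based hashed file directory.
# The directory path below is a made-up example.
import os

HASHED_FILE_DIR = "/data/hash/MyHashedFile"   # hypothetical location

entries = os.listdir(HASHED_FILE_DIR)
normal_files = {"DATA.30", "OVER.30"}

print("Number of entries in the folder:", len(entries))
print("DATA.30 / OVER.30 present:", normal_files.issubset(entries))

# A healthy dynamic hashed file holds only a few entries. Thousands of
# oddly named files such as 0u16125u7210u45u3 mean each row has been
# written out as its own file - the symptom described above.
if not normal_files.issubset(entries):
    print("This folder does not look like a dynamic hashed file")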

Previously when this happened I renamed the hashed file and moved on, but that was in the test environment. I don't have the option to rename and move on in the Model and Production environments due to change control procedures.

Has anyone seen anything like this and is there a way to fix/prevent it?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yes, I think anyone who's done heavy Server work has seen this before. I have, and I know there are posts here about it as well. I forget the internal file type number, but it's the one where each record of the file is stored as a separate file. It's crazy slow and mind-blowing the first time you see it.

I'm not aware of any known explicit cause, but anything that causes the directory to lose the hidden .Type30 file that it uses to determine that it is a dynamic hashed file can result in this behavior - say, for example, file system writes that fail for whatever reason. Are you clearing the hashed file each run? Dropping and recreating it? Or just writing to it? If you are doing one of the first two, anecdotal evidence suggests 'clearing' is safer than deleting and recreating, especially if multiple jobs leverage the hashed file.
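
If it helps, a crude guard run before the load could at least catch a missing marker early. This is only a Python sketch, not anything DataStage provides out of the box, and the path is again a made-up example:

Code:

# Guard: flag the problem if the hidden .Type30 marker has gone missing,
# rather than letting the job write one OS file per row.
import os
import sys

HASHED_FILE_DIR = "/data/hash/MyHashedFile"   # hypothetical example path

marker = os.path.join(HASHED_FILE_DIR, ".Type30")
if os.path.isdir(HASHED_FILE_DIR) and not os.path.exists(marker):
    sys.exit("Hashed file directory exists but the .Type30 marker is gone - treat it as corrupted before loading.")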

You are going to have to do as you noted - rename / remove the corrupted hashed file and let the job automatically rebuild it. That, or restore it from backup. I'm not aware of any other remedy, at least.
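
If change control allows scripting it, the rename-aside step itself is trivial. A rough Python sketch, with a made-up path, keeping the old copy around for inspection:

Code:

# Move the corrupted hashed file directory aside so the next job run can
# recreate it from scratch. The path is a made-up example.
import os
import time

HASHED_FILE_DIR = "/data/hash/MyHashedFile"   # hypothetical location

backup = HASHED_FILE_DIR + ".corrupt." + time.strftime("%Y%m%d%H%M%S")
os.rename(HASHED_FILE_DIR, backup)            # keeps the old copy for inspection
print("Renamed", HASHED_FILE_DIR, "to", backup)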
-craig

"You can never have too many knives" -- Logan Nine Fingers
danddmrs
Premium Member
Posts: 86
Joined: Fri Apr 20, 2007 12:55 pm

Post by danddmrs »

In the properties for the Hashed File stage, 'Create File' and 'Clear file before writing' are selected.
From the Options button in the Create file options, the 'Delete file before create' option is selected.

I started the job with a limit of 10 rows, thinking I would let the Delete option get rid of the corrupt data and then delete the hashed file (it's on a Directory Path), to see if that would work, but the job seems to be stuck on deleting the file. It has been running for an hour without processing any rows. I'll let it go for a bit to see what happens, then if necessary kill it, release the locks, and rename/delete/rebuild. Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Using both clear and delete doesn't make sense; use one or the other, and I still prefer the former. Hashed files will always be created / recreated if they don't exist, regardless of those settings.
-craig

"You can never have too many knives" -- Logan Nine Fingers