Weird hash file behavior

tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

I have one job that does lookups against several hash files. On one of the hash file lookups, the transformer appears to read the first source record (the row count on the input link stays at 1) and then generates about 100,000 lookup and output records. Only then does the source row count move up from 1, and the rest of the rows are processed, seemingly OK. There are only 8,500 input rows, but about 110,000 output rows from this transformer stage.

When I look at the output records, the first 100,000 or so have the same key values, but other fields within the record are different. I have deleted this hash file and recreated it, but it still does this. Just to make it interesting, the lookup hash file does have about 100,000 rows in it, but only one row matches the key values of the apparent duplicates at the beginning of the output.

I apologize if this is confusing. Please ask questions on points that don't make sense and I'll be glad to clarify.

As always, I appreciate your help.

Tony Stark
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You are right, it is a little confusing. [;)]

Anything special about the value in the first row that you use to do this lookup? Is it just this data set or is the behaviour the same (first row = 'full table scan'ish result) regardless of the data being pushed through?

-craig
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Hi Craig!
I appreciate your reply, especially on the weekend.

Nothing special about the first row that I can see.

I copied the job and went to work on the copy. I deleted the transformer stage and re-added everything by hand, taking care not to copy/paste anything from the original job. Of course, now it works. [8D]

I used the original job as the model to hand-key things back in and don't see anything different between the two jobs. Oh, and it's using the same hash file.

I don't understand, but at least I have an idea what to do to fix the original job.

Weirdness on a Saturday :) Oh, and only a month to go to implementation. Sigh.

Thanks again for your reply,
Tony
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Don't ya love it when that's what it takes to fix something - recreate it with no changes. [}:)]

Hey, at least you've got a month. I have a *week*, hence the whole 'Working on the Weekend' thing.

Enjoy!

-craig
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Would really need to see both jobs to make any accurate diagnosis.
Why not do separate exports of each, and compare the DSX files using diff? This might reveal the awful truth, whatever it is.
The behaviour is redolent of UniVerse behaviour when trying to process a non-existent Select List; all records are processed. But this is (theoretically) not possible with a Hashed File stage, because this uses the hashing algorithm directly, rather than any other method.
Was the hashed file's primary key properly defined in the metadata?
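
For anyone who wants to try the DSX comparison suggested above without a command-line diff handy, here is a minimal sketch in Python. It assumes both jobs have already been exported to plain-text DSX files; the file names, the function names, and the choice of two context lines are all hypothetical, not anything DataStage provides. Expect some noise in the result, since two separately exported jobs will legitimately differ in things like the job name.

```python
import difflib
import sys


def diff_dsx_text(original_text, rebuilt_text,
                  original_name="original.dsx", rebuilt_name="rebuilt.dsx"):
    """Return a unified diff (list of lines) between two DSX exports.

    The file names are placeholders used only for the diff header.
    """
    return list(difflib.unified_diff(
        original_text.splitlines(),
        rebuilt_text.splitlines(),
        fromfile=original_name,
        tofile=rebuilt_name,
        lineterm="",
        n=2,  # two lines of context around each change (arbitrary choice)
    ))


def changed_lines(diff):
    """Keep only added/removed lines, skipping the two header lines."""
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


def read_export(path):
    # The export encoding is not known here, so undecodable bytes are
    # replaced rather than raising an error.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read()


if __name__ == "__main__":
    # Usage (hypothetical paths): python diff_dsx.py original.dsx rebuilt.dsx
    first, second = sys.argv[1], sys.argv[2]
    for line in diff_dsx_text(read_export(first), read_export(second),
                              first, second):
        print(line)
```

Anything that shows up in the transformer or hashed file stage definitions (key flags, link names, column lists) would be the place to look first.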

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Good idea about diff'ing the DSX files, Ray. I'll try to do that on Monday when I get back and let you folks know what I find. I think the keys were defined correctly. Again, I'll check on Monday and let you know.

Craig, I _should_ be working every day until we get done, but I just don't have enough brain cells to keep going 7 days a week. So, I take Sundays off. Or I have until now. We'll see what the next month brings. I may be working without brain cells some days [:D]

Tony