-------Rows missing in the Hash File--------

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

-------Rows missing in the Hash File--------

Post by alan_antony »

Hi,

We have developed a job which sorts records on the basis of a field.
Later on, these records are loaded into a hash file.

The problem I am facing is that some records are missing when I search the hash file. I connected a Sequential File stage instead of the Hash File stage and found that all the records are present there...

When I run the job, the performance statistics say that 20793 records are loaded into the Hash File as well as the Sequential File. However, the Hash File actually contains fewer records; compared with the Sequential File, some records are missing from the Hash File. I need to use the Hash File from this job in the next job.

Could anyone help me with this issue?


Thank you.

Alan
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

Hi,
Do you have keys in your hash file?
If so, maybe some records are overridden.

Hope this helps.
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Post by alan_antony »

Hi,

Yes, I have a key in the hash file named "Entity_skey", and there are multiple records for that key...

In that case, how should I proceed, since I need the hash file in the next job with the above-mentioned field as the key?
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

Hi,

You must have a key in a Hash File stage, but maybe you don't need this stage.

You can use the output of the Sort stage directly:

FILE --> SORT --> TRANSFORMER

Hope This Helps.
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Post by alan_antony »

I understand what you have said. But actually, this hash file (the target file) is required as a lookup in the next job, with the above-mentioned field as the key.
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

Hi,

I would say your key is not a key, because it's not unique.
What processing is included in your next job?
How does your lookup work?
Maybe you can add a column which will become the hash file key; for example:
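(An illustrative sketch in plain Python, not DataStage BASIC; "Seq_No" is a made-up second column.)

# Illustrative only: make the hash key unique by combining Entity_skey
# with a second distinguishing column (Seq_No here is hypothetical).
row = {"Entity_skey": 1, "Seq_No": 2, "Amount": 100}
hash_key = "%s|%s" % (row["Entity_skey"], row["Seq_No"])   # -> "1|2"
# Rows that shared Entity_skey now get distinct keys and no longer collide.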

Hope This Helps.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

It's a key, so it must be unique. Repeating rows with the same key will overwrite.
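That overwrite behaviour is easy to picture with a Python dict standing in for the hashed file (an analogy only, not DataStage code):

# One slot per key: writing a repeated key silently replaces the old row.
hashed_file = {}
rows = [("skey1", 20), ("skey2", 30), ("skey1", 5)]
for key, amount in rows:
    hashed_file[key] = amount
print(len(rows))          # 3 rows written (what the link statistics count)...
print(len(hashed_file))   # ...but only 2 rows stored -- the "missing" records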
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Post by alan_antony »

Thank you very much.
I got through the problem.

Actually, I still have a doubt. The job is like this:

(Skey, Amount, EID, AE_Skey) ---> Aggregator (Sum on Amount) ---> (Skey, New_Amount, EID, AE_Skey)
After this I need to pick the records with MAX(New_Amount) corresponding to each key. I put an Aggregator stage, but it was not giving me the correct result...

I put a Sort stage after the Aggregator (summation) and set the sort specification to New_Amount ASC. I did get the max amount in the hash file, which was the target. But what I feel is that, in this case, the lowest amount should have been loaded for the corresponding key when the records are overridden... I'm not sure about the result...
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

Hi

I don't understand your last post.
If you sum your Amount on Skey, you will have only one row per Skey in the aggregator output.

Skey Amount
1 100
2 200
1 100
2 200
3 300

With this data the result will be:
Skey NewAmount
1 200
2 400
3 300

And now the hash file won't override any records.
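A quick Python sketch of that aggregation (illustrative only, using the sample values above):

from collections import defaultdict

# Sum Amount per Skey, mirroring the Aggregator stage.
data = [(1, 100), (2, 200), (1, 100), (2, 200), (3, 300)]
new_amount = defaultdict(int)
for skey, amount in data:
    new_amount[skey] += amount
print(sorted(new_amount.items()))   # [(1, 200), (2, 400), (3, 300)]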
Hope This Helps
Regards
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Post by alan_antony »

The values are like this:

Skey Amount EID AESkey
1 20 35 12
2 30 45 23
1 5 35 12
1 13 12 15


The output from the Aggregator, after summation on Amount with group-by on the other three fields, will be:

Skey Amount EID AESkey
1 25 35 12
2 30 45 23
1 13 12 15

Now I need to pick the MAX(Amount) based on the Skey.

I hope this helps you understand the problem.

Thank You
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

I think that you can't do that without losing or splitting the Skey.

I think a Transformer with an Aggregator will do what you want, but again, a key is only a key if it's unique.



Agg1 (Sum(Amount) on Skey, EID, AESkey) ---> Transformer (Skey, NewAmount) ---> Agg2 (Max(NewAmount) on Skey) ---> hash file (key?)
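Roughly, in plain Python (an illustrative sketch of the two-aggregator idea, not DataStage code, using the sample rows from the post above):

from collections import defaultdict

# Agg1: sum Amount grouped on (Skey, EID, AESkey).
rows = [(1, 20, 35, 12), (2, 30, 45, 23), (1, 5, 35, 12), (1, 13, 12, 15)]
summed = defaultdict(int)
for skey, amount, eid, aeskey in rows:
    summed[(skey, eid, aeskey)] += amount

# Agg2: keep MAX(NewAmount) per Skey -- the hash file key.
max_per_skey = {}
for (skey, eid, aeskey), new_amount in summed.items():
    if new_amount > max_per_skey.get(skey, float("-inf")):
        max_per_skey[skey] = new_amount
print(max_per_skey)   # {1: 25, 2: 30}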

I don't know if I'm clear.
Hope This Helps
Regards
nirajtakalkhede
Participant
Posts: 1
Joined: Fri Oct 21, 2005 4:47 am

hi

Post by nirajtakalkhede »

Just sort the file based on SKEY, and then sort numeric descending based on Amount. Then choose the first incoming record for each SKEY; that will be the one with the MAX amount.
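(A rough Python equivalent of that sort-and-pick-first logic; illustrative only, not DataStage syntax.)

from itertools import groupby

# Sort by SKEY ascending, then Amount descending, so the max-amount row
# per SKEY comes first; then keep only that first row for each key.
rows = [(1, 25), (2, 30), (1, 13)]                      # (SKEY, Amount)
rows.sort(key=lambda r: (r[0], -r[1]))
first_per_key = [next(grp) for _, grp in groupby(rows, key=lambda r: r[0])]
print(first_per_key)    # [(1, 25), (2, 30)]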
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Re: hi

Post by alan_antony »

Niraj,

Can you please elaborate on this?
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Re: hi

Post by alan_antony »

I found the solution!

Put two Aggregators, one for the sum and the other for the max. Then put a Sort stage in ascending order and load into the hash file. This gives the required output :D
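For what it's worth, the ascending sort most likely works because of the overwrite behaviour discussed above: the last row written for a key wins, and with the amounts sorted ascending the max row is written last. A small Python sketch of that, with a dict again standing in for the hashed file:

# Ascending sort + last-write-wins leaves the maximum amount per key.
rows = [(1, 13), (1, 25), (2, 30)]   # already sorted by amount ASC
hashed_file = {}
for skey, amount in rows:
    hashed_file[skey] = amount       # each write replaces the previous row
print(hashed_file)                   # {1: 25, 2: 30} -- the max survived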