-------Rows missing in the Hash File--------

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

-------Rows missing in the Hash File--------

Post by alan_antony »

Hi,

We have developed a job which sorts records on the basis of a field.
Later on, these records are loaded into a hash file.

The problem I am facing is that some records are missing when I search the hash file. I connected a Sequential File stage instead of the Hash File stage and found that all the records are present there...

When I run the job, the performance statistics say that 20793 records are loaded into the Hash File as well as the Sequential File. However, the Hash File actually contains fewer records; compared with the Sequential File, some records are missing from the Hash File. I need to use the Hash File from this job in the next job.

Could anyone help me with this issue?


Thank you.

Alan
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

Hi,
Do you have keys in your hash file?
If so, maybe some records are overridden.

Hope this helps.
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Post by alan_antony »

Hi,

Yes, I have a key in the hash file named "Entity_skey", and there are multiple records for that key...

In that case, how should I proceed, since I need the hash file in the next job with the above-mentioned field as the key?
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

Hi,

You must have a key in a Hash File stage, but maybe you don't need this stage.

You can use the output of the Sort stage directly:

FILE --> SORT --> TRANSFORMER

Hope This Helps.
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Post by alan_antony »

I understand what you have said. But actually, this hash file (the target file) is required as a lookup in the next job, with the above-mentioned field as the key.
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

Hi,

I would say your key is not a key, because it's not unique.
What processing is included in your next job?
How does your lookup work?
Maybe you can add a column which will become the hash file key; for example:
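(An illustrative sketch in plain Python, not DataStage BASIC; "Seq_No" is a made-up second column.)

# Illustrative only: make the hash key unique by combining Entity_skey
# with a second distinguishing column (Seq_No here is hypothetical).
row = {"Entity_skey": 1, "Seq_No": 2, "Amount": 100}
hash_key = "%s|%s" % (row["Entity_skey"], row["Seq_No"])   # -> "1|2"
# Rows that shared Entity_skey now get distinct keys and no longer collide.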

Hope This Helps.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

It's a key, so it must be unique. Repeating rows with the same key will overwrite.
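That overwrite behaviour is easy to picture with a Python dict standing in for the hashed file (an analogy only, not DataStage code):

# One slot per key: writing a repeated key silently replaces the old row.
hashed_file = {}
rows = [("skey1", 20), ("skey2", 30), ("skey1", 5)]
for key, amount in rows:
    hashed_file[key] = amount
print(len(rows))          # 3 rows written (what the link statistics count)...
print(len(hashed_file))   # ...but only 2 rows stored -- the "missing" records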
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Post by alan_antony »

Thank you very much.
I got through the problem.

Actually, I still have a doubt. The job is like this:

(Skey, Amount, EID, AE_Skey) ---> Aggregator (Sum on Amount) ---> (Skey, New_Amount, EID, AE_Skey)
After this I need to pick the records with MAX(New_Amount) corresponding to each key. I put an Aggregator stage, but it was not giving me the correct result...

I put a Sort stage after the Aggregator (summation) and set the sort specification to New_Amount ASC. I did get the max amount in the hash file, which was the target. But what I feel is that, in this case, the lowest amount should have been loaded for the corresponding key when the records are overridden... I'm not sure about the result...
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

Hi

I don't understand your last post.
If you sum your Amount on Skey, you will have only one row per Skey in the aggregator output.

Skey Amount
1 100
2 200
1 100
2 200
3 300

With this data the result will be:
Skey NewAmount
1 200
2 400
3 300

And now the hash file won't override any records.
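A quick Python sketch of that aggregation (illustrative only, using the sample values above):

from collections import defaultdict

# Sum Amount per Skey, mirroring the Aggregator stage.
data = [(1, 100), (2, 200), (1, 100), (2, 200), (3, 300)]
new_amount = defaultdict(int)
for skey, amount in data:
    new_amount[skey] += amount
print(sorted(new_amount.items()))   # [(1, 200), (2, 400), (3, 300)]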
Hope This Helps
Regards
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Post by alan_antony »

The values are like this:

Skey Amount EID AESkey
1 20 35 12
2 30 45 23
1 5 35 12
1 13 12 15


The output from the Aggregator, after summation on Amount with group-by on the other three fields, will be:

Skey Amount EID AESkey
1 25 35 12
2 30 45 23
1 13 12 15

Now I need to pick the MAX(Amount) based on the Skey.

I hope this helps you understand the problem.

Thank You
thurmy34
Premium Member
Posts: 198
Joined: Fri Mar 31, 2006 8:27 am
Location: Paris

Post by thurmy34 »

I think that you can't do that without losing or splitting the Skey.

I think a Transformer with an Aggregator will do what you want, but again, a key is only a key if it's unique.



Agg1 (Sum(Amount) on Skey, EID, AESkey) ---> Transformer (Skey, NewAmount) ---> Agg2 (Max(NewAmount) on Skey) ---> hash file (key?)
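Roughly, in plain Python (an illustrative sketch of the two-aggregator idea, not DataStage code, using the sample rows from the post above):

from collections import defaultdict

# Agg1: sum Amount grouped on (Skey, EID, AESkey).
rows = [(1, 20, 35, 12), (2, 30, 45, 23), (1, 5, 35, 12), (1, 13, 12, 15)]
summed = defaultdict(int)
for skey, amount, eid, aeskey in rows:
    summed[(skey, eid, aeskey)] += amount

# Agg2: keep MAX(NewAmount) per Skey -- the hash file key.
max_per_skey = {}
for (skey, eid, aeskey), new_amount in summed.items():
    if new_amount > max_per_skey.get(skey, float("-inf")):
        max_per_skey[skey] = new_amount
print(max_per_skey)   # {1: 25, 2: 30}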

I don't know if I'm clear.
Hope This Helps
Regards
nirajtakalkhede
Participant
Posts: 1
Joined: Fri Oct 21, 2005 4:47 am

hi

Post by nirajtakalkhede »

Just sort the file based on SKEY, and then sort numeric descending based on Amount. Then choose the first incoming record for each SKEY; that will be the one with the MAX amount.
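(A rough Python equivalent of that sort-and-pick-first logic; illustrative only, not DataStage syntax.)

from itertools import groupby

# Sort by SKEY ascending, then Amount descending, so the max-amount row
# per SKEY comes first; then keep only that first row for each key.
rows = [(1, 25), (2, 30), (1, 13)]                      # (SKEY, Amount)
rows.sort(key=lambda r: (r[0], -r[1]))
first_per_key = [next(grp) for _, grp in groupby(rows, key=lambda r: r[0])]
print(first_per_key)    # [(1, 25), (2, 30)]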
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Re: hi

Post by alan_antony »

Niraj,

Can you please elaborate on this?
alan_antony
Participant
Posts: 9
Joined: Sat Jul 22, 2006 5:40 am

Re: hi

Post by alan_antony »

I found the solution!

Put two Aggregators, one for the sum and the other for the max. Then put a Sort stage in ascending order and load into the hash file. This gives the required output :D
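For what it's worth, the ascending sort most likely works because of the overwrite behaviour discussed above: the last row written for a key wins, and with the amounts sorted ascending the max row is written last. A small Python sketch of that, with a dict again standing in for the hashed file:

# Ascending sort + last-write-wins leaves the maximum amount per key.
rows = [(1, 13), (1, 25), (2, 30)]   # already sorted by amount ASC
hashed_file = {}
for skey, amount in rows:
    hashed_file[skey] = amount       # each write replaces the previous row
print(hashed_file)                   # {1: 25, 2: 30} -- the max survived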