Duplicate Row Issue

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Reverse your sort order. Last one in wins, period - so if you want the 'first' one you need to make sure it arrives last.
-craig

"You can never have too many knives" -- Logan Nine Fingers
rajkraj
Premium Member
Premium Member
Posts: 98
Joined: Wed Jun 15, 2005 1:41 pm

Post by rajkraj »

Thanks for your quick reply, but sorting is not possible because I have only 2 columns in hash file (member_key,sese_id) and member_key is key column.
chulett wrote:Reverse your sort order. Last one in wins, period - so if you want the 'first' one you need to make sure it arrives last.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Why isnt reverse sorting possible? Any who,
Another approach is , that you can write to a hashed file and reference the same hashed file. If the key is found dont write it again. Hence your constraint will be reflink.NOTFOUND.
Make sure your hashed file is locked for updates.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You don't - can't - have multiples of the key value in a hashed file. They way they work relies absolutely upon having unique keys.

Therefore, you will need a different approach - perhaps a secondary key lookup via a UV stage from a hashed file with an artificial key. If you do that, you can readily constrain for the first matching record (you will need to sort the result set, of course, using an ORDER BY clause - you can use FIRST 1 as the additional constraint).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

rajkraj wrote:Thanks for your quick reply, but sorting is not possible because I have only 2 columns in hash file (member_key,sese_id) and member_key is key column.
Not sure what having two columns in the hashed file has to do with the ability to sort. You stated that when you build a hashed file and send more than one row for a particular key, the last one in is what is retained. That is correct due to the 'destructive overwrite' nature of hashed files.

You then stated that you wanted the first record for any particular key to be the one 'saved'. You do that a couple of different ways. One way is to ensure that the first record per key arrives last, hence the 'reverse your sort order' bit of advice. There's no reason that can't happen.

Or you can arrange for the first record per key to be the only record per key written. This requires the use of stage variables and the ability to detect changing 'groups' - in your case, only allow the first occurance of member_key to be written to the hashed file.

Either approach will solve your problem as I've understood it.
-craig

"You can never have too many knives" -- Logan Nine Fingers
rajkraj
Premium Member
Premium Member
Posts: 98
Joined: Wed Jun 15, 2005 1:41 pm

Post by rajkraj »

Thanks for your reply...but how would we do 'reverse sort' on a single key.

Ex:Member_key Id_key
1234 23
1234 24
1234 25

When we store this data into a Hashed file it will keep the last record but I want the first record.

I used order by while extracting the data from table but it's not changing the order

Please advise me
chulett wrote:
rajkraj wrote:Thanks for your quick reply, but sorting is not possible because I have only 2 columns in hash file (member_key,sese_id) and member_key is key column.
Not sure what having two columns in the hashed file has to do with the ability to sort. You stated that when you build a hashed file and send more than one row for a particular key, the last one in is what is retained. That is correct due to the 'destructive overwrite' nature of hashed files.

You then stated that you wanted the first record for any particular key to be the one 'saved'. You do that a couple of different ways. One way is to ensure that the first record per key arrives last, hence the 'reverse your sort order' bit of advice. There's no reason that can't happen.

Or you can arrange for the first record per key to be the only record per key written. This requires the use of stage variables and the ability to detect changing 'groups' - in your case, only allow the first occurance of member_key to be written to the hashed file.

Either approach will solve your problem as I've understood it.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So, from your eample:

Code: Select all

Member_key Id_key 
1234        23 
1234        24 
1234        25 
You would like the Id_key of 23 to be stored for Member_key 1234? You've said you've done an order by, now you just need to make the secondary sort descending. Something like:

Code: Select all

ORDER BY Member_key, Id_key desc
That should make them arrive like this:

Code: Select all

Member_key Id_key 
1234        25
1234        24 
1234        23 
And solve your problem. Yes?
-craig

"You can never have too many knives" -- Logan Nine Fingers
rajkraj
Premium Member
Premium Member
Posts: 98
Joined: Wed Jun 15, 2005 1:41 pm

Post by rajkraj »

Thanks for quick reply ....but my issue is still ther ...

Actually my data looks like this....

Member_key Id_key
1234 75
1234 abc
1234 %Agh
1234 =AVX

here order by is not useful...right....

chulett wrote:So, from your eample:

Code: Select all

Member_key Id_key 
1234        23 
1234        24 
1234        25 
You would like the Id_key of 23 to be stored for Member_key 1234? You've said you've done an order by, now you just need to make the secondary sort descending. Something like:

Code: Select all

ORDER BY Member_key, Id_key desc
That should make them arrive like this:

Code: Select all

Member_key Id_key 
1234        25
1234        24 
1234        23 
And solve your problem. Yes?
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Did you try giving method a shot. Its easy and painless with no sorting involved. It will retain the first record of a key in the order it came in.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Would have been nice to have sample data that matched reality a little better. :?

You can still sort, yes. Just to beat this horsey a little longer...

Assign a record number to each row in a transformer, @INROWNUM will work nicely. Then use a Sort stage to sort ascending on your key field and descending on the record number. For example:

Code: Select all

Member_key Id_key Record
1234        75      1
1234        abc     2
1234        %Agh    3
1234        =AVX    4
What will come out of the Sort stage will be:

Code: Select all

Member_key Id_key Record
1234        =AVX    4
1234        %Agh    3
1234        abc     2
1234        75      1
It *can* be done.
-craig

"You can never have too many knives" -- Logan Nine Fingers
rajkraj
Premium Member
Premium Member
Posts: 98
Joined: Wed Jun 15, 2005 1:41 pm

Post by rajkraj »

"THANKS A LOT" It would work.


quote="chulett"]Would have been nice to have sample data that matched reality a little better. :?


You can still sort, yes. Just to beat this horsey a little longer...

Assign a record number to each row in a transformer, @INROWNUM will work nicely. Then use a Sort stage to sort ascending on your key field and descending on the record number. For example:

Code: Select all

Member_key Id_key Record
1234        75      1
1234        abc     2
1234        %Agh    3
1234        =AVX    4
What will come out of the Sort stage will be:

Code: Select all

Member_key Id_key Record
1234        =AVX    4
1234        %Agh    3
1234        abc     2
1234        75      1
It *can* be done.[/quote]
Post Reply