
retain duplicates out of hash file

Posted: Wed Sep 02, 2009 12:02 pm
by DSbox61
Hi everyone, I have a requirement where I need to retain duplicates coming out of a hashed file.

My primary link is a DB2 stage going into a Transformer. I have a reference link coming out of a hashed file into the Transformer stage. The output link out of the Transformer stage goes to a sequential file.

Now, in the hashed file, I have ColumnA and ColumnB. I have selected ColumnA as the primary key. I can map this primary key in the Transformer stage to a column from the DB2 stage, but by doing this I will be losing duplicates from the hashed file, and I want to retain them. And if I select ColumnB as a primary key too, then I don't have anything to map it to from the DB2 stage.

The values are:
123,abc
123,def

Please let me know how I can retain these duplicates and pass them on to the output link. Thanks!

Posted: Wed Sep 02, 2009 1:02 pm
by chulett
Sounds like what you want is a "multi row result set", which requires a UV or ODBC stage to implement. Search here for all of the gory details on how to set that up.

Posted: Wed Sep 02, 2009 2:07 pm
by DSbox61
No, Craig. I just want a way to pass all the duplicates on to my output file as well.

Posted: Wed Sep 02, 2009 2:46 pm
by chulett
That's just it - there are no duplicates in a hashed file unless you fake it by following the technique the suggested search would turn up. Let me make sure we're both on the same page: if you want one input row to hit a reference lookup and return more than one lookup result (your 'duplicates'), that's a "multi row result set".

If that's not what you meant, you'll need to take another shot at explaining what exactly it is you need.

Posted: Thu Sep 03, 2009 4:13 am
by PhilHibbs
DSbox61 wrote: No, Craig. I just want a way to pass all the duplicates on to my output file as well.
Once the data goes into a hashed file, the duplicates are gone. The second row with the same key will overwrite the data from the first row. You're trying to turn sausages back into pigs.
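With the sample values, keyed on ColumnA alone, the write is destructive:

    write 123,abc  ->  record '123' holds 'abc'
    write 123,def  ->  record '123' now holds 'def' - the first row is gone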

Posted: Thu Sep 03, 2009 5:30 am
by chulett
Exactly, which is why one would need to play the game of writing the data out with multiple keys (so as to retain the duplicates) and then reading it back with fewer keys for the "multi-row" part. More complicated than that of course, as a search would reveal, but that's the gist of it.
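In rough outline, with hypothetical names: key the hashed file on both columns when you write it, so 123/abc and 123/def land as separate records, then read it back through a UV stage whose user-defined SQL is keyed on the first column only, with the stage set to return a multi-row result. Something along these lines in the UV stage:

    SELECT ColumnA, ColumnB FROM REFHASH WHERE ColumnA = ?

Each driving 123 then pulls back both abc and def.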

Posted: Thu Sep 03, 2009 7:32 am
by JRodriguez
Are the values in ColumnB known and a small list, like months, week days, quarters...? If yes, then I will be happy to share a trick to de-normalize data that I fetch once in a while from my "Bag Of Tricks".

Let us know

Posted: Thu Sep 03, 2009 7:41 am
by Sainath.Srinivasan
Denormalization needs the creation of metadata.

Better to store the hashed file source in another table and do a join.
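For instance (table names hypothetical), once the hashed file contents are landed in a table, an inner join brings back every match:

    SELECT D.COLUMNA, R.COLUMNB
    FROM   DRIVING_TABLE D
    INNER JOIN REF_COPY R
    ON     R.COLUMNA = D.COLUMNA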

Posted: Thu Sep 03, 2009 7:47 am
by chulett
:idea: Rather than continue to go round and round, why don't we wait for DSbox61 to come back and clear things up before we throw out any more 'solutions'?

Posted: Thu Sep 03, 2009 8:47 am
by DSbox61
Thanks everyone for the solutions. Craig, you are right - I'm looking for a multi row result set. I had pretty much hit a dead end, as I can't use the UV or ODBC stage: that requires creating DSNs and a lot more stuff that I would have to go through the DBAs and Admins for. Instead I started re-designing the whole approach, trying to do it with session tables and queries and then joining the session tables to get the data out. Hope it works. :roll:
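Roughly what I'm attempting, with placeholder names - declare a DB2 session table, load the hashed file source into it, then join it to the driving query along the lines Sainath described:

    DECLARE GLOBAL TEMPORARY TABLE SESSION.REF_ROWS
        (COLUMNA VARCHAR(10), COLUMNB VARCHAR(10))
        ON COMMIT PRESERVE ROWS NOT LOGGED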

Thanks again, everyone.

Posted: Thu Sep 03, 2009 9:14 am
by chulett
The UV stage doesn't require a DSN, just a VOC record and an index, neither of which requires admin privs, and both can be automated in a job. But if you want to take a different route, I appreciate that and good luck with it.

Posted: Thu Sep 03, 2009 11:18 am
by DSbox61
Craig, for information's sake, I would like to know how I can set up a UniVerse connection. I read the admin manual and it says that I need to make entries in the uvodbc.config file. Can you please tell me where I can find this file, and is this what I need to be able to use a UV stage?

Thanks!

Posted: Thu Sep 03, 2009 12:40 pm
by chulett
You don't need to set up a 'UniVerse connection' unless (I suppose) you are connecting to an actual UniVerse database. Here, the technique involves creating a regular hashed file and then accessing it with a UV stage - and all the UV stage needs for that is a VOC record. Now, you've already got one if the hashed file is created in an 'account', but you will need to add one for 'pathed' hashed files. A search for SETFILE will get you the syntax.
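For illustration (path and names are placeholders), it runs along these lines:

    SETFILE /project/path/REFHASH REFHASH OVERWRITING

That writes a VOC pointer called REFHASH at the pathed hashed file; add the index with CREATE.INDEX REFHASH ColumnA followed by BUILD.INDEX REFHASH ColumnA and the UV stage can use it.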

Posted: Thu Sep 03, 2009 3:07 pm
by ray.wurlod
The UniVerse connection already exists, and is called localuv.
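For reference, uvodbc.config sits in each project directory (and under the engine home), and the stock entry for the local repository generally looks like this:

    [ODBC DATA SOURCES]
    <localuv>
    DBMSTYPE = UNIVERSE
    network = TCP/IP
    service = uvserver
    host = 127.0.0.1

Nothing needs adding to use localuv from a UV stage.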

Posted: Thu Sep 03, 2009 4:05 pm
by chulett
I guess I should have said 'an actual UniVerse database other than the local repository'. :wink: