use of hash files

Posted: Thu Aug 04, 2005 11:46 am
by shivan
Hi,
I'm new to DataStage. I was going through the documentation but couldn't find what the real use of hash files is. The only thing I understand is that they use a hashing algorithm to feed the rows.

shivan

Posted: Thu Aug 04, 2005 12:37 pm
by diamondabhi
Hi Shivan,
Hash files can be used for many things. Search the forum for "hash files" and you will find a lot of information about them.

Thanks,
Abhi.

Posted: Thu Aug 04, 2005 2:12 pm
by kduke
The main function of hash files is for fast lookups.

Posted: Thu Aug 04, 2005 4:32 pm
by ray.wurlod
Very fast.

And it's hashed file, not hash file.

Posted: Thu Aug 04, 2005 4:34 pm
by pnchowdary
They can also be used for eliminating duplicates. :)

Posted: Fri Aug 05, 2005 1:30 am
by ppalka
And to handle multivalue fields :)

Posted: Fri Aug 05, 2005 8:38 am
by kumar_s
pnchowdary wrote: They can also be used for eliminating duplicates. :)
May I know how to eliminate duplicates using this stage? Is it through a lookup against the same file, or through some other method?

regards
kumar

Posted: Fri Aug 05, 2005 8:42 am
by ArndW
Kumar,

The key in a hash file is always unique, so a subsequent WRITE to the same key will overwrite the previous value, thereby removing duplicates.
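
The overwrite behaviour described above can be illustrated outside DataStage with an ordinary Python dictionary, which also keeps exactly one value per key. This is only a minimal sketch of the idea, not how DataStage implements hashed files; the function name `dedupe_by_key` and the sample column names are made up for illustration.

```python
def dedupe_by_key(rows, key):
    """Keep one row per key value; a later row overwrites an earlier one.

    Hypothetical helper: a dict stands in for the hashed file, and each
    assignment plays the role of a WRITE to that key.
    """
    hashed = {}
    for row in rows:
        hashed[row[key]] = row  # same key again: previous row is replaced
    return list(hashed.values())


rows = [
    {"id": 1, "name": "first"},
    {"id": 2, "name": "other"},
    {"id": 1, "name": "second"},  # duplicate key, written last
]
unique_rows = dedupe_by_key(rows, "id")
print(unique_rows)
```

Note that the last row written for a key is the one that survives, so the order in which rows arrive decides which duplicate is kept.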

-Arnd.

Posted: Fri Aug 05, 2005 8:48 am
by kumar_s
ArndW wrote:Kumar,

The key in a hash file is always unique, so a subsequent WRITE to the same key will overwrite the previous value, thereby removing duplicates.

-Arnd.
Thanks, Arnd. I never thought about this.

regards
kumar

Posted: Fri Aug 05, 2005 10:49 am
by kollurianu

ppalka wrote: And to handle multivalue fields

What does the above statement mean, ppalka? Can you explain it to me?


Thanks a bunch,

Posted: Fri Aug 05, 2005 7:15 pm
by ray.wurlod
A multi-valued field (or column) may contain a list, rather than an atomic value. Some databases support the concept, though sometimes not using this terminology. This is particularly so with a database that promises OLAP capability (such as Oracle 9i/10g and MS SQL Server Plato) and databases with a Pick heritage (such as UniVerse, UniData, D3 and so on).
The main "support" for multi-valued data in DataStage is the ability automatically to "explode", or "un-nest", the multiple values to expose the "nested table" in at least first normal form.
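
As a rough picture of what "exploding" or "un-nesting" means, the sketch below turns one record with multi-valued columns into several flat rows. It is an illustration in plain Python, not DataStage's own mechanism. It assumes the values in a column are separated by character 253 (the value mark used by Pick-style databases such as UniVerse); the function name `unnest` and the sample columns are hypothetical.

```python
VALUE_MARK = "\xfd"  # assumption: character 253 separates the values


def unnest(row, multivalued_columns):
    """Return one flat row per value position in the multi-valued columns.

    Single-valued columns are repeated on every output row. If the
    multi-valued columns hold different numbers of values, the shorter
    ones are padded with None.
    """
    split = {col: row[col].split(VALUE_MARK) for col in multivalued_columns}
    depth = max((len(values) for values in split.values()), default=1)
    flat_rows = []
    for i in range(depth):
        flat = dict(row)  # copy the single-valued columns as they are
        for col, values in split.items():
            flat[col] = values[i] if i < len(values) else None
        flat_rows.append(flat)
    return flat_rows


order = {"order": "A1", "item": "pen\xfdink", "qty": "2\xfd5"}
flat_orders = unnest(order, ["item", "qty"])
for flat_row in flat_orders:
    print(flat_row)
```

Each output row now holds only atomic values, which is the first normal form mentioned above.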