Page 1 of 1

Load from Multiple Hash

Posted: Mon Nov 29, 2004 9:13 am
by coolkhan08
Hi,
I wanted to know how we can load from multiple hash files, having a single column into another hash file having that column and a coresponding type column which would have a hard coded value for the column coming from each hash file. For ex. if we have 5 source hash files then the target hash file would have all the values of all the sources with corresponding values 1 to 5 in the column next to the value in the target.
Thanks
Sam

Posted: Mon Nov 29, 2004 9:15 am
by kcbland
Can you please post an example, as I cannot follow what you are trying to do.

Posted: Mon Nov 29, 2004 9:20 am
by coolkhan08
I have 5 hash files with a single column. Iam trying to merge all these files into a single file and associate a type value in the target hash file. The value coming from hash file 1 can have a type value as 1 and value coming from second hash file can have a value of 2 in addition to the hash file column.
Hope this makes it clear.

Posted: Mon Nov 29, 2004 9:59 am
by kcbland
Maybe I'm just dense this morning, but I'm still not following. If you have 5 hash files with a single column, then that column MUST be the primary key.

If you're "trying to merge all these files into a single file", then you need to in effect concatenate and then unique the results.

Now you talk about associate values across hash files. This leads me to believe that you are mistating the situation. If you have a primary key in each hash file, plus an attribute column, then we're talking about a different situation. You either have a driver file that references the other hash files; or you have a full outer join situation where the attributes would categorize into separate columns in the final result set.

Can you elaborate?

Posted: Mon Nov 29, 2004 12:29 pm
by coolkhan08
Now Iam dense... :)
For Ex:
HashA HashB HashC
Col1 Col1 Col1
text1 text3 Text1
text2 text1 Text4

The output I need is
HashOutput
Col1 Col2
text1 A
text2 A
text3 B
text1 B
text1 C
text4 C
How can this be acheived other than going for 3 different jobs.
Thanks.
Sam

Posted: Mon Nov 29, 2004 12:42 pm
by kcbland

Code: Select all

HASHa --> XFMa --> 
HASHb --> XFMb --> LINKCOLLECTOR --> AGG --> SEQ
HASHc --> XFMc --> 
Use your target file metadata in the XFM? stages. Map each hash file into that metadata, putting @NULL as the derivations for the columns that don't come from that hash file.

In the AGGregator stage, group by the primary key column. For ALL attribute columns choose MAX(...) function. When the source row has @NULL, it won't apply towards the max. This will pivot the potentially 3 rows into one. Output from the AGGregator to the stage of your choice, I show a SEQ file.