hashed file

bunny · Post by **bunny** » Tue May 03, 2005 1:55 pm

Hi all,

i could not understabd my target ER model. my source is flat file (tab delimited) , there are 47 fields. from them 21(nsar 62a..... nsar 66 ) fields are manually populated into pri_asset type table as records), 18 fields are manually populated into investment_risk table as records.

there are 5 tables in my target datamodel. 2 of them are manually populated.

nsar table (master table) is loading directly from flatfile, which is again we are using it as look up table when we laod remainning 2 tables pri_ast_typ_nsar_map , investment_rsk_nsar_map.

question 1: when we load table pri_ast_typ_nsar_map from source , we have 2 lookups. nsar table, pri_assett_type table

there is matching key column( FUND NAME) between source and nsar table
but there are no matching key columns between my source and pri_asset_type ( In pri_asset _type table we have key column pri_asset_type _id but that key filed exists as differnt fileds in source ) , whether we have to pivot stage to convert columns to rows. but we are using server jobs .

thanks

kduke · Post by **kduke** » Tue May 03, 2005 2:37 pm

What is your question?

Sunshine2323 · Post by **Sunshine2323** » Tue May 03, 2005 11:27 pm

Hi,

Can you please explain what does this mean,

In pri_asset _type table we have key column pri_asset_type _id but that key filed exists as differnt fileds in source

Does this mean that the names of the key fields are different in the source and the lookup tables? If so there is no issue at all

Also about pivoting, I don't understand how it relates to your problem but PIVOTING CAN BE DONE IN SERVER JOBS.....using the Pivot stage

kduke · Post by **kduke** » Wed May 04, 2005 9:49 am

Most of us use hash files to do all the lookups. Most use surrogate keys in all target tables. You will need to to build cross reference hash file lookups meaning the hash file has the old source key as the hash file key and it stores the surrogate key as a regular (not key) field.

Does that make sense?

Usually the lookups are on the fact table only. There are always exceptions especially when you have snowflaked dimensions. There are also lookups when you combine multiple source tables into one target table. You can do this in a join on the source SQL or load the source tables into hash files and look them up from the main SQL extract. If you can join in source SQL then it usually will perform better than completely loading a source table into a hash file. All this depends on how big each of these tables are. Hash file lookups are very fast especially compared to outer joins. The way to learn what works best is to try it as many different ways as possible.

Sainath.Srinivasan · Post by **Sainath.Srinivasan** » Wed May 04, 2005 10:20 am

In your case, as you are populating all rows from same source, you can run them through transforms and load the result into an hash file or a db table with insert+update rows.

harithay · Post by **harithay** » Wed May 04, 2005 11:40 am

Hi
thanks to Kim, Sainath;

i am adding one more doubt to the my previous question,

i am loading 2 tables manually

pri_asset_type (21 records),

investment_risk (18 records),

above 2 tables are loaded from source table conveting fileds in to rows ,

is it possible to use pri_asset-type , investment_risk tables as lookups when i load other 2 tables
pri_Ast_typ_nsar_map, inv_rsk_nsar-map from same source Nsar_out

because there is no matching column bewteen Nsar_out Table and Pri_asset_typ Table ,

key filed (21 records) in pri_asset_typ exist in source as 21 fileds

in similar way key field (18 records) in investment _risk exista as 18 fields .

kduke · Post by **kduke** » Wed May 04, 2005 12:49 pm

To do any kind of lookup requires something to be in common between the 2. All hash files require one or more fields to be the key fields. The same is true if you lookup using an ODBC stage or a plugin.