
Dynamic Hashed file

Posted: Thu Mar 30, 2006 9:44 am
by avi21st
Needed a small clarification on hash load. :)

We have a lookup to be used while loading data into the target table.
The lookup has to be updated automatically whenever the table is updated, i.e. the lookup should be dynamic.

We have created one table for the lookup (e.g. a SETID lookup) and inserted one row.
We ran the hash load before running the server job for the first time.
Now one more row has been inserted into the lookup table.
If we run the server job without running the hash load, the lookup still has only one row; it does not get updated when the table changes.
Our requirement is that the data in the hash file should be updated without running the hash load.

Can you please suggest how to implement the dynamic hash concept?


Regards
Avishek

Posted: Thu Mar 30, 2006 10:09 am
by ArndW
The reference hashed file needs to be updated from somewhere.
Can you not update your hashed lookup file in your main server job when you insert new rows? This is the most common method of keeping such a lookup file up-to-date.

Posted: Thu Mar 30, 2006 12:32 pm
by emma
The referenced hash file can be updated in the same job with the same transformer.
Don't forget to check "Preload file to memory" -> Enabled for the reference and "Allow stage write cache" for the output.

Posted: Thu Mar 30, 2006 1:02 pm
by gateleys
emma wrote:The referenced hash file can be updated in the same job with the same transformer.
Don't forget to check "Preload file to memory" -> Enabled for the reference and "Allow stage write cache" for the output.
Hi emma,
In that case the job may look something like the one below, with the properties you mentioned set:

Code: Select all

             HashedFile <---|
               |            |
               v            |
SeqFile ---> Xfmr -----> TargetDB
Can you set 'Allow stage write cache' with this design? If not, do you mind sharing your design?

Thanks,
gateleys

Posted: Thu Mar 30, 2006 1:26 pm
by emma
gateleys ,

Sorry I don't know how to insert the design.

In your design, the target must be a hash file, the same hash file as the reference; then you can update the table.

Posted: Thu Mar 30, 2006 1:43 pm
by gateleys
emma wrote:gateleys ,

Sorry I don't know how to insert the design.
Hi emma,
To insert a design or code, you can use the CODE tag (the button just above the editor) and end it with /CODE, both within square brackets.

Thanks for sharing the information about concurrently and dynamically updating a hashed file so that it can serve as a 'current' reference.

gateleys

Posted: Thu Mar 30, 2006 2:12 pm
by ray.wurlod
You MUST use separate Hashed File stages, one for the lookup, one for the update. This is because a passive stage cannot open its output until all its inputs are closed.

Do NOT use read cache or write cache. Do use "lock for update" when performing lookups; this will set a record-level lock that will be cleared when the row is written into the hashed file.
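As a rough sketch (stage names purely illustrative), such a design could look like this, with both Hashed File stages pointing at the same hashed file on disk:

Code: Select all

          HashedFile (lookup, "lock for update")
                 |
                 v
SeqFile ---> Xfmr ---------> TargetDB
                 |
                 v
          HashedFile (update, no write cache)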

And it's "hashed" file, not "hash" file.

Dynamic Hash File: Realtime Process

Posted: Thu Mar 30, 2006 2:16 pm
by avi21st
gateleys wrote:
emma wrote:The referenced hash file can be updated in the same job with the same transformer.
Don't forget to check "Preload file to memory" -> Enabled for the reference and "Allow stage write cache" for the output.
Hi emma,
In that case the job may look something like the one below, with the properties you mentioned set:

Code: Select all

             HashedFile <---|
               |            |
               v            |
SeqFile ---> Xfmr -----> TargetDB
Can you set 'Allow stage write cache' with this design? If not, do you mind sharing your design?

Thanks,
gateleys

Thanks all for your replies.

Actually, what I wanted was not to extract from the database and load the hash file. I wanted some kind of realtime process that would update the hash file whenever a new record is inserted into the lookup table.

My DataStage job would never reload the hash file from the lookup table. Presently the hash file is small, but we also have other jobs in the pipeline with larger files.

Please suggest.

Posted: Thu Mar 30, 2006 2:28 pm
by ray.wurlod
Simply have another link from the Transformer stage to a Hashed File stage. With this you can write whatever rows you like into the hashed file (not "hash" file) subject to a constraint expression on this link.
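For example (the link name is purely illustrative), if the reference lookup link were called LkpSETID, the constraint on the update link could be as simple as

Code: Select all

LkpSETID.NOTFOUND

so that only keys that were not found in the lookup get written back to the hashed file.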

Posted: Thu Mar 30, 2006 3:05 pm
by emma
Actually, there are a lot of situations where I have used this solution to write to and reference the same hashed file, and it works absolutely correctly. Maybe there is some loss of performance if you have more than 2 million records.

Posted: Fri Mar 31, 2006 7:01 am
by avi21st
Thanks all for your inputs.

I have forwarded your suggestions to the team and will update you with their decision.

Previously, our plan was:

We didn't want to reload the hash file on each run.
We wanted to reload the hash file only if the lookup table has new rows in it.
We planned to write a Unix script that would count the number of records in the new SETID lookup and compare it with a previous value (the total record count of the old SETID lookup table) stored in a file, say TotalRecCountBefore.txt, always updating the stored count after each job run; a rough sketch is below.

If the counts are the same, there is no need to run the hash file load sequence (job activity); if they differ, run the hash file load activity.
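A minimal sketch of that script (the count query, project name, and job name are placeholders, not our real ones):

Code: Select all

#!/bin/sh
# Sketch of the count-compare check described above.
# get_setid_count stands in for whatever command returns the current
# row count of the SETID lookup table (e.g. a one-line SQL query).

COUNT_FILE=TotalRecCountBefore.txt

new_count=$(get_setid_count)
old_count=$(cat "$COUNT_FILE" 2>/dev/null)

if [ "$new_count" != "$old_count" ]; then
    # counts differ, so refresh the hash file before the main job runs
    dsjob -run -jobstatus MyProject HashFileLoad
fi

# always store the latest count for the next comparison
echo "$new_count" > "$COUNT_FILE"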

Now I think we should use DataStage for this.

Thanks