Dynamic Hashed file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

avi21st
Charter Member
Posts: 135
Joined: Thu May 26, 2005 10:21 am
Location: USA

Dynamic Hashed file

Post by avi21st »

Needed a small clarification on hash loads. :)

We have a lookup to be used while loading data to the target table.
The lookup has to be updated automatically whenever the table is updated,
i.e. the lookup should be dynamic.

We have created one table for the lookup (e.g. a SETID lookup) and inserted one row.
We ran the hash load before running the server job for the first time.
Now one more row has been inserted into the lookup table.
If we run the server job without running the hash load, the lookup still has only one row;
it is not getting updated when the table changes.
Our requirement is that the data in the hash file be updated without running the hash load.

Can you please suggest how to implement the dynamic hash file concept?


Regards
Avishek
Last edited by avi21st on Thu Mar 30, 2006 2:39 pm, edited 1 time in total.
Avishek Mukherjee
Data Integration Architect
Chicago, IL, USA.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The reference hashed file needs to be updated from somewhere.
Can you not update your hashed lookup file in your main server job when you insert new rows? This is the most common method of keeping such a lookup file up-to-date.
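
For instance, the shape might be something like the sketch below, where the Transformer has a second output link writing new rows back to the lookup (stage names are illustrative):

Code: Select all

             HashedFile (reference)
                   |
                   v
    SeqFile ---> Xfmr -----> TargetDB
                   |
                   v
             HashedFile (update)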
emma
Premium Member
Posts: 95
Joined: Fri Aug 08, 2003 10:30 am
Location: Montreal

Post by emma »

The referenced hash file can be updated in the same job with the same transformer.
Don't forget to check "Preload file to memory" -> Enabled for the reference and "Allow stage write cache" for the output.
Thanks,
Emma
gateleys
Premium Member
Posts: 992
Joined: Mon Aug 08, 2005 5:08 pm
Location: USA

Post by gateleys »

emma wrote:The referenced hash file can be updated in the same job with the same transformer.
Don't forget to check "Preload file to memory" -> Enabled for the reference and "Allow stage write cache" for the output.
Hi emma,
In that case the job may look something like the one below, with the properties that you mentioned set:

Code: Select all

             HashedFile <---|
               |            |
               v            |
SeqFile ---> Xfmr -----> TargetDB
Can you set 'Allow stage write cache' with this design? If not, do you mind sharing your design?

Thanks,
gateleys
emma
Premium Member
Posts: 95
Joined: Fri Aug 08, 2003 10:30 am
Location: Montreal

Post by emma »

gateleys ,

Sorry I don't know how to insert the design.

In your design, the target must be a hash file (the same file as the reference), and then you can update the table.
Thanks,
Emma
gateleys
Premium Member
Posts: 992
Joined: Mon Aug 08, 2005 5:08 pm
Location: USA

Post by gateleys »

emma wrote:gateleys ,

Sorry I don't know how to insert the design.
Hi emma,
To insert a design or code, you can use the CODE tag, just above the editor, and end it with /CODE, both within square brackets.

Thanks for sharing the information about concurrently updating a hashed file while using it for 'current' reference.

gateleys
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You MUST use separate Hashed File stages, one for the lookup, one for the update. This is because a passive stage cannot open its output until all its inputs are closed.

Do NOT use read cache or write cache. Do use "lock for update" when performing lookups; this will set a record level lock that will be cleared when the row is written into the hashed file.
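
In job design terms, that means two Hashed File stage icons on the canvas, both pointing at the same physical file, along these lines (the file and stage names are illustrative):

Code: Select all

     HashedFile #1 (lookup on SETID_LKP, "lock for update" on)
           |
           v
SeqFile ---> Xfmr -----> TargetDB
           |
           v
     HashedFile #2 (update to SETID_LKP, write cache off)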

And it's "hashed" file, not "hash" file.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
avi21st
Charter Member
Posts: 135
Joined: Thu May 26, 2005 10:21 am
Location: USA

Dynamic Hashed File: Realtime Process

Post by avi21st »

gateleys wrote:
emma wrote:The referenced hash file can be updated in the same job with the same transformer.
Don't forget to check "Preload file to memory" -> Enabled for the reference and "Allow stage write cache" for the output.
Hi emma,
In that case the job may look something like the one below, with the properties that you mentioned set:

Code: Select all

             HashedFile <---|
               |            |
               v            |
SeqFile ---> Xfmr -----> TargetDB
Can you set 'Allow stage write cache' with this design? If not, do you mind sharing your design?

Thanks,
gateleys

Thanks all for your replies.

Actually, what I wanted was not to extract from the database and load the hashed file. I wanted some kind of realtime process which would update the hashed file whenever a new record is inserted into the lookup table.

My DataStage job would never reload the hashed file from the lookup table. Presently the hashed file is small, but we also have other jobs in the pipeline with larger files.

Please suggest.
Avishek Mukherjee
Data Integration Architect
Chicago, IL, USA.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Simply have another link from the Transformer stage to a Hashed File stage. With this you can write whatever rows you like into the hashed file (not "hash" file) subject to a constraint expression on this link.
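
For example, if the reference link were named LkpSetID, a constraint like the one below on that extra link would write only the rows whose key was not already in the hashed file (the link name is illustrative):

Code: Select all

LkpSetID.NOTFOUND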
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
emma
Premium Member
Posts: 95
Joined: Fri Aug 08, 2003 10:30 am
Location: Montreal

Post by emma »

Actually, there are a lot of situations where I have used this solution of writing to and referencing the same hashed file, and it works absolutely correctly. There may be some loss of performance if you have more than 2 million records.
Thanks,
Emma
avi21st
Charter Member
Posts: 135
Joined: Thu May 26, 2005 10:21 am
Location: USA

Post by avi21st »

Thanks all for your inputs.

I have forwarded your suggestions to the team and will update you with their decision.

Previously, our plan was:

We didn't want to reload the hashed file on each run.
We wanted to reload the hashed file only if the lookup table had new rows in it.
We planned to write a Unix script which would count the number of records in the new SETID lookup and compare it with a previous value (the total record count of the old SETID lookup table) stored in a file, say TotalRecCountBefore.txt, always updating the stored count after each job run.

If the counts are the same, there is no need to run the hashed file load sequence (job activity); if they differ, run the hashed file load activity.

Now I think we should use DataStage for this.
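
For what it's worth, if the check does move into DataStage, the count comparison could be done in a small server routine called before the load sequence. This is a rough, untested sketch in DS BASIC; the routine name, argument, and file path are made up:

Code: Select all

* Hypothetical routine CheckLookupCount(CurrentCount).
* Returns "RUN" if the stored count differs from CurrentCount
* (so the hashed file load should run), otherwise "SKIP".
      CountFile = "TotalRecCountBefore.txt"   ;* illustrative path
      OldCount = ""
      OpenSeq CountFile To FileVar Then
         ReadSeq OldCount From FileVar Else OldCount = ""
         CloseSeq FileVar
      End Else
         OldCount = ""                        ;* first run, no stored count
      End
      If CurrentCount = OldCount Then
         Ans = "SKIP"                         ;* no new rows, skip the load
      End Else
         Ans = "RUN"                          ;* table changed, run the load
      End
* Persist the current count for the next run.
      OpenSeq CountFile To FileVar Then
         WeofSeq FileVar                      ;* truncate the old contents
      End Else
         Create FileVar Else Ans = "RUN"      ;* create the file if missing
      End
      WriteSeq CurrentCount To FileVar Else Null
      CloseSeq FileVar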

Thanks
Avishek Mukherjee
Data Integration Architect
Chicago, IL, USA.