Hash file maximum number of columns

p.thier1
Participant
Posts: 10
Joined: Wed Jun 09, 2004 7:26 am

Hash file maximum number of columns

Post by p.thier1 »

Does anybody know whether there is a maximum number of columns in a hash file? We are working with hash files that have more than 150 columns, and performance is poor. Some jobs even sit in a Running status while their links stay in a Starting status, and the jobs do nothing.

So here is my second question: is there something in the DataStage product that enables a sort of timeout, or can it be programmed?

Thank you for your answers...
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: Hash file maximum number of columns

Post by ogmios »

Somewhere there will be a limit on the number of columns but I don't think you've reached it yet.

I would consider 150 columns a bit too much to put in a hashfile.

Ogmios
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa

Re: Hash file maximum number of columns

Post by denzilsyb »

p.thier1 wrote:does anybody know if there is a maximum number
of columns in a hash file.
If it's anything like a Transformer, then I would stay away from 1000 columns. I killed DS when I tried this with an occurs depending field from a CFF.
p.thier1 wrote:We are working with Hash files
which have more than 150 columns, and the result is that
we have poor performances
Do you _have_ to put all the columns in the hash file? These files work best with few columns for quick reference, i.e. a key and a data column.

If you are doing delta detection, push that to the DB you need to perform it on; otherwise, limit the number of columns you are using.

You might also try using static hash files for performance gains, but I would seriously consider moving away from 150 columns.
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
p.thier1
Participant
Posts: 10
Joined: Wed Jun 09, 2004 7:26 am

Post by p.thier1 »

Thank you for your answer, that's what I thought. I advise developers to create very small hash files (one key and the one field you want to use in a lookup), but they prefer to build one very large hash file that is used throughout the whole process (more than 100 times).

Do you have any ideas about my second question?

Is it possible to define a timeout for a job?

Thanks a lot for your help...
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa

Post by denzilsyb »

p.thier1 wrote:they prefer to build a very large hash file which will be used along the whole process
It is tempting to do that, but in the greater scheme of things it is not good practice. Rather have them plan the job design a little more.
p.thier1 wrote: Do you have any ideas about my second question?
Is it possible to define a timeout for a job?
As far as I know, only for inactivity while a developer has the job open in Designer, not while it is running. You might want to look at the cache settings on the hash files; this might explain why there is the 'wait' when a job is 'finished'.
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

p.thier1 wrote:Is it possible to define a timeout for a job?
Depends on what you mean by that. Expand on it a little for us...
-craig

"You can never have too many knives" -- Logan Nine Fingers
p.thier1
Participant
Posts: 10
Joined: Wed Jun 09, 2004 7:26 am

Post by p.thier1 »

OK, I will try to explain.

When we run a job (via job control), it sometimes starts processing and then stays in a Running status while the number of processed records no longer increases (maybe a lock on the database, or a problem with hash or sequential files...). A job can even be in a Running status with its links in a Starting status while the processing never actually starts.

The idea is to define, for example, a maximum duration for each job in the process and, once that time has passed, to stop the process.

Imagine a job that usually takes 1 minute to do its processing: I could set a maximum duration of 3 minutes (for example). The day the problem occurs, I could automatically stop the process and send an e-mail, for example...

I hope you will understand me...
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sure, that's what I thought you meant but wanted to be sure. :wink:

We do this very thing via Job Control wrappers, the utilities that execute each job. One of the 'parameters' associated with each job is a threshold much like you mention. The job control code constantly monitors the execution time versus the threshold setting and alarms out once it is exceeded.

Not something automagic, as far as I know, but it can be done.
-craig

"You can never have too many knives" -- Logan Nine Fingers
p.thier1
Participant
Posts: 10
Joined: Wed Jun 09, 2004 7:26 am

Post by p.thier1 »

OK

Can you explain your answer in more detail, or show me an example of the code you use in your job control?

Thanks a lot
p.thier1
Participant
Posts: 10
Joined: Wed Jun 09, 2004 7:26 am

Post by p.thier1 »

I have just phoned the French Ascential hotline, and they told me that there is no limit on the number of columns for a UniVerse table, but that the limit is fixed (by Ascential) at 8192 columns in a hash file...

I hope that is of interest to you...
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

You can hit limits in how long you are connected to a database. In Oracle I have gotten "snapshot too old" errors. The job took too long to process the data.
Mamu Kim
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

p.thier1 wrote:can you give me more explanation about your answer or show me an example of code you use in your job control...
Don't really have an example handy, but perhaps a quick overview would help.

One thing I'd suggest is for you to build an example Sequencer job and have it run several jobs at the same time using the Job Activity stage. Run any kind of a trigger from each job to a Sequencer stage and set the mode to "All". Doesn't even need to run, all I want you to do is - after you compile it - go to the Job Properties and look at the Job Control tab. There you will see the generated code for the Sequencer. It will look a little... messy... but you will gather some key learnings from it.

Basically, your job control code is hand coded BASIC that runs and monitors your jobs, much like the Sequencer job does. Do this by building memory structures, typically linked lists of dynamic arrays, that hold the 'handles' and status information (among other things) of all the jobs it is currently running. You can then build routines to continuously loop through these arrays and check to see what the jobs are doing, until all of the monitored jobs are in some sort of completed state, good or bad.
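
To make that loop concrete, here is a rough sketch of its shape. It is written in Python purely for illustration, not in DataStage BASIC, and it is not the generated Sequencer code. The `JobRecord` fields, the status names, and the `get_status` callable are all assumptions standing in for whatever your job control uses to hold a job handle and ask for its status.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical status names; real job control code would use its own constants.
COMPLETED = {"finished", "aborted", "stopped"}


@dataclass
class JobRecord:
    name: str           # job name
    handle: object      # placeholder for the handle obtained when attaching to the job
    start_time: float   # launch time in seconds, saved for later runtime checks
    status: str = "running"


def monitor(jobs: List[JobRecord],
            get_status: Callable[[JobRecord], str],
            poll_seconds: float = 5.0,
            sleep: Callable[[float], None] = time.sleep) -> Dict[str, str]:
    """Poll every job until all of them reach some completed state, good or bad."""
    final: Dict[str, str] = {}
    pending = list(jobs)
    while pending:
        still_running = []
        for job in pending:
            job.status = get_status(job)        # assumed status lookup
            if job.status in COMPLETED:
                final[job.name] = job.status    # e.g. log completion here
            else:
                still_running.append(job)
        pending = still_running
        if pending:
            sleep(poll_seconds)                 # wait before the next pass
    return final
```

Passing `get_status` and `sleep` in as arguments is only a convenience for trying the loop out with fake jobs; in real job control code these would be direct calls.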

Check their status to see if they are finished. If they are finished, perhaps do something like log their completion time and statistics. If a job is still running, compare the current time to the saved start time of the job and get a delta. Compare that delta to a configuration threshold parameter and send out an alarm if the threshold runtime has been exceeded. You could also have a second threshold for a 'kill time', something that if exceeded must mean that something is terribly wrong and the job should be stopped. Send out another alarm and issue a stop against the job.
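
The two-threshold check described above might look something like the following. Again this is only a sketch in Python rather than DataStage BASIC; `check_runtime`, `enforce`, and the `send_alarm` and `stop_job` callables are hypothetical names, and the threshold values are just the kind of per-job configuration parameter mentioned earlier.

```python
def check_runtime(start_time, now, warn_after, kill_after):
    """Classify a still-running job as 'ok', 'warn', or 'kill' by elapsed seconds."""
    elapsed = now - start_time              # the delta between now and the saved start
    if elapsed >= kill_after:
        return "kill"
    if elapsed >= warn_after:
        return "warn"
    return "ok"


def enforce(job_name, verdict, send_alarm, stop_job):
    """Act on the verdict; send_alarm and stop_job are assumed callables."""
    if verdict == "warn":
        send_alarm(f"{job_name} exceeded its runtime threshold")
    elif verdict == "kill":
        send_alarm(f"{job_name} exceeded its kill time and will be stopped")
        stop_job(job_name)


# Example: a job that usually takes about 1 minute, with an alarm at
# 3 minutes (180 s) and a kill time of 10 minutes (600 s).
print(check_runtime(0, 60, 180, 600))    # ok
print(check_runtime(0, 200, 180, 600))   # warn
print(check_runtime(0, 700, 180, 600))   # kill
```

Keeping the comparison separate from the action makes it easy to change what an alarm does (mail, log entry, pager) without touching the timing logic.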

There are all kinds of possibilities here and this can be as simple (so to speak) or as complicated / feature rich as you feel the need to provide. Most of these activities end up being standard things in your toolkit over time and this is just another opportunity to put those pieces together in an interesting way.

Don't let it intimidate you - once you foray out into the world of hand-rolled code you'll be surprised what you can do without the tether of a GUI. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers