Hash file maximum number of columns
Does anybody know if there is a maximum number
of columns in a hash file? We are working with hash files
that have more than 150 columns, and the result is
poor performance. Some jobs are even in a running
status while their links stay in a starting status, and the jobs do nothing.
So here is my second question: is there something in the DataStage product that enables a sort of timeout, or can it be programmed?
Thank you for your answers...
Re: Hash file maximum number of columns
Somewhere there will be a limit on the number of columns but I don't think you've reached it yet.
I would consider 150 columns a bit too much to put in a hashfile.
Ogmios
Re: Hash file maximum number of columns
p.thier1 wrote: does anybody know if there is a maximum number of columns in a hash file.
If it's anything like a transformer then I would stay away from 1000 columns. I killed DS when I tried this with an occurs depending field from a CFF.
p.thier1 wrote: We are working with Hash files which have more than 150 columns, and the result is that we have poor performances
Do you _have_ to put all the columns in the HASH file? These files are best utilised by using few columns for quick reference - i.e. a key and a data column.
If you are doing delta detection, push that to the DB you need to perform it on, otherwise limit the number of columns you are using.
You might also try using STATIC HASH files for performance gains, but I would seriously consider moving away from 150 columns.
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
Thank you for your answer, that's what I thought.
I advise developers to create very small hash files (one key, one field
you want to use in a lookup), but they prefer to build a very large
hash file which will be used throughout the whole process (more
than 100 times).
Do you have any ideas about my second question?
Is it possible to define a timeout for a job?
Thanks a lot for your help...
p.thier1 wrote: they prefer to build a very large hash file which will be used throughout the whole process
It is tempting to do that, but in the greater scheme of things it is not good practice. Rather have them plan the job design a little more.
p.thier1 wrote: Do you have any ideas about my second question? Is it possible to define a timeout for a job?
As far as I know, only for inactivity while a developer has the job open in Designer, not while it is running. You might want to look at the cache settings on the HASH files; this might explain why there is the 'wait' when a job is 'finished'.
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
OK, I will try to explain.
When we run a job (via job control), it sometimes happens that the job begins
processing and then stays in a running status while the
number of processed records doesn't increase (maybe a lock on the
database, a problem with hash or sequential files...). It can even be in a running status with its links in a starting status, while in fact
the processing never starts...
The idea is to define, for example, a maximum duration for each
job in the process and, once that time is exceeded, stop the process.
Imagine a job which usually takes 1 minute to do its processing; I can set
a maximum duration of 3 minutes (for example). The day it happens, I can automatically stop the process and send a mail, for example...
I hope you understand what I mean...
Sure, that's what I thought you meant but wanted to be sure.
We do this very thing via Job Control wrappers, the utilities that execute each job. One of the 'parameters' associated with each job is a threshold much like you mention. The job control code constantly monitors the execution time versus the threshold setting and alarms out once it is exceeded.
Not something automagic, as far as I know, but it can be done.
-craig
"You can never have too many knives" -- Logan Nine Fingers
p.thier1 wrote: can you give me more explanation about your answer or show me an example of code you use in your job control...
Don't really have an example handy, but perhaps a quick overview would help.
One thing I'd suggest is for you to build an example Sequencer job and have it run several jobs at the same time using the Job Activity stage. Run any kind of a trigger from each job to a Sequencer stage and set the mode to "All". Doesn't even need to run, all I want you to do is - after you compile it - go to the Job Properties and look at the Job Control tab. There you will see the generated code for the Sequencer. It will look a little... messy... but you will gather some key learnings from it.
Basically, your job control code is hand coded BASIC that runs and monitors your jobs, much like the Sequencer job does. Do this by building memory structures, typically linked lists of dynamic arrays, that hold the 'handles' and status information (among other things) of all the jobs it is currently running. You can then build routines to continuously loop through these arrays and check to see what the jobs are doing, until all of the monitored jobs are in some sort of completed state, good or bad.
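To make the shape of that loop concrete, here is a minimal, language-neutral sketch in Python rather than DataStage BASIC. Everything in it is an assumption for illustration: the `TrackedJob` record, the `monitor` function, the status strings and the `get_status` callable are hypothetical stand-ins for whatever handles and status calls your own job control code uses.

```python
import time
from dataclasses import dataclass

# Hypothetical status names; real job control code would use the
# status values reported by the product instead.
FINISHED_STATES = {"ok", "warnings", "failed", "stopped"}


@dataclass
class TrackedJob:
    name: str            # job name, used as the lookup key
    handle: object       # whatever the "attach" step returned
    started_at: float    # start time in seconds, saved for later checks
    status: str = "running"


def monitor(jobs, get_status, poll_seconds=5.0, sleep=time.sleep):
    """Poll every tracked job until all are in a finished state.

    jobs        -- dict of name -> TrackedJob
    get_status  -- callable(handle) -> status string (assumed helper)
    Returns the jobs in the order they were seen to finish.
    """
    pending = dict(jobs)
    finished = []
    while pending:
        for name in list(pending):
            job = pending[name]
            job.status = get_status(job.handle)
            if job.status in FINISHED_STATES:
                # Good or bad, the job is done: stop watching it.
                finished.append(pending.pop(name))
        if pending:
            sleep(poll_seconds)
    return finished
```

The `sleep` parameter is only there so the loop can be exercised without real waiting; in hand-coded job control the same structure would be a dynamic array of handles and a loop around the status call.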
Check their status to see if they are finished. If they are finished, perhaps do something like log their completion time and statistics. If a job is still running, compare the current time to the saved start time of the job and get a delta. Compare that delta to a configuration threshold parameter and send out an alarm if the threshold runtime has been exceeded. You could also have a second threshold for a 'kill time', something that if exceeded must mean that something is terribly wrong and the job should be stopped. Send out another alarm and issue a stop against the job.
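The two-threshold check described above can be sketched as a small function. This is again Python used as neutral pseudocode, not DataStage BASIC, and the name `check_runtime`, its parameters and the returned strings are all hypothetical. In real job control code the inputs would come from the saved start time and your configuration parameters, and the "kill" result would map onto whatever stop call you use (DataStage BASIC has DSStopJob for this, as far as I know; check the documentation for exact usage).

```python
def check_runtime(started_at, now, warn_after, kill_after=None):
    """Classify a running job by how long it has been running.

    started_at  -- saved start time of the job, in seconds
    now         -- current time, in seconds
    warn_after  -- alarm threshold in seconds
    kill_after  -- optional second threshold in seconds; past this the
                   job is assumed to be hung and should be stopped
    Returns "ok", "warn" or "kill".
    """
    elapsed = now - started_at
    # Check the harsher threshold first so it wins when both are exceeded.
    if kill_after is not None and elapsed >= kill_after:
        return "kill"
    if elapsed >= warn_after:
        return "warn"
    return "ok"
```

For the job from earlier in the thread that usually takes 1 minute with a 3 minute limit, `check_runtime(start, now, warn_after=180)` stays "ok" during a normal run and turns to "warn" once 180 seconds have passed; the caller decides whether that means sending a mail, stopping the job, or both.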
There are all kinds of possibilities here and this can be as simple (so to speak) or as complicated / feature rich as you feel the need to provide. Most of these activities end up being standard things in your toolkit over time and this is just another opportunity to put those pieces together in an interesting way.
Don't let it intimidate you - once you foray out into the world of hand-rolled code you'll be surprised what you can do without the tether of a GUI.
-craig
"You can never have too many knives" -- Logan Nine Fingers