Stop the job from running when there is no i/p file

Sravani
Participant
Posts: 23
Joined: Thu Jun 15, 2006 3:56 am
Location: Hyderabad

Post by Sravani »

Hi,
I have a job sequence that runs hundreds of jobs in parallel. All of these jobs run even when the input file is 0 KB, which wastes time. We want to prevent those jobs from running when there is no input file. Can anybody help me find a way to do this?

Thanks.
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

So call the job only when there is at least one record in the file.
Sravani
Participant
Posts: 23
Joined: Thu Jun 15, 2006 3:56 am
Location: Hyderabad

Post by Sravani »

Is there any effective way to check that?
Sravani
ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

Sravani

One solution may be to write a Unix script (Unix being your operating environment, as you mentioned) that checks whether the file exists and, if it does, whether it is empty.

Then add an Execute Command activity that calls that script before each of those Job Activities in the Job Sequence.
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

In the sequence you can use a Nested Condition.
Insert an Execute Command activity and check the file size or the line count of the file there.
If the line count is greater than 0, call the job; otherwise exit.
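
The line-count check could look something like the sketch below. The function name has_records is made up for illustration, and how its exit status is tested in the sequence is left to the trigger expression.

```sh
# Hypothetical helper: succeed only if the file holds at least one line.
has_records() {
    file=$1
    # A missing or unreadable file counts as "no records".
    [ -r "$file" ] || return 1
    # Reading from stdin makes wc print only the count; tr strips any padding.
    lines=$(wc -l < "$file" | tr -d ' ')
    [ "$lines" -gt 0 ]
}
```

Note that wc -l counts newline characters, so a file holding a single record with no trailing newline would be reported as having 0 lines. If that can happen, a size check is the safer test.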
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Since it's in a job sequence, making use of the test command in an Execute Command activity is a convenient way to check for existence and/or size of a file.
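
As a rough illustration of the test command on Unix: -f checks that a regular file exists, and -s checks that it exists and is larger than zero bytes. The two function names are hypothetical wrappers, only there to make the exit statuses easy to see.

```sh
# Exit status 0 if the file exists and is larger than zero bytes.
file_has_data() {
    test -s "$1"
}

# Exit status 0 if the path is a regular file, even when it is empty.
file_exists() {
    test -f "$1"
}
```

In an Execute Command activity the bare command (for example test -s followed by the file path) should be enough; the exit status is 0 when the check passes and non-zero otherwise.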

On Windows it's marginally more indirect - you can use the DIR command in your Execute Command activity and parse the result using an expression based on the command's output.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sravani
Participant
Posts: 23
Joined: Thu Jun 15, 2006 3:56 am
Location: Hyderabad

Post by Sravani »

Hi Ram,
Thanks for the reply.

It is actually a very good idea.

But in our case the sequence is huge: it calls hundreds of jobs, so hundreds of Execute Command activities would have to be added, one before each job. That would be a big change, so I am looking for a smaller change that solves my problem.

Thanks.
OddJob
Participant
Posts: 163
Joined: Tue Feb 28, 2006 5:00 am
Location: Sheffield, UK

Post by OddJob »

You could place a Nested Condition stage before the jobs in the sequence. One link should go from the Nested Condition to each job, i.e. 100 links, or as many as required.

In the trigger condition for each link of the Nested Condition, use a Custom Condition that calls a common routine to check a file's existence, returning false if the file doesn't exist and true if it does.

The Nested Condition should then only run the jobs for which files exist.

I think they should all run in parallel as well.
ajay.vaidyanathan
Participant
Posts: 53
Joined: Fri Apr 18, 2008 8:13 am
Location: United States

Post by ajay.vaidyanathan »

is dis regarding the availability of source file or is it regarding the presence of data in the source file....................if u want ur jobs 2 run only wen ur source file is present den it is possible..................r if it is like u wanna run ur job only wen a populated source file is present den v hav 2 think abt it.................let me know ur requirement................
Regards
Ajay
Sravani
Participant
Posts: 23
Joined: Thu Jun 15, 2006 3:56 am
Location: Hyderabad

Post by Sravani »

Ajay,
Actually, I am writing the source file data to a temporary file. Now I want to check whether the temp file exists. Only if it exists would I like to run the corresponding job; otherwise I don't want to call that job at all.

Please let me know if there is an optimal way to do this.

Thanks.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Grammar Check

Ajay,

The second person personal pronoun in English is spelled "you", not "u". U is a common Burmese name, and is the name of one of the posters on DSXchange.

The appropriate present tense form of the verb "to be" is spelled "are", not "r".

The second person personal possessive pronoun is spelled "your", not "ur". Ur is/was a city in ancient Babylon.

Not "wanna", which is lazy, but "want to".

The adverb you intended to use is spelled "then", whereas "den" is a place where lions sleep.

The imperative form of the verb "to think" is "have to think", not "hav 2 think".

The correct spelling of the adverb is "about", not "abt". Abt is a form of rack railway used for steep inclines. Your keyboard DOES have a full set of vowels.

Here at DSXchange we encourage and expect professional standards of written English, not least because many of our participants do not have English as a first language, and have to work hard even to comprehend properly written English. We don't want to make their task any harder by taking a ballistic approach to abbreviation.

Nor is DSXchange a mobile telephony device.

Please use good English here - the standard of English you would use in your job documentation or in your curriculum vitae.


These are my requirements, which you invited.
ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

Sravani
Can you elaborate on how you designed your Job Sequence?

My assumptions:

1. If all your Job Activities run in parallel (different flows), you need to add an Execute Command Activity and a Sequencer Activity to the Job Sequence as follows,

         [Exec Comm Act]---->[Seqn Act]---->[Job Act](Link/Job Activity)   


Description:
You can set either mode, ANY or ALL, in the Sequencer Activity, since it has only one input link. You then need as many output links from the Sequencer Activity as there are Job Activities in the Job Sequence.

2. If all the jobs are designed to run serially (one after another), there is no need for a Sequencer Activity at all.
Sravani
Participant
Posts: 23
Joined: Thu Jun 15, 2006 3:56 am
Location: Hyderabad

Post by Sravani »

My sequence is like this:

The first job takes one input file and produces up to about 100 output files, depending on some criteria.
Then there is one specific job for each file. That is, I have created 100 jobs for the 100 output files of the first job, and I run these 100 jobs in parallel.
But the first job does not always produce 100 files for a given input file; sometimes there are only 10 output files, and the count varies from one input file to the next. We have a single sequence that runs all 100 jobs in parallel, as they are independent. So we want to skip those jobs (out of the 100) that have no input file. In other words, I want to run only those 10 jobs, even though the sequence contains 100 jobs, because only those 10 files will be created by the first job.
So my question is: how can I implement this logic?
ag_ram
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

Maybe the simplest way to execute only the required jobs in the Job Sequence is as follows,

1. Create a common (reusable) Server Routine with FilePath and FileName as input parameters, and have it call a UNIX script via DSExecute, passing those parameters.
A common (reusable) UNIX script checks whether the file is present and whether it is empty, and returns appropriate return codes.

2. Place a Nested Condition Activity before all the Job Activities, linking its outputs to all of them. In each Custom Condition, call your Server Routine, passing the file name for that job, and check the return code there.
Remember that the parameters (file name, file path) are different for each Custom Condition.

This may serve as a workable solution until a better one arrives.

Note: If possible, try to make the DataStage components reusable, for ease of understanding and to arrive at an optimized solution (given the 100 jobs).
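
The reusable UNIX script from step 1 might be sketched as below. The function name check_input_file and the particular return codes (0, 1, 2) are assumptions chosen for illustration; any codes will do as long as the Server Routine and the Custom Conditions agree on them.

```sh
# Hypothetical reusable check taking a directory (FilePath) and a file name (FileName).
# Assumed return codes: 0 = present with data, 1 = missing, 2 = present but empty.
check_input_file() {
    dir=$1
    name=$2
    path="$dir/$name"
    if [ ! -f "$path" ]; then
        return 1    # file is missing
    elif [ ! -s "$path" ]; then
        return 2    # file exists but is empty
    fi
    return 0        # file exists and has data
}
```

Saved as a standalone script, the file would end by calling the function with the script's two arguments and exiting with its status, so that the caller sees the same return code.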
ganesh.soundar
Participant
Posts: 9
Joined: Tue Jan 08, 2008 7:21 am
Location: Chennai

Post by ganesh.soundar »

To prevent the execution of jobs that have no data in their source files, you need to check each input file for data using a shell script and call it with the help of an Execute Command activity. But I feel you would need to add this before all 100 Job Activities.
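
A shell check over every input file could be sketched like this, assuming (this is not stated in the thread) that the first job writes all of its output files into one directory. The function name list_ready_files is made up.

```sh
# Hypothetical helper: print the name of every non-empty regular file in a directory.
list_ready_files() {
    dir=$1
    for f in "$dir"/*; do
        # -f: regular file; -s: size greater than zero
        if [ -f "$f" ] && [ -s "$f" ]; then
            # Strip the directory part so only the file name is printed.
            printf '%s\n' "${f##*/}"
        fi
    done
}
```

The printed list shows which of the 100 jobs actually have something to process on a given run.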
