Data set stage question

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

jreddy
Premium Member
Posts: 202
Joined: Tue Feb 03, 2004 5:09 pm

Data set stage question

Post by jreddy »

I have 3 independent jobs:
1) Reads source data into dataset1 and the existing target data into dataset2
2) Takes the two datasets above as input to a Change Capture stage and writes the differences to dataset3
3) Reads dataset3 and applies the changes via a transformer/filter

When I run these 3 jobs independently, one after the other from Designer, the datasets contain the right information (the datasets use the OVERWRITE option). But when I run them from a sequencer, the datasets are not updated with the correct/latest information; they appear to hold data from prior runs, which produces incorrect results. What could be the problem?

I have made sure that the job parameters are exactly the same and mapped correctly. Also, using the Data Set Management tool, I could see that the file's modified time does not change when I run the jobs via the sequencer.
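
A quick way to confirm this from the command line is to inspect the Data Set descriptor and its segment files between runs. The sketch below assumes a Unix engine tier, a hypothetical descriptor path /data/ds/dataset3.ds, and an install path for dsenv that you will need to adjust; the exact orchadmin subcommands can vary by version, so check orchadmin's built-in help on your system.

    # Source the engine environment so orchadmin can be found (install path is an assumption)
    . /opt/IBM/InformationServer/Server/DSEngine/dsenv

    # Timestamp of the descriptor file itself (path is a placeholder)
    ls -l /data/ds/dataset3.ds

    # List the data (segment) files behind the descriptor, with their own timestamps
    orchadmin ll /data/ds/dataset3.ds

    # Summarise the Data Set so schema and record counts can be compared between runs
    orchadmin describe /data/ds/dataset3.ds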
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are there any warnings in the log, particularly about changes to the structure of the Data Set, or "node not found"? Either of these might indicate use of a configuration file incompatible with the one in effect when the Data Set was created.
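
One way to check for such warnings without the Director GUI is to pull a log summary for each job with the dsjob client. A rough sketch, using hypothetical project and job names; add the -domain/-user/-password arguments if your installation requires them.

    # Show recent warnings from each job's log (project and job names are placeholders)
    dsjob -logsum -type WARNING -max 20 MyProject LoadSourceAndTarget
    dsjob -logsum -type WARNING -max 20 MyProject ChangeCapture
    dsjob -logsum -type WARNING -max 20 MyProject ApplyChanges

    # Configuration file in effect for the current environment; all three jobs and the
    # sequence should resolve to the same one
    echo "$APT_CONFIG_FILE"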
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Re: Data set stage question

Post by ArndW »

jreddy wrote: ...But when I run them from a sequencer, the datasets are not updated with the correct/latest information...
What is your timing in the sequencer? You should have the completion of the first job trigger the next; it seems you are not doing this, and so you might be reading a copy of the old datasets before they are re-created.
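
For comparison, the same "next job starts only after the previous one completes" behaviour can be reproduced outside the sequence with the dsjob client. A minimal sketch with hypothetical project and job names; with -jobstatus the shell exit status reflects the job's finishing status (1 is usually "finished OK", but verify against the dsjob documentation for your release).

    #!/bin/sh
    # Run the three jobs strictly one after another, stopping unless each finishes OK
    for JOB in LoadSourceAndTarget ChangeCapture ApplyChanges
    do
        dsjob -run -jobstatus MyProject "$JOB"
        STATUS=$?
        if [ "$STATUS" -ne 1 ]   # 1 = finished OK (assumption; check your installation)
        then
            echo "$JOB ended with status $STATUS - stopping" >&2
            exit 1
        fi
    done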
jreddy
Premium Member
Posts: 202
Joined: Tue Feb 03, 2004 5:09 pm

Post by jreddy »

There are no warnings in the log for the sequencer or the jobs; they appear to run fine, just with incorrect data in the data sets. Also, the timing is in the same ballpark as when they run individually.

I am wondering whether there is some environment variable that needs to be set so the data sets can be shared across jobs when they are run as part of a sequencer?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The reason I asked about timing was to make you check whether the dataset write job is complete before the dataset read job starts. You haven't confirmed or refuted that possible cause yet.
jreddy
Premium Member
Posts: 202
Joined: Tue Feb 03, 2004 5:09 pm

Post by jreddy »

In the sequencer, I have the triggers set to start the next job only when the first job has finished OK.

The first job creates the datasets and the second reads those datasets and does the CDC, so I believe the first job is done writing to the datasets before the second job kicks in.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

To clarify, this only occurs when run from a job sequence? Try interposing an Execute Command or Routine activity between each pair of Job activities, in which you have the job sequence sleep for, say, 30 seconds. This will give the file system a chance to flush the Data Sets to disk.
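
For the Execute Command activity, the command itself can be as simple as the line below (assuming a Unix engine tier); sync is optional and merely asks the operating system to flush any pending writes before the pause.

    # Command for the Execute Command activity: flush buffers, then wait 30 seconds
    sync; sleep 30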
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.