Passing values from excel to two jobs sequentially

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

mahi_mahi
Charter Member
Posts: 45
Joined: Mon Aug 01, 2005 10:02 am

Passing values from excel to two jobs sequentially

Post by mahi_mahi »

Hi

I have 2 jobs which take a value from one of the columns of an Excel sheet. Job1 has to run first and then Job2, for each value of the parameter taken from the .xls (or .csv).

There are 2300 rows in the csv file. I tried running Job2 for one value and it ran for 3 minutes.

Please let me know how I can pass the value to both jobs, one after the other, looping over all the values without our intervention.

Also, will the time taken be approximately 2300 * 3 minutes, or can we reduce it?

Please let me know if any more information is required.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

If you can set up the job to run under multiple instances (unique file names, no locks, etc.), then you could run several "configurations" from your csv file.

The best option is to design the process so that it can run once and do everything necessary at one time for all of the values within your csv file.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
mahi_mahi
Charter Member
Posts: 45
Joined: Mon Aug 01, 2005 10:02 am

Post by mahi_mahi »

Can you please be clearer? Sorry, I could not follow.

I have to take one value from the xls at a time and run both jobs in sequence, because the first job creates a hashed file which is used in the second.

We use the value taken from the file as a parameter in both jobs.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

What Ken is trying to say is that the best way to handle this is to design everything in such a way that, once you fire off your job or sequence, it handles all the values from the csv file. Right, Ken?

If you ask me, I would load the csv file into a hashed file and add a dummy column which will be the key. The key column will be a sequence of numbers.

Now you have all the values that need to be passed to the jobs as parameters in a hashed file.
Build a sequence looping for the number of records present in the csv file. In the Job Activity you can use the function UtilityHashLookup() as the derivation of the job parameter. UtilityHashLookup() will get the first value, which is keyed on 1, and assign it to the parameter. Then on the second loop, UtilityHashLookup() will get the second value, keyed on 2, and so on. This way your loop runs x number of times, every time passing a value as a parameter to the jobs.

Did I confuse you, buddy :?:
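As a rough illustration (the hashed file name "ParamHF" and the activity name are made up), the derivation of the job parameter in the Job Activity could look something like:

```
UtilityHashLookup("ParamHF", StartLoop_Activity.$Counter, 1)
```

Here StartLoop_Activity.$Counter is the loop counter exposed by a Start Loop activity, so iteration 1 fetches the row keyed on 1, iteration 2 the row keyed on 2, and so on.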
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Running 2 jobs for each row in a spreadsheet will take a long time. If you use job instances, you can run many copies of your process at the same time.

Or, figure out how to run for everything just once. I don't know why you have to run in a loop like you're doing. It's not very efficient, but you don't need me to point that out, right?
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Ken is right. For 2300 rows, a loop would take forever. Make your jobs multi-instance. And if you follow my approach, you can eliminate the loop activity and just run the same multi-instance jobs, giving them sequential invocation IDs which are used as the keys in UtilityHashLookup(). This way you can run as many instances as you want.
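For illustration only (the hashed file name is made up, and this is an untested sketch): inside each job instance, the invocation ID can be read in DataStage BASIC and reused as the lookup key:

```
* Hypothetical sketch: the invocation ID assigned when the instance
* was started doubles as the key into the parameter hashed file.
InvId = DSGetJobInfo(DSJ.ME, DSJ.JOBINVOCATIONID)
ParamValue = UtilityHashLookup("ParamHF", InvId, 1)
```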
mahi_mahi
Charter Member
Posts: 45
Joined: Mon Aug 01, 2005 10:02 am

Post by mahi_mahi »

Hi,
thanks a lot for the quick response.
I loaded data into a hashed file as suggested by DSguru. But when I was passing the arguments to UtilityHashLookup,
I could not understand the Hash Key value, which is the second argument of the UtilityHashLookup routine. Please explain it to me.
I gave the actual column order number as the third argument, which has to be passed to the server job.
mahi_mahi
Charter Member
Posts: 45
Joined: Mon Aug 01, 2005 10:02 am

Post by mahi_mahi »

mahi_mahi wrote: (previous post quoted)
I gave the key column name (in quotes, as it is not accepted without quotes) of the hashed file as the second argument. I ran the sequence job and when I looked at the log, it said *TABLE NOT FOUND*.

I checked other postings on the same error, but they were not much help.
I checked the path where the file resides on the server and copied the path exactly as it is, but it still shows the same error.

Please help me.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Which part of "Table Not Found" is unclear? Is it the Hashed File stage that is generating this message? If so is the hashed file declared as being in an account or in a directory? In either case which one? Do you have a VOC pointer to the hashed file? If it is the ODBC stage that is generating the message, did you import the spreadsheet metadata as a system table? How, precisely, are you attempting to read from this data source?

Those questions are intended, as much as anything, to convince you to do some of the detective work yourself, rather than have us waste time speculating.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mahi_mahi
Charter Member
Posts: 45
Joined: Mon Aug 01, 2005 10:02 am

Post by mahi_mahi »

I have a sequence job with a Routine Activity stage followed by the server jobs, in which we use the argument.

When I checked the log for the sequence job:
Step 1: Routine_Activity stage started
Step 2: Routine_Activity stage: *TABLE NOT FOUND*


The hashed file is declared in the account. I don't have direct access to the server, but I could see the path of the hashed file and gave the same for arg1 in the Routine Activity stage.

When I opened the hashed file I found two files inside, one being the DATA file and another file whose name I don't remember.

The server jobs are in compiled status only. The sequence job is stuck at the first stage, the Routine Activity stage in the sequence job.

Please suggest what needs to be done.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What routine is being called by the Routine activity, and with what argument values?
mahi_mahi
Charter Member
Posts: 45
Joined: Mon Aug 01, 2005 10:02 am

Post by mahi_mahi »

ray.wurlod wrote:What routine is being called by the Routine activity, and with what argument values?
UtilityHashLookup is being called in the Routine Activity stage, and is used to pass a value from the hashed file to the server job.

arg1 -- hashed file name with full path
arg2 -- hashed key value (which I am not able to understand)
arg3 -- position of the other column in the hashed file
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

UtilityHashLookup does not work with pathed hashed files; a VOC entry for the hashed file must exist. If you look at the code for UtilityHashLookup (category Routines\sdk\Utility) you will see that it uses the Open statement to open the hashed file.

You will also see that this routine is the source of the "*TABLE NOT FOUND*" message.

You can create your own clone of UtilityHashLookup (maybe called UtilityHashPathLookup) and replace Open with OpenPath. That should be the only change required.

The second argument is the key value you are attempting to look up; the third argument is the field number within the hashed file record that you want to return. The first non-key column is field number 1, the second non-key column is field number 2, and so on.
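As a rough sketch of what such a clone might contain (untested, and the real SDK routine does more argument checking), assuming Arg1 = hashed file path, Arg2 = key value, Arg3 = field number:

```
* Hypothetical UtilityHashPathLookup: same idea as UtilityHashLookup,
* but OpenPath opens a hashed file by directory path rather than by
* VOC name.
OpenPath Arg1 To HashedFile Then
   Read Rec From HashedFile, Arg2 Then
      Ans = Rec<Arg3>    ;* return field N (first non-key column = 1)
   End Else
      Ans = "*RECORD NOT FOUND*"
   End
End Else
   Ans = "*TABLE NOT FOUND*"
End
```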
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Mahi_mahi, just create an in-account hashed file. If that is against the standards at your client site, then create a VOC pointer using

Code:

SETFILE <fully qualified hashed file path> <VOC pointer name>
Then in UtilityHashLookup() pass the pointer name as the hashed file name. That should take care of it.
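For example (the path and pointer name here are made up), from the TCL prompt or the Administrator client's Command window:

```
SETFILE /data/project/hfiles/ParamHF ParamHF OVERWRITING
```

After that, "ParamHF" can be passed as the first argument to UtilityHashLookup().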
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.