Help needed in utilizing TRANS() function

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Help needed in utilizing TRANS() function

Post by vivekgadwal »

Gurus,

I have tried searching the forums for a relevant answer and couldn't find one. Most of the answers were very helpful, but I couldn't find a topic that deals with using the TRANS() function from within a loop. Maybe I missed it, and I would certainly appreciate it if you could point me in the right direction.

Overall Requirement:
I have to load a table with one month of data, wait for another process to make changes to that data, then retrieve it once that other process is done and load the target table, all within one batch job. This process should repeat for two years' worth of data. No manual intervention should be needed.

In order to handle the part where I load the table and wait for the other process, I thought of doing it this way: have a separate job that keeps polling the table for an indicator that the process is done. Once it finds it, the job populates a hashed file with a value (say '1'); I chose a hashed file because I came across the TRANS() function, which might help me read the value back. If the value is '1', the process is finished and I can pick up the records from the table and load them into the target table; if not, wait some more time before checking again.

I coded a dummy job to test this functionality (as the tables are not yet built). The job populates a hashed file with three rows whose key values are, say, 'X', 'Y', and 'Z'. I pass a parameter value into the job, and all it does is load that value against every key in the hashed file.
Ex: If I pass '1' as the parameter value, then the hashed file will be populated as

Code: Select all

DmyVal (key)       Value
     X               1
     Y               1
     Z               1
Now, I created a dummy batch job which takes parameters such as pEndVal (to end the loop), pSleepSec (number of seconds to sleep), and the four arguments needed by the TRANS() function (file name, record ID, field #, control code).
This is what I am doing within the batch (NOTE: our hashed files are externally pathed):

Code: Select all

vExecCmd = "SETFILE ":pFilePath:" ":pFilePointer:" OVERWRITING"
Call DSExecute("UV", vExecCmd, Output, SystemReturnCode)

If SystemReturnCode = 0 Then
   For vLoop = 1 To pEndVal Step 1
      vFileval = ''
      ....<ATTACH THE JOB THAT LOADS THE HASHED FILE>....
      ....<passing the parameter into the job as vLoop>....

      vFileval = TRANS(pFilePointer, pRecId, pFieldVal, pErrCode)
      If vLoop <= pEndVal Then
         If Len(Trim(vFileval)) = 0 Or vFileval < pEndVal Then
            vSleepSec = pSleepSec
            Call DSLogInfo('The Value of ':pRecId:' is: ':vFileval:'. We need to have ':pEndVal, 'JobControl')
            Sleep vSleepSec
            Continue
         End Else
            Call DSLogInfo('SUCCESS!! The Value of ':pRecId:' is: ':vFileval, 'JobControl')
            Exit
         End
      End Else
         Call DSLogInfo('Number of iterations is up. Exiting the process', 'JobControl')
         Exit
      End
   Next vLoop
End Else
   Call DSLogFatal('UV Command did NOT execute!', 'JobControl')
End
Inputs:
pEndVal = 3; pFilePath = path and name of the Hashed file; pRecId = X; pFieldVal = 1; pErrCode = X; pSleepSec = 30; pFilePointer = fp1

I want to check whether the 'Z' row has a value of '3'. If so, exit the loop and declare SUCCESS; otherwise, keep checking (for 3 iterations) and then exit the loop.

Hashed file would look like:

Iteration 1
X 1
Y 1
Z 1

Iteration 2
X 2
Y 2
Z 2

Iteration 3
X 3
Y 3
Z 3

For the first iteration, I am getting

Code: Select all

The Value of Z is: 1. We need to have 3...
Sleep executes and second run is happening...
which is correct.

For the second iteration, I should be getting

Code: Select all

The Value of Z is: 2. We need to have 3...
Sleep executes and third run should happen...
But, I am getting

Code: Select all

The Value of Z is: 1. We need to have 3...
Sleep executes and third run is happening...
Similarly, for the third iteration the value of Z is read as '1', not '3'!

What I think is happening is that TRANS() reads correctly the first time, but on subsequent iterations it is not reading from the hashed file at all. I came to this conclusion because I checked the hashed file between iterations (View Data on the Hashed File stage) and the file is being updated; yet in the batch job's log, '1' is the value read from the hashed file regardless of the iteration.

Am I missing something, or is this how TRANS() is supposed to work? If so, can you please let me know whether there are any alternatives to achieve this? I apologize for such a long post, but I thought giving all the information (including my trials) would help produce a faster solution :D

Can you also let me know how I can delete the VOC entry created by the SETFILE command?

Thanks in advance,
Vivek Gadwal.
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

TRANS() does not take a file pointer. It takes the actual file name (a VOC entry). If you have the hashed file open, then just read it. You also do not need SETFILE if you use OPENPATH.

Code: Select all

openpath pFilePath to pFilePointer else ...error logic here
read Rec from pFilePointer, pRecId else Rec = ''
Field7 = Rec<7>
Mamu Kim
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Code: Select all

Field1 = TRANS("VOC", "SELECT", 1, "X")
Will do the same as

Code: Select all

open "VOC" to fVoc else stop
read Rec from fVoc, "SELECT" else Rec = ''
Field1 = Rec<1>
Mamu Kim
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

kduke wrote:TRANS() does not need a file pointer. It uses the real filename. If you have the hashed file open then just read it. You also do not need SETFILE if you use openpath.

Code: Select all

openpath pFilePath to pFilePointer else ...error logic here
read Rec from pFilePointer, pRecId else Rec = ''
Field7 = Rec<7>
Thanks for your response, Kim. Correct me if I am wrong in my understanding:

Code: Select all

OPENPATH pFilePath to pFilePointer ON ERROR <error logic>
THEN
       vFileval = TRANS(pFilePointer,pRecId,pFieldVal,pErrCode)
ELSE
       <Statements>

....<Read the vFileval to determine if it has reached the desired value>....
....<Rest of the logic in the code I posted>....
I have a few questions here:
1) Isn't this similar to what SETFILE does?

2) Also, if I use OPENPATH instead of SETFILE, does DataStage close the file behind the scenes once it is read, and then reopen it when it reaches OPENPATH in iteration 2?

The reason for question #2 is that, based on the output (the value is returned correctly on the first iteration and then retained throughout all the other iterations), I suspect that TRANS() is not re-reading the hashed file on the second and third passes while the batch loops back and reloads the file.

3) A general question: when we write to a hashed file repeatedly within a batch job, does DataStage write to the hashed file immediately, or does it cache the data in memory and load it at the end of the batch job?

Please pardon me if you already tried to explain these in your posts and I was unable to understand it.

Thanks,
Vivek Gadwal.
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

SETFILE creates a VOC entry. OPENPATH does not. It is just more straightforward. Cleaner. A VOC entry allows you to run TCL commands, such as SELECT, against the hashed file.

You need a common block in order to avoid opening your hashed file over and over. Look at the code in the SDK for sequences; it has a common which opens the file the first time. The TRANS() function does another open on every call, so READ is more efficient. I am sure Ray could give you a few more details, or what we call the full Wurlod.

Hashed files can be cached inside jobs, not batch jobs or subroutines. The whole hashed file can be stored in memory; this can be turned off in the stage. A batch job is one which has no stages but is all BASIC job control.
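
For what it's worth, the open-once pattern Kim describes might be sketched like this in BASIC job control. This is a rough illustration only; the common block name /PollFileCommon/ and the Com* variable names are my own, not taken from the SDK routines:

Code: Select all

COMMON /PollFileCommon/ ComFileOpened, ComFilePointer

* Open the hashed file only on the first call; the file variable
* persists in the common block between calls in the same process.
If ComFileOpened <> 1 Then
   OpenPath pFilePath To ComFilePointer Else
      Call DSLogFatal('Cannot open hashed file: ':pFilePath, 'JobControl')
   End
   ComFileOpened = 1
End

* Read the record through the already-open file variable.
Read Rec From ComFilePointer, pRecId Else Rec = ''
vFileval = Rec<pFieldVal>

Whether a single open actually sees each update also depends on how and when the writing job flushes its rows to disk.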
Mamu Kim
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

kduke wrote: SETFILE creates a VOC entry. OPENPATH does not. It is just more straight forward. Cleaner. A VOC entry allows you to run TCL commands against the hashed file like SELECT.
Thanks, Kim. I will try this approach and let you know how it goes. Unfortunately, I couldn't find "Sequences" under the Routines > sdk category. Can you elaborate on the "common" you mentioned in your post? Did you mean a place where you open the hashed file just once (say, using OPENPATH) and then keep reading from it?

I am hopeful Ray will also chip in with his comments on this discussion.
kduke wrote: Hashed files can be cached inside jobs not batch jobs or subroutines. The whole hashed file can be stored in memory. This can be turned off in the stage. A batch job is one which has no stages but is all basic job control.
I guess my question #3 was not clear enough. I was trying to find out about the process of writing to the hashed file and immediately reading from it. I was thinking that, perhaps to minimize I/O, DataStage has some built-in functionality to keep the data in a cache (for lack of a better term) while we are writing to it.

This question stems from something I noticed during execution of the batch job in my original post: while the batch runs, the hashed file does get updated with the iteration number (vLoop; I verified this using View Data on the Hashed File stage). But the TRANS() call that immediately follows the job run within the batch just kept reading the value from the first load of the hashed file, not from the later iterations :).

Regards,
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

By "Sequences" he is referring to the KeyMgmt routines in the sdk branch of the repository.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Yes, it is immediately available. Never source a hashed file and write to it as output in the same job.

You can create a list of variables whose values persist between subroutine calls. These variables need to be declared in a common block. Do a search; COMMON has been discussed here before.
Mamu Kim
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

kduke wrote: SETFILE creates a VOC entry. OPENPATH does not. It is just more straight forward. Cleaner. A VOC entry allows you to run TCL commands against the hashed file like SELECT.
I apologize for the brief delay in getting back with the results, as I was pulled into other things at work. Anyway, I was able to get my dummy routine going, thanks to Kim's suggestion to use OPENPATH.

Interestingly, though, if I write the code like this:

Code: Select all

OPENPATH vFilePath to vFilePointer
ELSE Call DSLogFatal('File cannot be opened!!','JobControl')

FOR vLoop = 1 to pEndVal STEP 1
   vFileval = ''
   <Run the job to populate the Hashed file (please refer to my first post)>
   
   READ vFileval FROM vFilePointer,pRecId
   ELSE Call DSLogFatal('The input file cannot be Read!','JobControl')
   
   <The If...Then...Else block mentioned above with SLEEP stmt>
NEXT vLoop
the value read out of the hashed file is the same in every iteration, which is the same problem I was facing before.

If I move the OPENPATH into the FOR loop (the idea being to create a separate file variable for each iteration), it works like a charm!

Code: Select all

vFilePointerName = pFilePointer

FOR vLoop = 1 to pEndVal STEP 1
   vFileval = ''
   vFilePointer = vFilePointerName:vLoop

   <Run the job to populate the Hashed file (please refer to my first post)>
   
   OPENPATH vFilePath to vFilePointer
   ELSE Call DSLogFatal('File cannot be opened!!','JobControl')

   READ vFileval FROM vFilePointer,pRecId
   ELSE Call DSLogFatal('The input file cannot be Read!','JobControl')
   
   <The If...Then...Else block mentioned above with SLEEP stmt>
NEXT vLoop
The above code returns the correct values as the iterations progress. This made me happy, but it also raised some questions:

1) Is there some kind of "lock" (as in databases) associated with a file variable created by the OPENPATH statement? I ask because when I read through a single file variable, I do not get the updated row value, but when I read through a different file variable created in each iteration, I get the correct row value!

2) Is there a more "elegant" solution than creating new file variables every time this batch runs? The requirement is to run this kind of batch daily, so if I keep creating file variables, I might run into problems a few months down the line! Please advise.

3) There are statements like READU and READL, which take locks and read a record. Can I leverage them in this situation?

I really appreciate all of your input and look forward to the answers to the above questions.
Vivek Gadwal

Experience is what you get when you didn't get what you wanted
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

1) No lock. Are the writes employing write cache? If so they will not be visible to other processes until the job finishes and the cache is flushed to disk.

2) If it's the same hashed file, you should never need to open it more than once. Place the file variable in a COMMON area of memory, so that it's available to all routine calls in the same process.

3) Probably not. These are record-level locks, and your routine is only reading. If it also locked, then the job would be prevented from writing.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vivekgadwal
Premium Member
Posts: 457
Joined: Tue Sep 25, 2007 4:05 pm

Post by vivekgadwal »

ray.wurlod wrote:1) No lock. Are the writes employing write cache? If so they will not be visible to other processes until the job finishes and the cache is flushed to disk.

2) If it's the same hashed file, you should never need to open it more than once. Place the file variable in a COMMON area of memory, so that it's available to all routine calls in the same process.

3) Probably not. These are record-level locks, and your routine is only reading. If it also locked, then the job would be prevented from writing.
Thanks for the info, Ray. I apologize for the delay in responding. I really appreciate the input from Kim and Craig. I got it working, and I am marking the topic as resolved.

Regards,
Vivek Gadwal

Experience is what you get when you didn't get what you wanted