Help in implementing server routine

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

vinodlakshmanan
Participant
Posts: 82
Joined: Wed Jul 14, 2004 7:21 am
Location: India


Post by vinodlakshmanan »

We have around 30 server jobs which access ODBC stages and insert or update other tables. The databases are Informix and TeraData. Now to check for insert or update we do lookups on hash files and after comparing similar fields, we decide whether it is an insert or update.
The problem here is that NULLs are possible in the columns, and hence rows get dropped. We want to do this comparison and null-checking through a routine or transform.
My first question is which would be better, a routine or a transform? Secondly, I think it would be cumbersome to call the routine for each check, as some jobs have over 10 checks. Is there any better way of implementing this?
I am new to routines, so please provide an explanation of the logic of your code :)
Integrity is what you do when no one is watching
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Hashed file lookups are by primary key. Primary keys cannot be null (by definition). Likewise the primary keys in your tables cannot be null. So how are you getting nulls?

You load the key value from your table into a hashed file. You perform a lookup against the key. It is either found (an update is required) or it is not found (an insert is required).

Are you claiming that you have target tables with keys that are not null?

The other comparison columns are, of course, trickier, in that you must handle nulls there. Something like the following will test for a match:

Code: Select all

If Not(IsNull(source.MyCol)) And Not(IsNull(lookup.MyCol)) Then (source.MyCol = lookup.MyCol) Else (IsNull(source.MyCol) And IsNull(lookup.MyCol))
You definitely do not need routines in this case.

But if you did, a generic routine would have the form:

Code: Select all

FUNCTION IsEqual(Arg1, Arg2)

Arg1IsNull = IsNull(Arg1)
Arg2IsNull = IsNull(Arg2)

If Not(Arg1IsNull) And Not(Arg2IsNull)
Then
   Ans = (Arg1 = Arg2)
End
Else
   Ans = (Arg1IsNull And Arg2IsNull)
End

RETURN(Ans)
The logic is identical to the expression above - it's just laid out differently!

This function returns true if both values are non-null and equal, or if both arguments are null, and returns false otherwise. The two assignment statements have what's called a logical, or Boolean, expression on the right-hand side; each logical expression can only yield a true/false value.

The IsNull function results were loaded into local variables solely to avoid evaluating each twice.
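For illustration only, the same null-safe comparison logic can be sketched outside BASIC. Here it is in Python, with None standing in for SQL NULL (the function name is invented for the example):

```python
def is_equal(a, b):
    """Null-safe equality, with None standing in for SQL NULL.

    True when both values are non-null and equal, or when both
    are null; False otherwise - the same logic as the IsEqual
    routine above.
    """
    if a is not None and b is not None:
        return a == b
    return a is None and b is None
```

The truth table is the same as for the BASIC expression: two equal non-null values match, two nulls match, and a null never matches a non-null value.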
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by vinodlakshmanan »

Thanks Ray.
I have a couple more queries on routines:
1. Which are better performance-wise: routines or transforms?
2. Is it possible to give a variable list of arguments to a routine? For example, I have to process 10 jobs using a single routine, but with a different number of columns in each job. Is it possible to pass a variable number of arguments? I was thinking that one parameter to the routine would be a string of delimited columns. We would count the number of delimiters, get the number of columns, extract each column (similar to the "cut" command in Unix), and then process each column. Is this possible and feasible to implement? Is there any better method? Each job would pass on average 8-10 columns.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Transforms are inline code. When you compile your job, the transform's code gets substituted in place of the call.

Routines are external procedures and will be called for each row that refers to the routine. This may have a performance impact, depending on the size of the code.

Secondly, you can send a varying number of values to a routine. You can use the format RoutineName(CallingJobName, ParamList) and concatenate your ParamList using any character other than a comma (',').

You can then split this within the routine.
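As a rough sketch of this pack-and-split approach (in Python rather than DataStage BASIC; the function names are invented for illustration):

```python
def pack_params(values, delim="|"):
    """Caller side: join the column values into one routine argument."""
    return delim.join(str(v) for v in values)

def unpack_params(packed, delim="|"):
    """Routine side: split the single argument back into fields."""
    return packed.split(delim)
```

In BASIC itself, DCOUNT would count the fields and FIELD would extract each one.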

Post by vinodlakshmanan »

I am passing the list in pipe-delimited format ("|") and extracting the fields in a FOR loop using FIELD. How do I get the last field? Taking the example given in the BASIC guide:

Code: Select all

D=FIELD("###DHHH#KK","#",4)
gives me "DHHH".
How do I get "KK"?

Post by vinodlakshmanan »

Sorry, got it. One of those instances when I did not read the documentation properly :oops:

Post by ray.wurlod »

To clarify, DataStage does not support routines with a variable number of arguments.

Therefore, as others suggested, you can pass a delimited list in (or out of) any argument. The routine (or the caller) then needs to parse the list.

If the delimiter is a dynamic array delimiter character (for example, @FM) then you can use dynamic array operations to parse. For example:

Code: Select all

BirthDate = Arg1<1>
HireDate = Arg1<2>
InjuryDate1 = Arg1<3>
InjuryDate2 = Arg1<4>
InjuryDate3 = Arg1<5>
If the index to the dynamic array refers to a non-existent position, the value returned is "".
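For illustration only, that indexing behaviour can be sketched in Python (assuming @FM is CHAR(254), as in UniVerse; the helper name is invented):

```python
FM = chr(254)  # @FM, the UniVerse field mark, CHAR(254)

def extract(dyn, i):
    """Mimic BASIC's Var<i>: 1-based field extraction from a
    field-mark-delimited string; a non-existent position yields ""."""
    fields = dyn.split(FM)
    return fields[i - 1] if 1 <= i <= len(fields) else ""
```

So asking for field 5 of a three-field array simply returns the empty string rather than raising an error.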

A Transform and a Routine that perform an identical task are pretty much indistinguishable in terms of performance. The only difference is the first-call overhead for the routine; when a medium to large number of rows is processed, this is negligible.

If you change a Routine you do not need to recompile any jobs that use it. If you change a Transform you do need to recompile any jobs that use it.