NOT IN in datastage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

qutesanju
Participant
Posts: 373
Joined: Tue Aug 26, 2008 4:52 am

NOT IN in datastage

Post by qutesanju »

there is a SQL query

select * from table1
where DIR ='IN'
AND CD_FLAG ='D'
AND CLASS NOT IN ( 'ARC','PPD','CCD','CTX','COR','DNE')

so i'm trying to implement above query in datastage
and i implemented it in transformer as

lnk_xfmr_target.DIR ='IN' AND lnk_xfmr_target.CD_FLAG ='D'
AND (lnk_xfmr_target.CLASS <> 'ARC'
and lnk_xfmr_target.CLASS <> 'PPD'
and lnk_xfmr_target.CLASS <> 'CCD'
and lnk_xfmr_target.CLASS <> 'CTX'
and lnk_xfmr_target.CLASS <> 'COR'
and lnk_xfmr_target.CLASS <> 'DNE')

i know we can implement NOT IN in datastage using <>
but seeking experts advise to check did i implemeted it correctly?
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

The logic looks good. An alternative, which may be more efficient especially if the list of codes grows, would be

Code: Select all

lnk_xfmr_target.DIR ='IN' AND lnk_xfmr_target.CD_FLAG ='D' 
AND Index('|ARC|PPD|CCD|CTX|COR|DNE|','|':lnk_xfmr_target.CLASS:'|') < 1
Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
qutesanju
Participant
Posts: 373
Joined: Tue Aug 26, 2008 4:52 am

Post by qutesanju »

perfect james

i m thinking which one is correct in below two

1)
lnk_xfmr_target.DIR ='IN' AND lnk_xfmr_target.CD_FLAG ='D'
AND (lnk_xfmr_target.CLASS <> 'ARC'
and lnk_xfmr_target.CLASS <> 'PPD'
and lnk_xfmr_target.CLASS <> 'CCD'
and lnk_xfmr_target.CLASS <> 'CTX'
and lnk_xfmr_target.CLASS <> 'COR'
and lnk_xfmr_target.CLASS <> 'DNE')

2)

lnk_xfmr_target.DIR ='IN' AND lnk_xfmr_target.CD_FLAG ='D'
AND (lnk_xfmr_target.CLASS <> 'ARC'
OR lnk_xfmr_target.CLASS <> 'PPD'
OR lnk_xfmr_target.CLASS <> 'CCD'
OR lnk_xfmr_target.CLASS <> 'CTX'
OR lnk_xfmr_target.CLASS <> 'COR'
OR lnk_xfmr_target.CLASS <> 'DNE')

And i believe my option 1 corrosponds your code

lnk_xfmr_target.DIR ='IN' AND lnk_xfmr_target.CD_FLAG ='D'
AND Index('|ARC|PPD|CCD|CTX|COR|DNE|','|':lnk_xfmr_target.CLASS:'|') < 1

so what is the equivalent condition of option2 with your code?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

#1 is valid, #2 is not as the 'OR's make it always evaluate to TRUE.
-craig

"You can never have too many knives" -- Logan Nine Fingers
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

As Craig states, #2 is incorrect. You are guaranteed that at least one of your conditions within the parenthesis will always be true. You might as well have "AND 1=1" instead!

The equivalent Index function would be

Code: Select all

AND Index('|ARC|PPD|CCD|CTX|COR|DNE|','|':lnk_xfmr_target.CLASS:'|') >= 0
which is always true.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
vamsi.4a6
Participant
Posts: 334
Joined: Sun Jan 22, 2012 7:06 am
Contact:

Post by vamsi.4a6 »

jwiles wrote:The logic looks good. An alternative, which may be more efficient especially if the list of codes grows, would be

Code: Select all

lnk_xfmr_target.DIR ='IN' AND lnk_xfmr_target.CD_FLAG ='D' 
AND Index('|ARC|PPD|CCD|CTX|COR|DNE|','|':lnk_xfmr_target.CLASS:'|') < 1
could anybody explain what is the need for | and concatenation.I think following will also work.please correct me if i am wrong?

lnk_xfmr_target.DIR ='IN' AND lnk_xfmr_target.CD_FLAG ='D'
AND Index('ARCPPDCCDCTXCORDNE',lnk_xfmr_target.CLASS) < 1
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It prevents "false positives" on a substring match. For example, yours would return true for "CPP" which is not in the valid set.
-craig

"You can never have too many knives" -- Logan Nine Fingers
qutesanju
Participant
Posts: 373
Joined: Tue Aug 26, 2008 4:52 am

Post by qutesanju »

I tried alternative approach which is

lnk_xfmr_target.DIR ='IN' AND lnk_xfmr_target.CD_FLAG ='D'
AND Index('|ARC|PPD|CCD|CTX|COR|DNE|','|':lnk_xfmr_target.CLASS:'|') < 1

but it shows
not enough arguments to function index
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Index() needs three arguments - if it's any consolation, James missed this also. The third argument is the occurrence number you're searching for - this is almost always 1. But the function does, for example, allow you to look for the nth occurrence of the search substring.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
qutesanju
Participant
Posts: 373
Joined: Tue Aug 26, 2008 4:52 am

Post by qutesanju »

so can someone shows correct function with arguments?
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

My mistake...typing it in from memory :(

Index('|ARC|PPD|CCD|CTX|COR|DNE|','|':lnk_xfmr_target.CLASS:'|',1)

The ,1 is the third argument (missing from my original). You can also just click the [...] button in the derivation editor and search for Index in the String functions. Selecting it will provide you the correct template for the function. Also documentation is available through the Help button and online docs. :)

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
harishkumar.upadrasta
Participant
Posts: 18
Joined: Tue Dec 25, 2012 10:39 pm
Location: Detroit,MI

Post by harishkumar.upadrasta »

Hi,

I think there needs a small correction as the string in which we need to search should be the first argument, please correct me if i'm wrong.

Earlier posted:

Index('|ARC|PPD|CCD|CTX|COR|DNE|','|':lnk_xfmr_target.CLASS:'|' ,1 )


Corrected:

Index('|':lnk_xfmr_target.CLASS:'|','|ARC|PPD|CCD|CTX|COR|DNE|' ,1 )


Example:


MyString = "P1234XXOO1299XX00P1"
Position = Index(MyString, 1, 2)
* The above returns the index of the second "1" character (10).
Position = Index(MyString, "XX", 2)
* The above returns the start index of the second "XX"
* substring (14).
Position = Index(MyString, "xx", 2)
* The above returns 0 since the substring "xx" does not occur.
Position = Index(MyString, "XX", 3)
* The above returns 0 since the third occurrence of
* substring "XX" * cannot be found.
Harish
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

harishkumar.upadrasta wrote:Corrected:

Index('|':lnk_xfmr_target.CLASS:'|','|ARC|PPD|CCD|CTX|COR|DNE|' ,1 )
Your correction is... correct! From the docs:

Input: string (string) substring (string) occurrence (int32)
-craig

"You can never have too many knives" -- Logan Nine Fingers
harishkumar.upadrasta
Participant
Posts: 18
Joined: Tue Dec 25, 2012 10:39 pm
Location: Detroit,MI

Post by harishkumar.upadrasta »

Thank u craig :)
Harish
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

Yes, thank you indeed!

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
Post Reply