Page 1 of 2

condition in a transformer

Posted: Wed Aug 16, 2006 11:26 am
by fmartinsferreira
I need to apply one condition in a transformer and I'm little lost. :oops:

rule: Only load the records where main_id = lookup_id
and dt_record between dt_start and dt_end.

- main input I have main_id and dt_record
- lookup input I have lookup_id, lookup_desc, dt_start and dt_end.
- target I have main_id, lookup_desc and dt_record

for example:

main input
main_id dt_record
1 10/10/2005
2 10/10/2005
3 10/09/2005

lookup input
lookup_id lookup_desc dt_start dt_end
1 AAA 01/01/2005 10/07/2005
1 BBB 10/08/2005 01/10/2005
1 CCC 02/10/2005 10/01/2006

if I have this sources, what I shoud load in the target is:

main_id lookup_desc dt_record
1 CCC 10/10/2005

Additional info:
main input -> store procedure Sql Server
lookup input -> hashed file(all the columns in the hashed file is key)

I would like to know what key expression should I put in the transformer?

Regards,

Fernando

Posted: Wed Aug 16, 2006 11:33 am
by kris007
What you are trying to do is called a Range lookup or Multi-row lookup. You will have to use a Universe stage as a look up instead of hashed file.Clickhere. It's described in this post on how to achieve what you want. If you have any questions come back here. :wink:

Posted: Wed Aug 16, 2006 11:34 am
by Krazykoolrohit
once you do a lookup that will ensure "Only load the records where main_id = lookup_id ". just make sure you eliminate all rows with null lookup_id.
ex: if Isnull(lookup_id) then 1 else 0

as for the second condition. you will fetch the start dates and end dates. use these to calculate the eligibility for each row in a stage variable.
ex: if dt_start > dt_record and dt_record < dt_end then 1 else 0


add a constraint which passes the rows only is both of the above stage variables are zero.

NOTE: before using any lookup row in stage variable you will have to check them for nullls.
ex: Is Itnotnull(dt_start) and Isnotnull(dt_end) then if dt_start > dt_record and dt_record < dt_end then 1 else 0

Posted: Wed Aug 16, 2006 11:36 am
by Krazykoolrohit
kris007 wrote:What you are trying to do is called a Range lookup or Multi-row lookup. You will have to use a Universe stage as a look up instead of hashed file.Clickhere. It's described in this post on how to achieve what you want. If you have any questions come back here. :wink:

i dont think range lookup is necessary as he just wants to check the data row if it lies within two columns which are fetched from the lookup table.

Posted: Wed Aug 16, 2006 11:42 am
by kris007
Yep. Overlooked that the hashed file and the source file have a common key other than the date.

Posted: Wed Aug 16, 2006 11:50 am
by fmartinsferreira
kris007 wrote:What you are trying to do is called a Range lookup or Multi-row lookup. You will have to use a Universe stage as a look up instead of hashed file.Clickhere. It's described in this post on how to achieve what you want. If you have any questions come back here. :wink:
kris007 the link is broken and only show this message: "The topic or post you requested does not exist" :cry:

Posted: Wed Aug 16, 2006 11:53 am
by chulett
The link is missing the "www" in front of the "dsxchange.com".

Posted: Wed Aug 16, 2006 11:55 am
by kris007
Clickhere

Posted: Wed Aug 16, 2006 11:57 am
by fmartinsferreira
as for the second condition. you will fetch the start dates and end dates. use these to calculate the eligibility for each row in a stage variable.
ex: if dt_start > dt_record and dt_record < dt_end then 1 else 0


Krazykoolrohit I can't do "if LINK_002.dt_start > LINK_001.dt_record and LINK_001.dt_record < LINK_002.dt_end then 1 else 0" because only show me the LINK_001 fields.

LINK_001.main_id
LINK_001.dt_record

LINK_002.lookup_id
LINK_002.lookup_desc
LINK_002.dt_start
LINK_002.dt_end

If I put "if LINK_002.dt_start > LINK_001.dt_record and LINK_001.dt_record < LINK_002.dt_end then 1 else 0" the job don't compile because second DataStage there is an error. :cry:

Fernando

Posted: Wed Aug 16, 2006 12:52 pm
by DSguru2B
This is a simple lookup keyed on a single key with an extra condition that can be accomodated in the contraint.
You constraint will look something like

Code: Select all

NOT(LookupLink.NOTFOUND) AND ICONV(dt_record,"D/MDY[2,2,4]") >= ICONV(dt_start,"D/MDY[2,2,4]") AND ICONV(dt_record,"D/MDY[2,2,4]") <= ICONV(dt_end,"D/MDY[2,2,4]")

Posted: Wed Aug 16, 2006 1:14 pm
by Krazykoolrohit
fmartinsferreira wrote:Krazykoolrohit I can't do "if LINK_002.dt_start > LINK_001.dt_record and LINK_001.dt_record < LINK_002.dt_end then 1 else 0" because only show me the LINK_001 fields.
What i wrote there was an idea not exact syntax. Use DSguru2B syntax there.

Posted: Wed Aug 16, 2006 1:19 pm
by fmartinsferreira
DSguru2B wrote:This is a simple lookup keyed on a single key with an extra condition that can be accomodated in the contraint.
You constraint will look something like

Code: Select all

NOT(LookupLink.NOTFOUND) AND ICONV(dt_record,"D/MDY[2,2,4]") >= ICONV(dt_start,"D/MDY[2,2,4]") AND ICONV(dt_record,"D/MDY[2,2,4]") <= ICONV(dt_end,"D/MDY[2,2,4]")
I can't put this in the constraint because all my lookup's fields are key.

Fernando

Re: condition in a transformer

Posted: Wed Aug 16, 2006 1:23 pm
by DSguru2B
fmartinsferreira wrote:rule: Only load the records where main_id = lookup_id
and dt_record between dt_start and dt_end.
But i thought it was just a single key as per your rule above. Well it doesnt matter how many keys you have. Just link the keys properly to the hashedfile in the transformer. Not(RefLink.NOTFOUND) will take care of all the keys.

Posted: Wed Aug 16, 2006 1:24 pm
by Krazykoolrohit
fmartinsferreira wrote: I can't put this in the constraint because all my lookup's fields are key.
Fernando
How does that affect the calculations?? you are doing lookup matching all your columns in lookup file with the main file?

NOTE: before using this calculation you have to check and ensure that no null value is being used (ie do this calculation only if lookup succeeds)

refer to my previous post for the logic

Posted: Wed Aug 16, 2006 2:35 pm
by kris007
A quick look at the sample data shows that a single row in the source column has more than single match in the lookup data when you are matching the id's from both files. Now, what OP needs is id's to match and the input date falling in the range between startdate and end date for that particular input Id. And since the startdate and enddate are keys, I am expecting them to be not nulls.

Try using a Universe stage where you can join even the date column from the input to the lookup dates(since they are also keys) and follow what I have suggested in the other link. It might help.

HTH