keylookup with date between effective_dt and expiration_dt
Moderators: chulett, rschirm, roy
keylookup with date between effective_dt and expiration_dt
Being new to Enterprise Edition I can't figure how to perform a lookup against a type 2 slowly changing dimension table. I need to perform a lookup to return a surrogate key against a table whose natural key value is id, effective_dt, expiration_dt. Has anyone implemented a solution for doing a keylookup against this type of dimension? The lookup should effectively return the surrogate key where id_value = lkp.id and trans_date between lkp.effective_dt and lkp.expiration_dt (or trans_date >= lkp.effective_dt and trans_date < lkp.expiration_dt). Seems the lookup transform only allows inner joins to the lookup table/file.
TIA
TIA
Mike Czerniawski
Hi mczern.
Type 2 slowly changing dimension is really common in the data warehousing world!!!
If the data you're loading correspond to current data, it's easy... all you need to do is extract only the current image of your dimension and lookup against the key... in other words, in the SQL statement that extract the data of your type 2 dimension, you use #current_date# between start_date and end_date.... and in your lookup, you match only the key.
But if you are loading historical data... and some records of your sources are related to old images of your dimension... I don't think there is an easy way to retreive the good record...
Hope this will help.
Type 2 slowly changing dimension is really common in the data warehousing world!!!
If the data you're loading correspond to current data, it's easy... all you need to do is extract only the current image of your dimension and lookup against the key... in other words, in the SQL statement that extract the data of your type 2 dimension, you use #current_date# between start_date and end_date.... and in your lookup, you match only the key.
But if you are loading historical data... and some records of your sources are related to old images of your dimension... I don't think there is an easy way to retreive the good record...
Hope this will help.
The Brute
Unfortunately for me, I'm dealing with the historical version. Transaction dates vary for each incremental load making selection of just the current record id a non-option.tardifma wrote:Hi mczern.
Type 2 slowly changing dimension is really common in the data warehousing world!!!
If the data you're loading correspond to current data, it's easy... all you need to do is extract only the current image of your dimension and lookup against the key... in other words, in the SQL statement that extract the data of your type 2 dimension, you use #current_date# between start_date and end_date.... and in your lookup, you match only the key.
But if you are loading historical data... and some records of your sources are related to old images of your dimension... I don't think there is an easy way to retreive the good record...
Hope this will help.
Mike Czerniawski
I've already seen this problem.
My solution was ( i'm not sure it was the most efficient, but it worked and i implemented it on schedule) :
a join between 2 flows with the key
- your flow
- the lookup flow ( contains, id, effective_dt and expiration_dt)
then you filter the data and you keep the record whose date is between effective_dt and expiration_dt.
You solve the problem of historical data.
the lookup allows outer joins but not theta joins!
My solution was ( i'm not sure it was the most efficient, but it worked and i implemented it on schedule) :
a join between 2 flows with the key
- your flow
- the lookup flow ( contains, id, effective_dt and expiration_dt)
then you filter the data and you keep the record whose date is between effective_dt and expiration_dt.
You solve the problem of historical data.
the lookup allows outer joins but not theta joins!
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
kumar_s wrote:Hi,
If you are ready to use Lookup stage, you can perform range lookup from the same. You can make use of the 'condition' option avaialable in the lookup stage and check for your condition trans_date >= lkp.effective_dt and trans_date < lkp.expiration_dt :D
-Kumar
I'm not sure I understand your reply... If I select 'Input Column' inside the Condition I can see the trans_date column coming from the input flow. However, I don't see the lkp.effective_dt or lkp.expiration_dt that is coming from the lookup table input into the transform. If I try to type it into the expression, I get an error stating that lkp.effective_dt is not found.
Mike Czerniawski
-
- Participant
- Posts: 437
- Joined: Fri Oct 21, 2005 10:00 pm
That's because your input links are segregated from each other. I cannot use the data from one input link as a condition in another input link. The output link on the lookup does not allow a condition. All you can really do using this method is to allow the lookup to create a cartesian product (because it will pickup more than one validity period) and then use a transformer or a filter stage to narrow your records with the condition that kumar suggested.
It would be nice if the output of the lookup allowed for a condition.
It would be nice if the output of the lookup allowed for a condition.
Keith Williams
keith@peacefieldinc.com
keith@peacefieldinc.com