Re[2]: Lookup question

Archive of postings to DataStageUsers@Oliver.com. This forum intended only as a reference and cannot be posted to.

Moderators: chulett, rschirm

Locked
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Re[2]: Lookup question

Post by admin »

Hi guyes!

What do you think about such decision:
I use Uv table satge as lookup stage and user-defined sql:

SELECT HashFile.lookup_code, HashFile.name
FROM "HashFile"
WHERE HashFile.lookup_code = (SELECT MAX(HashFile.lookup_code)
FROM "HashFile"
WHERE HashFile.lookup_code =
SUBSTRING(? FROM 1 FOR CHAR_LENGTH(HashFile.lookup_code)));



Best regards,Alexandre Stoulov
------------------------------
Consultant. Business Intelligence Projects
TERN Company
mailto:alexs@tern.ru

TC> Sorry, in my sample code, it should have been a "Not(IsNull(...))".
TC> Tony Curcio wrote:
TC> The way I would do this is to define several hash lookups wach for the same file while varying the "Key Derivation". The first lookups derivation would be "inputlink.inputfield[1,1]" to check
TC> for a match for the first character. The second lookups derivation would be "inputlink.inputfield[1,2]" to check for a match for the first two characters. You can repeat this for as many
TC> elements as you would like to match. In your ouput link, your code would check for a NULL condition for each of the links until you get a match. If you had five lookup links and they were named
TC> something like link_match_1char, link_match_2char, etc, the derivation would look something like the following
TC> if IsNull(link_match_5char.key) then link_match_5char.key else if IsNull(link_match_4char.key) then link_match_4char.key else if IsNull(link_match_3char.key) then link_match_3char.key else if
TC> IsNull(link_match_2char.key) then link_match_2char.key else if IsNull(link_match_1char.key) then link_match_1char.key else "Not Found".
TC> ---Tony

TC> Alexandre Stoulov wrote: Hi all,

TC> Does anybody knows how to solve this task:
TC> I have an input stream (flat file) that contains character field
TC> STREAM_CODE. One value for this field for example "823102". Also I
TC> have a lookup table (hash file) with lookup key field LOOKUP_CODE
TC> that contains values: "821", "82","823","8233". The task is to get
TC> the string from lookup table which LOOKUP_CODE is a substring of the
TC> STREAM_CODE and has maximum matching. In our example this string has
TC> LOOKUP_CODE value "823".

TC> Best regards,Alexandre Stoulov
TC> ------------------------------
TC> Consultant. Business Intelligence Projects
TC> TERN Company
TC> mailto:alexs@tern.ru




TC> ---------------------------------
TC> Do You Yahoo!?
TC> Make a great connection at Yahoo! Personals.


TC> ---------------------------------
TC> Do You Yahoo!?
TC> Make a great connection at Yahoo! Personals.
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

It will work, but it will not be fast. It *might* be improved by creating an index on the lookup_code column of "HashFile".

-----Original Message-----
From: Alexandre Stoulov [mailto:alexs@tern.ru]
Sent: Thursday, 18 October 2001 17:27
To: Tony Curcio
Subject: Re[2]: Lookup question


Hi guyes!

What do you think about such decision:
I use Uv table satge as lookup stage and user-defined sql:

SELECT HashFile.lookup_code, HashFile.name
FROM "HashFile"
WHERE HashFile.lookup_code = (SELECT MAX(HashFile.lookup_code)
FROM "HashFile"
WHERE HashFile.lookup_code =
SUBSTRING(? FROM 1 FOR CHAR_LENGTH(HashFile.lookup_code)));



Best regards,Alexandre Stoulov
------------------------------
Consultant. Business Intelligence Projects
TERN Company
mailto:alexs@tern.ru

TC> Sorry, in my sample code, it should have been a "Not(IsNull(...))".
TC> Tony Curcio wrote:
TC> The way I would do this is to define several hash lookups wach for
TC> the
same file while varying the "Key Derivation". The first lookups derivation would be "inputlink.inputfield[1,1]" to check
TC> for a match for the first character. The second lookups derivation
would be "inputlink.inputfield[1,2]" to check for a match for the first two characters. You can repeat this for as many
TC> elements as you would like to match. In your ouput link, your code
TC> would
check for a NULL condition for each of the links until you get a match. If you had five lookup links and they were named
TC> something like link_match_1char, link_match_2char, etc, the
TC> derivation
would look something like the following
TC> if IsNull(link_match_5char.key) then link_match_5char.key else if
IsNull(link_match_4char.key) then link_match_4char.key else if
IsNull(link_match_3char.key) then link_match_3char.key else if
TC> IsNull(link_match_2char.key) then link_match_2char.key else if
IsNull(link_match_1char.key) then link_match_1char.key else "Not Found".
TC> ---Tony

TC> Alexandre Stoulov wrote: Hi all,

TC> Does anybody knows how to solve this task:
TC> I have an input stream (flat file) that contains character field
STREAM_CODE. One
TC> value for this field for example "823102". Also I have a lookup
TC> table (hash file) with lookup key field LOOKUP_CODE that contains
TC> values: "821", "82","823","8233". The task is to get the string from
TC> lookup table which LOOKUP_CODE is a substring of the STREAM_CODE and
TC> has maximum matching. In our example this string has LOOKUP_CODE
TC> value "823".

TC> Best regards,Alexandre Stoulov
TC> ------------------------------
TC> Consultant. Business Intelligence Projects
TC> TERN Company
TC> mailto:alexs@tern.ru




TC> ---------------------------------
TC> Do You Yahoo!?
TC> Make a great connection at Yahoo! Personals.


TC> ---------------------------------
TC> Do You Yahoo!?
TC> Make a great connection at Yahoo! Personals.
Locked