Ignore one or more keys of an hashed file in a lookup
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 56
- Joined: Mon Mar 26, 2007 8:48 am
- Location: Blue Bay (La Spezia)
Ignore one or more keys of an hashed file in a lookup
Hi all,
I've an hashed file generated by another process. In my job I need to use this hashed file but not all its keys are useful to me.
i.e.
My stream has 2 fields:
- code1
- descriptionField
My reference (the hashed file) has:
- code1 (key)
- code2 (key)
- other column (not a key)
Is it possible to link my stream with the hashed file only by using the key "code1" (the value of code2 of the hashed file is not relevant in this lookup)?
I tried to put 1=1 in the Key Expression of a transformer stage for code2 but It doesn't run.
Thanks a lot for any help.
Ale
I've an hashed file generated by another process. In my job I need to use this hashed file but not all its keys are useful to me.
i.e.
My stream has 2 fields:
- code1
- descriptionField
My reference (the hashed file) has:
- code1 (key)
- code2 (key)
- other column (not a key)
Is it possible to link my stream with the hashed file only by using the key "code1" (the value of code2 of the hashed file is not relevant in this lookup)?
I tried to put 1=1 in the Key Expression of a transformer stage for code2 but It doesn't run.
Thanks a lot for any help.
Ale
...
-
- Participant
- Posts: 56
- Joined: Mon Mar 26, 2007 8:48 am
- Location: Blue Bay (La Spezia)
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
UV compatibility is only about date-type data. Every hashed file can be accessed via a UV stage, though you may need to create a pointer to a directory-based hashed file in the VOC (search the forum for SETFILE).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 56
- Joined: Mon Mar 26, 2007 8:48 am
- Location: Blue Bay (La Spezia)
D'oh, I was in fault.
Thanks Ray, UV stage works fine!
Actually I create the VOC pointer in the before-stage subroutine of the transformer where I perform the lookup.
Let's suppose I need to create there a second pointer, is it possible to define both the two pointers into the same value field of the before stage subroutine statement of the transformer or I need a workaround?
Thanks again,
Ale.
Thanks Ray, UV stage works fine!
Actually I create the VOC pointer in the before-stage subroutine of the transformer where I perform the lookup.
Let's suppose I need to create there a second pointer, is it possible to define both the two pointers into the same value field of the before stage subroutine statement of the transformer or I need a workaround?
Thanks again,
Ale.
...
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Are you using a custom before-job subroutine or one of the out-of-the-box ones? If the latter you can easily queue as many commands as you like, with @FM between them. If you are using, say, ExecTCL, then you need a workaround. The easiest is to execute each of the commands in turn in Administrator client Command window, then multi-select those commands and save them under a meaningful name. That name is then the command you might execute from ExecTCL.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Usually there is more to it than simply putting a UV stage over the hashed file. As noted, you need a VOC entry and for many people that's unknown territory. It can also involve building / maintaining indexes over the remaining keys, otherwise lookup performance can be pitiful.
Glad you got it sorted out, but why do you feel the need to create a 'second pointer'? A little lost there. And in your routine, do you remove the VOC record each time, or OVERWRITE the old one... or?
Glad you got it sorted out, but why do you feel the need to create a 'second pointer'? A little lost there. And in your routine, do you remove the VOC record each time, or OVERWRITE the old one... or?
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 56
- Joined: Mon Mar 26, 2007 8:48 am
- Location: Blue Bay (La Spezia)
Hi Craig,
you're right, I tried to simplify my question in my first post.
The situation is the following one.
My stream is a sequential file containing some economical data regarding branches or other kind of offices. The fields:
- a code identifying the single branch or office
- other values not relevant for the lookup
Branches and offices are two different entities and each branch (just as each office) is identified by an unique id. Even if these are two different entities, the identifiers have been defined in order to be unique (an intersection between the set of branches and the set of offices on their identifier gives a null set).
Other procedures generate two different hashed files (one for the branches an one for the offices), and both these hashed files have more key columns then those I need.
Branch hashed file:
- branch code (key)
- institute code (key, superfluous in my case)
- surrogate code of the branch in the dwh
Office hashed file:
- office code (key)
- institute code (key, superfluous in my case)
- surrogate code of the office in the dwh
In a transformer stage I need to perform a referential integrity check with both these two hashed files, in order to reject the rows of my sequential file that does not identify neither a branch nor an office.
This is the reason why I need two UV stages instead of one, and I have to define two pointers instead of one.
Actually I define a pointer with SETFILE in the before-job subroutine (an execTCL) and one in the before-stage subroutine of the transformer where I perform the lookup, but I know that it's not very nice to view . I will try the solution suggested by Ray (using @FM) as soon as our server become available again (it's down for maintenance now ).
Being the format of the fields of the two hashed files the same, an alternative I tried was to perform a SELECT UNION into a single UV stage, but I had several problems with the definition of the where condition.
I tried:
and
but both failed..
ps: Here I overwrite the pointers before using them each time.
you're right, I tried to simplify my question in my first post.
The situation is the following one.
My stream is a sequential file containing some economical data regarding branches or other kind of offices. The fields:
- a code identifying the single branch or office
- other values not relevant for the lookup
Branches and offices are two different entities and each branch (just as each office) is identified by an unique id. Even if these are two different entities, the identifiers have been defined in order to be unique (an intersection between the set of branches and the set of offices on their identifier gives a null set).
Other procedures generate two different hashed files (one for the branches an one for the offices), and both these hashed files have more key columns then those I need.
Branch hashed file:
- branch code (key)
- institute code (key, superfluous in my case)
- surrogate code of the branch in the dwh
Office hashed file:
- office code (key)
- institute code (key, superfluous in my case)
- surrogate code of the office in the dwh
In a transformer stage I need to perform a referential integrity check with both these two hashed files, in order to reject the rows of my sequential file that does not identify neither a branch nor an office.
This is the reason why I need two UV stages instead of one, and I have to define two pointers instead of one.
Actually I define a pointer with SETFILE in the before-job subroutine (an execTCL) and one in the before-stage subroutine of the transformer where I perform the lookup, but I know that it's not very nice to view . I will try the solution suggested by Ray (using @FM) as soon as our server become available again (it's down for maintenance now ).
Being the format of the fields of the two hashed files the same, an alternative I tried was to perform a SELECT UNION into a single UV stage, but I had several problems with the definition of the where condition.
I tried:
Code: Select all
SELECT code, surrogate_code FROM
(
SELECT branch_code AS code, branch_surrogate_code AS surrogate_code FROM branch_pointer
UNION ALL
SELECT office_code AS code, office_surrogate_code AS surrogate_code FROM office_pointer
) AS T
WHERE code = ?;
Code: Select all
SELECT branch_code, branch_surrogate_code FROM branch_pointer WHERE branch_code = ?
UNION ALL
SELECT office_code, office_surrogate_code FROM office_pointer WHERE office_code = ?;
ps: Here I overwrite the pointers before using them each time.
...
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
What error message do you get when the query "failed"? Just "it failed" does not help anyone to diagnose it. Have you tried
There are two parameter markers in this query; have you provided two key values to it from your job?
Code: Select all
SELECT branch_code, branch_surrogate_code FROM branch_pointer WHERE branch_code = '? '
UNION ALL
SELECT office_code, office_surrogate_code FROM office_pointer WHERE office_code = '?';
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 56
- Joined: Mon Mar 26, 2007 8:48 am
- Location: Blue Bay (La Spezia)
The errors I get from the statements I posted are:
SQLSTATE=S0002, DBMS.CODE=950390
[DataStage][SQL Client][UNIVERSE]DataStage/SQL: Table "(" does not exist.
SQLSTATE=S1000, DBMS.CODE=950125
[DataStage][SQL Client][UNIVERSE]DataStage/SQL: QUESTIONMARK not supported in TCL.
This second one puzzles me because if I write only:
It runs well.
The query in the primary input (with the union) runs well, I can view the data from the two hashed files correctly united. The problem is only in the query for reference, when I specify the where condition..
SQLSTATE=S0002, DBMS.CODE=950390
[DataStage][SQL Client][UNIVERSE]DataStage/SQL: Table "(" does not exist.
SQLSTATE=S1000, DBMS.CODE=950125
[DataStage][SQL Client][UNIVERSE]DataStage/SQL: QUESTIONMARK not supported in TCL.
This second one puzzles me because if I write only:
Code: Select all
SELECT office_code, office_surrogate_code FROM office_pointer WHERE office_code = ?;
The query in the primary input (with the union) runs well, I can view the data from the two hashed files correctly united. The problem is only in the query for reference, when I specify the where condition..
...
Try
Code: Select all
SELECT code, surrogate_code FROM
(
SELECT branch_code AS code, max(branch_surrogate_code) AS surrogate_code FROM branch_pointer
UNION ALL
SELECT office_code AS code, max(office_surrogate_code) AS surrogate_code FROM office_pointer
) AS T
WHERE code = ?;
Wolfgang Hürter
Amsterdam
Amsterdam
-
- Participant
- Posts: 56
- Joined: Mon Mar 26, 2007 8:48 am
- Location: Blue Bay (La Spezia)
I get the same error (Table "(" does not exist.)WoMaWil wrote:Try
Code: Select all
SELECT code, surrogate_code FROM ( SELECT branch_code AS code, max(branch_surrogate_code) AS surrogate_code FROM branch_pointer UNION ALL SELECT office_code AS code, max(office_surrogate_code) AS surrogate_code FROM office_pointer ) AS T WHERE code = ?;
...
When you have two question marks, do you have two columns marked as Key fields as well?Alethesnake wrote:This second one puzzles me because if I write only:It runs well.Code: Select all
SELECT office_code, office_surrogate_code FROM office_pointer WHERE office_code = ?;
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 56
- Joined: Mon Mar 26, 2007 8:48 am
- Location: Blue Bay (La Spezia)
Yes, I do.
I build my statement as follow.
Output Columns:
- OFFICE_CODE , key, not null, integer
- BRANCH_CODE , key, not null, integer
- OFFICE_BRANCH_SURR_CODE, not null, integer
And the reference sql statement is:
@WoMaWil: I can solve also by performing a double lookup in the same transformer without the creation of a new hashed file but I'd like to understand why the union as primary sql runs well and as reference no.
I build my statement as follow.
Output Columns:
- OFFICE_CODE , key, not null, integer
- BRANCH_CODE , key, not null, integer
- OFFICE_BRANCH_SURR_CODE, not null, integer
And the reference sql statement is:
Code: Select all
SELECT office_code, NULL as tempField, office_surrogate_code FROM office_pointer WHERE (office_code = ?)
UNION ALL
SELECT branch_code, NULL as tempField, branch_surrogate_code FROM branch_pointer WHERE (branch_code = ?);
...