Page 1 of 2

Posted: Wed Aug 20, 2003 9:56 am
by tonystark622
Emma,

When I ran into this need, I created an extra field on the hash file that was a "composite" index, i.e. it contained the contents of both "key" fields, concatenated together and separated by some character (I used a tilde). So creating the field looked something like this:

Field1:"~":Field2

Then I created the secondary index on this new field (I called it Key1).

Then I would do the same thing when composing the key for this field in the lookup.

Hope this helps,
Tony

Posted: Wed Aug 20, 2003 9:57 am
by kduke
Emma

What are your trying to do? You can always combine two columns in an I-descriptor and create an index off of that. You need to use this new field in your lookup and not the two separate fields. You can add this new field by using the editor. Here is how it would look. You need to compile the dictionary item afterwards.

At TCL:

ED DICT MyHashFile CombinedFieldName
1: I
2: Field1 : @TM : Field2
3:
4: CombinedFieldName
5: 40L
6: S

Save it and:

CD MyHashFile CombinedFieldName

This will compile it. Then CREATE.INDEX. I put a @TM between the 2 fields. It does not have to be there.

Kim.

Kim Duke
DwNav - ETL Navigator
www.Duke-Consulting.com

Posted: Wed Aug 20, 2003 4:39 pm
by ray.wurlod
CREATE INDEX (the SQL variant) will create an index on multiple fields, but you have to name them both in the statement. You can also specify the sort order for each of the indexed columns' values. For example:

CREATE INDEX indexname ON tablename (colname1 ASC, colname2 DESC);

However, for this to work, the hashed file will need to have been created as a table (using a CREATE TABLE statement rather than CREATE.FILE or mkdbfile). That is, it can not have been created using a Hashed File stage; it would need to have been created using a UV stage.

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518

Posted: Thu Aug 21, 2003 9:47 am
by datastagedummy
This is not a reply but a follow up question.

I have a job in which I look up a UniverseStage which is being referenced with multi row result set and this job takes forever to run so what I did was I added an INDEX on the table at the before job subroutine
ExecTCL=CREATE INDEX TMPINDEX ON SAP_SALES_BOM_OUTPUT (MATERIAL_NBR, PLANT_ID) ;

This did improve the performance to 1 row/second [:(] from 0 rows/sec

Do you guys have any suggestions on how I can improve the performance by using indexes or anything else.

Thanks

Posted: Thu Aug 21, 2003 9:59 am
by kduke
Dummy

I sure would not build the index in the before routine. I would do it in a separate job or in the batch job or sequence job. The before routines are not really run before the job starts. They are sort of run at the same time as all the files or tables are opened. I am not sure that your index is ever used. Try creating it before the job and let us know what happens. A one to many lookup is a complicated thing to pull off. This is a new feature. I expect it to improve performance in new releases. Call it into support. You may cause it to be faster in new releases.

Kim.

Kim Duke
DwNav - ETL Navigator
www.Duke-Consulting.com

Posted: Thu Aug 21, 2003 11:30 am
by datastagedummy
Tried to change from Pre Job to a new job that runs before the job in question as suggested but it still is one row per second.

[V]

Posted: Thu Aug 21, 2003 12:42 pm
by kcbland
Yes, stop doing relational work on a storage structure optimized for 1-to-1 reference lookups.

Why are you doing multi-row lookups? Please state your functional requirements. If you're doing a ranging (BETWEEN) style select, there are ways to construct your lookup so as to give you the optimized reference lookup design for your hash file.

Thanks

Kenneth Bland

Posted: Thu Aug 21, 2003 12:47 pm
by kduke
Ken is right. Hash files are not the best choice here. Explain why you need this functionality and we can offer a better solution.

Kim.

Kim Duke
DwNav - ETL Navigator
www.Duke-Consulting.com

Posted: Thu Aug 21, 2003 3:52 pm
by kcbland
Sorry Emma, I was responding to "datastagedummy", who was usurping your message forum.

Kenneth Bland

Posted: Fri Aug 22, 2003 10:55 am
by datastagedummy
I am very Sorry Emma for using your forum.

Do I have your permission or should I open a new one for myself as my problem was INDEX related.

Sorry once again

Posted: Sat Aug 23, 2003 10:13 pm
by datastagedummy
My situation is that I have a COMPANY_CODE field(input) which returns me multiple PLANT_ID (from hash lookup and UVStage) and then I have to create an output of n rows for that particular COMPANY_CODE is there any better way of doing this.

eg :
Input :
CO_CODE = 110

Hash File/UV Lookup

Reference:
CO_CODE PLANT_ID
110 1030
110 1045
110 3090

Output :
MATERIAL_NO CO_CODE PLANT_ID
ABCDEF 110 1030
ABCDEF 110 1045
ABCDEF 110 3090

Any help that would get me up from the present speed of 6 rows/sec would be appreciated.

Thanks

Posted: Sun Aug 24, 2003 1:16 am
by spracht
Hi

is the job logic and the structure of the hash file the same as originally posted? If so, you would need a secondary index on CO_CODE rather than MATERIAL_NO, PLANT_ID? Where is your stream input coming from? If it's a hash file or uv table as well, you could interchange stream and reference. Or join the two tables within a uv stage.

Stephan

Posted: Sun Aug 24, 2003 3:48 am
by spracht
Dummy, (I really hesitated to call you like that), I hope you don't mind that I ask Ken on the range (between) style select within this topic?

Ken

I had a couple of situations when I had to do such a range style lookup and couldn't help but to do a 'multi row result set' reference and subsequenly excluding the records not wanted using a constraint. I would be very glad to hear how an optimized lookup design could look like.

Thanks in advance

Stephan