I am using an UV stage for inserting multiple records for a single record in the source . This turns out to be quite slow hence i have got the
multiple records in the source by doing a join (using sql). Is there anyway we can optimize the performance of a UV stage
Regards
UV stage slow
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
Which part of the process is slow? The population of the UV table, the join statement or the processing of the multiplied rows?
What are your requirements for turning a single row into multiple rows? Perhaps there is a more efficient way to do it without touching data down to a UV table.
What are your requirements for turning a single row into multiple rows? Perhaps there is a more efficient way to do it without touching data down to a UV table.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Do a run of your job from Director. In the Job Run Options dialog (where you enter parameters, etc.) choose the Tracing tab, select the active stage (Transformer stage) and enable collection of statistics.
One extra event will be logged; it will tell you what proportion of the time was spend doing what, so you can identify the "hot spots".
Why did you choose a UV stage? Surely a hashed file stage would be faster - you can cache it in memory for writing, for one thing. (It's not clear from your post what you're trying to accomplish.)
You can optimize selection using a join in a UV stage in the same way you would in any other database, by indexing the columns participating in the join. But I'm not sure this is what you need.
One extra event will be logged; it will tell you what proportion of the time was spend doing what, so you can identify the "hot spots".
Why did you choose a UV stage? Surely a hashed file stage would be faster - you can cache it in memory for writing, for one thing. (It's not clear from your post what you're trying to accomplish.)
You can optimize selection using a join in a UV stage in the same way you would in any other database, by indexing the columns participating in the join. But I'm not sure this is what you need.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 892
- Joined: Thu Oct 16, 2003 5:18 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
No, not clear at all. A hashed file will take all records, but only the last will remain, since every update to a hashed file is a destructive overwrite.
A UV stage will only take matching records if you use one of the slow, double-operation, rules (insert or update, or update or insert, or replace (delete then insert)). After all, the UV stage uses SQL, which you can view.
What - exactly - are you trying to accomplish?
A UV stage will only take matching records if you use one of the slow, double-operation, rules (insert or update, or update or insert, or replace (delete then insert)). After all, the UV stage uses SQL, which you can view.
What - exactly - are you trying to accomplish?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 892
- Joined: Thu Oct 16, 2003 5:18 am
Ray,
I am using the Hash file for lookup (not for writing into it). Hence if there
are multiple matching records it would take the first matching record.
If i use a UV stage for lookup then i am able to transfer more than
one matching record to the target. But the issue is that the UV stage
is slow.
Regards
I am using the Hash file for lookup (not for writing into it). Hence if there
are multiple matching records it would take the first matching record.
If i use a UV stage for lookup then i am able to transfer more than
one matching record to the target. But the issue is that the UV stage
is slow.
Regards
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
If you're performing a lookup against a hashed file, you're getting the only match. Not the first, not the last, the only match.
Create an index on the UV table underlying the UV stage, on the key column(s). If there are any other constrained columns, index these too.
Get to a TCL prompt (or use the Adminstrator client's Command window) and execute the generated SQL with the EXPLAIN keyword appended (ahead of the semi-colon). This will show you that the default SELECT uses a table scan. Of course it's slow. You haven't done anything to assist the query. UV tables are not automatically indexed on any column; if you're selecting on the entire primary key it can use a hashing algorithm; your requirement suggests that you are selecting on a secondary key. Index it.
If you really want slow, substitute an ODBC stage!
You will never, ever, get the performance out of a UV stage (which is using SQL to query a disk-based table) that you will get out of a Hashed File stage (which uses hashing algorithm to directly access the required record in memory). Memory access speed is at least 1000 times faster than disk access speed.
Create an index on the UV table underlying the UV stage, on the key column(s). If there are any other constrained columns, index these too.
Get to a TCL prompt (or use the Adminstrator client's Command window) and execute the generated SQL with the EXPLAIN keyword appended (ahead of the semi-colon). This will show you that the default SELECT uses a table scan. Of course it's slow. You haven't done anything to assist the query. UV tables are not automatically indexed on any column; if you're selecting on the entire primary key it can use a hashing algorithm; your requirement suggests that you are selecting on a secondary key. Index it.
If you really want slow, substitute an ODBC stage!
You will never, ever, get the performance out of a UV stage (which is using SQL to query a disk-based table) that you will get out of a Hashed File stage (which uses hashing algorithm to directly access the required record in memory). Memory access speed is at least 1000 times faster than disk access speed.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.