Adavantages of using db2 query or data stage??????
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 145
- Joined: Fri Sep 15, 2006 12:23 pm
Adavantages of using db2 query or data stage??????
Hi,
Could any one of you let me know what is more advantageous? joining in the query itself, or using a lookup in data stage? having multiple case statements in DB2 or if conditions using transformers???
Please support your answers with factual information or do let me know the right place to dig this information.
Many Thanks,
Waseem
Could any one of you let me know what is more advantageous? joining in the query itself, or using a lookup in data stage? having multiple case statements in DB2 or if conditions using transformers???
Please support your answers with factual information or do let me know the right place to dig this information.
Many Thanks,
Waseem
Its difficult to generalise about this question, it depends on the query, data demographics, and the flavour and set-up of the database etc. Generally I'd say its better to do joins and lookups in the RDBMS where possible. Its what they're designed for and it means there's less moving around of data between environments etc. That's providing the data you want to join or use as a lookup reference is in the DB already. If its in a file or input stream then you would have to weigh up the overheads of loading data to a temp table and joining in the DB versus using a transformer stage. If its just a few rows then a transformer may well be quicker, while if its a lot then the row-by-row lookup could be very slow and it might be better to do a bulk-load and inner join.
------------------
John Thompson
John Thompson
-
- Premium Member
- Posts: 145
- Joined: Fri Sep 15, 2006 12:23 pm
Again it depends. Ask your DBA. Depending upon how heavily the database server is loaded and/or depending upon the temp space your DBA will give you and Ok or not. If the DBA agrees, nothing like it. You can pass your sql join and get the result for further processing. Else do it in datastage.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Well it is difficult it generilize and would depend on case to case basis. Following is some guideline which I follow:
If you are joing table A and table B and the ratio of records in between the tables:
1. Is 1:1 definitly do the job in database.
2. Is 1000:1 consider doing the join in DataStage.
3. Is less than 1000:1 consider doing the join in database.
Hope it helps....
If you are joing table A and table B and the ratio of records in between the tables:
1. Is 1:1 definitly do the job in database.
2. Is 1000:1 consider doing the join in DataStage.
3. Is less than 1000:1 consider doing the join in database.
Hope it helps....
Assume everything I say or do is positive
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Anything that can leverage indexes do in the database. This can include restrictions, joins, some groupings and sorting (if there is a B-tree index on the column being sorted).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 145
- Joined: Fri Sep 15, 2006 12:23 pm
Well, I know my question is vague and not precise. However, I would still want to compare if it is better to have 50 case statements in one query or to have a transformer where i can define those 50 IF conditions??? and what would be the max case statements should be used incase, if the data is of high volume.
For 8-10 million rows
In ideal world, it would be better to use a lookup stage in such scenario.
But based on how your database/table is loaded, you can also code it in the SQL statement.
However do not use a transformer as it has no specific advantage over a lookup.
But based on how your database/table is loaded, you can also code it in the SQL statement.
However do not use a transformer as it has no specific advantage over a lookup.
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
I dislike any transformations in the SQL select statement as they are mostly invisible to metadata reporting and can be hard to document and maintain. I would argue both the 50 case statement and 50 IF statement are both flawed. You should be looking at creating some type of business rule mapping table on your ETL server that transposes the cases to values stored in a table. If you really need 50 lines of code for a single transformation then consider a custom coded operator in C++ that can be properly built and tested.
I do support joining tables on a SQL select in order to reduce the volume of rows being brought onto the ETL server. Just as good is putting a layer of database views over the source tables that have been properly designed and optimised.
I do support joining tables on a SQL select in order to reduce the volume of rows being brought onto the ETL server. Just as good is putting a layer of database views over the source tables that have been properly designed and optimised.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn