I have scenarion, where I have to join 6 different tables as the source.
The joining condtion includes several outer joins and self joins.
I am planning to write whole SQL in an ODBC stage which will be the source.
Is that the best option in DataStage?
Is there much we can do from DataStage?
Regards
Join of multiple tables
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
There are lots of options. Hash files are easy to do outer joins with. Depends on lots of factors on which option you choose. If your source systems are overloaded then you want to do it in hash files or off line as much as possible. Hash files may out perform any other method. It may not as well. It is not hard to test both. The SQL solution is the most common because a lot of users refuse to learn the flexibility and power of all of DataStage. You can also choose to use less of DataStage and more of what you know and understand or you can challenge youself to learn the best way which is whatever performs best and fits in with the customer's needs.
Mamu Kim
I think Sainath is correct in saying we need to see your SQL.
Each SELECT may need to update a hash file. The number of rows in all SQL statements would be nice too.
Each SELECT may need to update a hash file. The number of rows in all SQL statements would be nice too.
Last edited by kduke on Fri Aug 26, 2005 10:43 am, edited 1 time in total.
Mamu Kim
Doing all inner joins at the source level with Sql and use Hashfiles as lookups with right constraints to achive outer joins would be one way of getting good performance.arindami wrote:Thanks
Kduke , Do you mean to say using multiple hash files as I have 7 tables?
Sainnath, I did not get you about creating new stages.
I am creating a server job.
Again, depend on the volumes of the data which would go into hashfiles and the size they can grow.
~Kris