Hi All,
Can anybody explain how we can do joins in Server jobs when the data is to be extracted from multiple sources such as Oracle, SQL Server, Sybase, flat files, etc.?
Thanks in Advance.
Regds
How to join multiple database sources
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 82
- Joined: Fri Jun 03, 2005 5:23 am
- Location: Bangalore
- Contact:
Rajasekhar
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
In server jobs you use the Transformer stage with one stream input link and one or more reference input links.
The stream input link delivers a stream of rows from one data source (some joins may already have been performed there). Each reference input link connects to another data source and is supplied with one or more key values per row, with which it performs a "get the row that has this key" lookup.
If a data source is remote, or does not support key-based lookups (as with flat files), you are better off making a local copy for the reference input. This is precisely the function of the Hashed File stage; hashed files provide the fastest lookup-by-key capability.
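To make the pattern concrete, here is a minimal sketch in Python (not DataStage code) of what the Transformer-plus-Hashed-File design does: copy the reference source into a local hash keyed on the lookup column, then stream rows past it and fetch the matching row by key. The table and column names are made up for illustration.

```python
# Illustrative sketch only: mimics a Server job's Transformer stage with a
# Hashed File lookup. The data and column names are invented for the example.

# Reference source (e.g. rows extracted from Oracle), copied into a local
# hash keyed on the lookup column -- the role the Hashed File stage plays.
oracle_customers = [
    {"cust_id": 1, "name": "Acme"},
    {"cust_id": 2, "name": "Globex"},
]
lookup = {row["cust_id"]: row for row in oracle_customers}

# Stream input (e.g. rows from a flat file or SQL Server). For each row,
# the Transformer performs a "get the row that has this key" lookup.
stream_rows = [
    {"order_id": 10, "cust_id": 2, "amount": 99.5},
    {"order_id": 11, "cust_id": 3, "amount": 12.0},  # no match -> NULL name
]

joined = []
for row in stream_rows:
    ref = lookup.get(row["cust_id"])  # key-based lookup via the reference link
    joined.append({**row, "name": ref["name"] if ref else None})

print(joined)
```

Unmatched keys fall through with a NULL, just as a failed reference lookup does in a Transformer.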
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks, Ray, for your valuable input
ray.wurlod wrote: In server jobs you use the Transformer stage with one stream input link and other reference input links. […]
Rajasekhar
Re: Thanks, Ray, for your valuable input
Thank you very much.
Rajasekhar
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
IILive2005 included a session on using the full IBM Information Integration suite (not just the Ascential products). With Information Integrator you can create a join across heterogeneous data sources and then process the result with DataStage. You can also use Information Integrator replication to trickle-feed a data warehouse.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn