Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.
Moderators: chulett , rschirm , roy
seanc217
Premium Member
Posts: 188 Joined: Thu Sep 15, 2005 9:22 am
Post
by seanc217 » Thu Aug 17, 2006 3:30 pm
I have 2 datasets in an Oracle database that I want to combine. So for example let's say table 1 has the following values:
A
B
C
and table 2 has the following values:
D
E
F
I want the result dataset to be
A
B
C
D
E
F
What is the best way of going about this?
Thanks
narasimha
Charter Member
Posts: 1236 Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY
Post
by narasimha » Thu Aug 17, 2006 3:42 pm
Do a 'UNION ALL' between Table 1 and Table 2
Narasimha Kade
Finding answers is simple, all you need to do is come up with the correct questions.
seanc217
Premium Member
Posts: 188 Joined: Thu Sep 15, 2005 9:22 am
Post
by seanc217 » Thu Aug 17, 2006 3:46 pm
Thanks for the reply,
Is there a way for DataStage to do this kind of operation?
Or is utilizing the database the best way?
narasimha
Charter Member
Posts: 1236 Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY
Post
by narasimha » Thu Aug 17, 2006 3:51 pm
It is best if you do it on the database side. It is always the fastest method.
There are multiple ways of doing it in datastage, but do you want to go that path?
Narasimha Kade
Finding answers is simple, all you need to do is come up with the correct questions.
kris007
Charter Member
Posts: 1102 Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI
Post
by kris007 » Thu Aug 17, 2006 3:52 pm
Look into the Funnel Stage.
Kris
Where's the "Any" key?-Homer Simpson
DSguru2B
Charter Member
Posts: 6854 Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX
Post
by DSguru2B » Thu Aug 17, 2006 3:59 pm
Union All is the best way as advised by others. Its a simple sql that you need to write and get it done. Are the two tables in different databases? If yes then use the funnel stage. You can also create two files and combine them at the OS level.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
meena
Participant
Posts: 430 Joined: Tue Sep 13, 2005 12:17 pm
Post
by meena » Thu Aug 17, 2006 4:02 pm
Yes,it is best to do in oracle database directly instead of going to other stage..When we are designing a job we consider the performance of the job..I prefer to do it on database side..
prabu
Participant
Posts: 146 Joined: Fri Oct 22, 2004 9:12 am
Post
by prabu » Fri Aug 18, 2006 3:52 am
narasimha wrote: Do a 'UNION ALL' between Table 1 and Table 2
You also should undertand that
will preserve the duplicates, say value 'A' is available in both datasets, your generated result set will have
will suppress duplicates
UNION ALL will make use of parallelism but UNION cannot.
devnull
Premium Member
Posts: 37 Joined: Wed Mar 29, 2006 11:17 am
Location: Minneapolis, MN USA
Post
by devnull » Fri Aug 18, 2006 11:22 am
prabu wrote: narasimha wrote: Do a 'UNION ALL' between Table 1 and Table 2
You also should undertand that
will preserve the duplicates, say value 'A' is available in both datasets, your generated result set will have
will suppress duplicates
UNION ALL will make use of parallelism but UNION cannot.
And if that is the case, then the Funnel with a Sort is your best option.
Michael Mabin
Minneapolis, MN USA
d3vvnull@com.gmail
(Reverse com and gmail to send email)
narasimha
Charter Member
Posts: 1236 Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY
Post
by narasimha » Fri Aug 18, 2006 12:04 pm
devnull wrote:
And if that is the case, then the Funnel with a Sort is your best option.
May I know why you are suggesting this? May be I am missing something...
Narasimha Kade
Finding answers is simple, all you need to do is come up with the correct questions.
prabu
Participant
Posts: 146 Joined: Fri Oct 22, 2004 9:12 am
Post
by prabu » Fri Aug 18, 2006 12:10 pm
narasimha wrote: devnull wrote:
And if that is the case, then the Funnel with a Sort is your best option.
May I know why you are suggesting this? May be I am missing something...
only reason i can think of is , you may get a satisfaction of using more stages in your job [if you think for some more time, you may also 'feel' remove duplicate stage is required]