We are using Oracle 10g as our source database. When we extract the data using the OCI stage, we sort it by adding an ascending ORDER BY on the Keys and the Date at the end of the query. After the data is extracted, we split it into two streams on the basis of date: records with Date < 01-01-2005 00:00:00 go to "Stream 1", whereas records with Date >= 01-01-2005 00:00:00 go to "Stream 2".
Now in Stream 1 we apply a Remove Duplicates stage on the Keys and retain the last record.
Is it possible that after splitting the data, Stream 1 may receive records in unsorted order, so that the Remove Duplicates stage yields incorrect results?
Preserving the Sorting after FILTER stage
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 221
- Joined: Fri Feb 17, 2006 3:38 am
- Location: India
- Contact:
Thanks & Regards
Parag Saundattikar
Certified for Infosphere DataStage v8.0
-
- Participant
- Posts: 25
- Joined: Fri Jan 11, 2008 12:49 am
- Location: Pune, India
-
- Participant
- Posts: 221
- Joined: Fri Feb 17, 2006 3:38 am
- Location: India
- Contact:
Actually, we are getting review comments from external contractors, who are working as DataStage professionals, about removing the Sort stage. They say that the Sort stage is not needed because the partitioning will be preserved.
Thanks & Regards
Parag Saundattikar
Certified for Infosphere DataStage v8.0
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Re: Preserving the Sorting after FILTER stage
parag.s.27 wrote: Is it possible that after splitting the data, Stream 1 may receive records in unsorted order, so that the Remove Duplicates stage yields incorrect results?
No.
There is nothing in what you are doing that would affect the sorted order.
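As a rough illustration of why the split leaves the order alone, here is a minimal Python sketch (not DataStage code). The sample rows, the `CUTOFF` constant and the `keep_last_per_key` helper are all hypothetical. It assumes the filter only drops rows and never reorders them, and that the data is not repartitioned between the split and the duplicate removal.

```python
from datetime import datetime
from itertools import groupby

# Hypothetical cutoff matching the split described in the question.
CUTOFF = datetime(2005, 1, 1)

# Hypothetical rows (key, date), already sorted by key and then date,
# as the ORDER BY in the extraction query would deliver them.
rows = [
    ("A", datetime(2004, 3, 1)),
    ("A", datetime(2004, 9, 15)),
    ("A", datetime(2005, 2, 1)),
    ("B", datetime(2003, 7, 4)),
    ("B", datetime(2006, 1, 1)),
    ("C", datetime(2004, 12, 31)),
]

# The split only selects rows; the relative order of the survivors is unchanged.
stream1 = [r for r in rows if r[1] < CUTOFF]
stream2 = [r for r in rows if r[1] >= CUTOFF]


def keep_last_per_key(sorted_rows):
    """Keep the last row of each run of adjacent rows sharing a key."""
    return [list(group)[-1] for _, group in groupby(sorted_rows, key=lambda r: r[0])]


deduped = keep_last_per_key(stream1)
for row in deduped:
    print(row)
```

Because `stream1` is still in key and date order, keeping the last row per key returns the latest pre-2005 record for each key.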
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
parag.s.27 wrote: Actually we are getting review comments from the external contractors who are working as a DataStage professionals about removing the Sort stage. They are saying that Sort stage is not needed as the partition will be preserved.
Partitioning and sorting are not the same thing. The comments are correct, but miss the point that a Remove Duplicates stage expects sorted input for most efficient operation.
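The difference between partitioning and sorting can be sketched in plain Python (again, not DataStage code). The sketch assumes that duplicate removal works by comparing adjacent rows; that is an assumption about the stage's behaviour, not something stated in this thread. The helpers `partition_by_key` and `keep_last_adjacent` and the sample rows are hypothetical.

```python
import zlib
from itertools import groupby


def partition_by_key(rows, n_partitions):
    """Send every row with the same key to the same partition (deterministic hash)."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        idx = zlib.crc32(row[0].encode("utf-8")) % n_partitions
        parts[idx].append(row)
    return parts


def keep_last_adjacent(rows):
    """Keep the last row of each run of adjacent rows sharing a key."""
    return [list(group)[-1] for _, group in groupby(rows, key=lambda r: r[0])]


# Hypothetical unsorted input: (key, sequence number).
unsorted_rows = [("A", 3), ("B", 1), ("A", 1), ("B", 2), ("A", 2)]
parts = partition_by_key(unsorted_rows, 2)

# Partitioned but not sorted: rows inside a partition keep their arrival order.
without_sort = sorted(r for p in parts for r in keep_last_adjacent(p))

# Partitioned and sorted within each partition before duplicate removal.
with_sort = sorted(r for p in parts for r in keep_last_adjacent(sorted(p)))

print(without_sort)
print(with_sort)
```

Partitioning only guarantees that all rows for a key land in the same partition. Only the sorted variant reliably returns the highest row per key, which is why the order the data arrives in matters and partitioning alone is not enough.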
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.