logic
Moderators: chulett, rschirm, roy
logic
hi,
I have to implement below logic in datastage(not with custom query)
SELECT COL1 FROM DBASE GROUP BY COL1 HAVING COUNT(*) > 1 .
as of now iam following below approach.
Use a copy stage to split the incoming records into two streams. One stream goes to an Aggregator stage that groups the records by key field(s) and counts the number of records in each group and outputs the results to the COUNT field. The output from Aggregator stage is then joined to the other stream using a Join stage on key field(s) and the results are then passed to Transformer stage. In Transformer, you could put a constraint like COUNT>2.
Is there any best way to do this?
I have to implement below logic in datastage(not with custom query)
SELECT COL1 FROM DBASE GROUP BY COL1 HAVING COUNT(*) > 1 .
as of now iam following below approach.
Use a copy stage to split the incoming records into two streams. One stream goes to an Aggregator stage that groups the records by key field(s) and counts the number of records in each group and outputs the results to the COUNT field. The output from Aggregator stage is then joined to the other stream using a Join stage on key field(s) and the results are then passed to Transformer stage. In Transformer, you could put a constraint like COUNT>2.
Is there any best way to do this?
sorry ,Small correction
Use a copy stage to split the incoming records into two streams. One stream goes to an Aggregator stage that groups the records by key field(s) and counts the number of records in each group and outputs the results to the COUNT field. The output from Aggregator stage is then joined to the other stream using a Join stage on key field(s) and the results are then passed to Transformer stage. In Transformer, you could put a constraint like COUNT>1
Use a copy stage to split the incoming records into two streams. One stream goes to an Aggregator stage that groups the records by key field(s) and counts the number of records in each group and outputs the results to the COUNT field. The output from Aggregator stage is then joined to the other stream using a Join stage on key field(s) and the results are then passed to Transformer stage. In Transformer, you could put a constraint like COUNT>1
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Your solution is probably the best way for versions of IS prior to 8.5. Make certain that you partition and sort correctly prior to the copy stage.
For 8.5 and above, a sort stage with keyChange enabled, followed by a transformer is the simplest. Set a stage variable to @FALSE when keyChange=1 and LastRowInGroup()=1 (this means the group has only one row), otherwise set it to @TRUE. Then in the constraint for your output link only allow the rows where the stage variable is @TRUE.
Regards,
For 8.5 and above, a sort stage with keyChange enabled, followed by a transformer is the simplest. Set a stage variable to @FALSE when keyChange=1 and LastRowInGroup()=1 (this means the group has only one row), otherwise set it to @TRUE. Then in the constraint for your output link only allow the rows where the stage variable is @TRUE.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
All generalizations are false, including this one - Mark Twain.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
It is always the case that a solution that works on two nodes will work on four, or 2000. Such is guaranteed by the way that the parallel execution engine works.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
@Jwilesjwiles wrote:Your solution is probably the best way for versions of IS prior to 8.5. Make certain that you partition and sort correctly prior to the copy stage.
For 8.5 and above, a sort stage with keyChange enabled, followed by a transformer is the simplest. Set a stage variable to @FALSE when keyChange=1 and LastRowInGroup()=1 (this means the group has only one row), otherwise set it to @TRUE. Then in the constraint for your output link only allow the rows where the stage variable is @TRUE.
Regards,
I have below clarification .Could you please clarify
when i gone through the below link
http://publib.boulder.ibm.com/infocente ... eaks2.html
we need to use LastRowInGroup() function with argument but u used without argument[ LastRowInGroup()=1 ].Please explain how it works without argument and correct me if i am wrong?
My example was not a literal example of how to use the function, it was an example with the intent of indicating that LastRowInGroup is the function you should use. Use it as the documentation indicates (i.e. with the required arguments).
"function_name()" is a common method of writing the name of a function in a text conversation. Sorry for the confusion...
Regards,
"function_name()" is a common method of writing the name of a function in a text conversation. Sorry for the confusion...
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
All generalizations are false, including this one - Mark Twain.