Record counts per partition



crystal_pup
Participant
Posts: 62
Joined: Thu Feb 08, 2007 6:01 am
Location: Pune

Post by crystal_pup »

You can try something like this:

1) Use a Row Generator stage as the source (generate, say, 200 rows).
2) Use a Transformer stage and pass the value of the @PARTITIONNUM system variable to an output column, e.g. Part. Use Round Robin partitioning on the input link.
3) Use an Aggregator stage to count the rows for each value of the Part column, and write the result to an output column, e.g. Cnt.

I tried it and got the following result:

Peek_6,0: Part:0 Count:50
Peek_6,3: Part:1 Count:50
Peek_6,3: Part:2 Count:50
Peek_6,1: Part:3 Count:50
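
The even split above is what round robin distribution would be expected to produce. As a rough illustration only, here is a minimal Python sketch that simulates the idea outside DataStage: it assumes 200 rows dealt in turn across 4 partitions and then counts the rows per partition, which mirrors the Transformer plus Aggregator steps. The function name round_robin_partition_counts is hypothetical and is not part of any DataStage API.

```python
from collections import Counter


def round_robin_partition_counts(total_rows: int, num_partitions: int) -> Counter:
    """Simulate round robin partitioning and count rows per partition.

    Hypothetical helper for illustration; not a DataStage function.
    """
    counts = Counter()
    for row_index in range(total_rows):
        # Round robin: each successive row goes to the next partition in turn.
        partition = row_index % num_partitions
        counts[partition] += 1
    return counts


counts = round_robin_partition_counts(200, 4)
for part in sorted(counts):
    print(f"Part:{part} Count:{counts[part]}")
```

With 200 rows and 4 partitions the simulation gives 50 rows per partition, consistent with the Peek output above.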
srinivas.nettalam
Participant
Posts: 134
Joined: Tue Jun 15, 2010 2:10 am
Location: Bangalore

Post by srinivas.nettalam »

crystal_pup wrote:You can try something like this :-
2) Use a transformer stage and pass on @PARTITIONNUM system variable value to some output column for eg:- Part. Use Round robin partition on the input link.
Round Robin need not necessarily be used; the right partitioning method depends on the data in specific cases. Aggregating with @PARTITIONNUM as the grouping key is the solution, and the rest varies with the requirement.
N.Srinivas
India.
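
To illustrate the point that the partitioning method depends on the data, here is a small Python sketch (a simulation, not DataStage code) that assumes a key-based modulus partitioning over made-up, deliberately skewed integer keys. Counting rows grouped by partition number still works in the same way, but the counts are no longer even. The function name modulus_partition_counts and the sample keys are hypothetical.

```python
from collections import Counter


def modulus_partition_counts(keys, num_partitions):
    """Count rows per partition when partition = key mod num_partitions.

    Hypothetical helper for illustration; not a DataStage function.
    """
    return Counter(key % num_partitions for key in keys)


# Made-up, skewed key values: 200 rows in total.
skewed_keys = [0] * 120 + [1] * 50 + [2] * 20 + [3] * 10

mod_counts = modulus_partition_counts(skewed_keys, 4)
for part in sorted(mod_counts):
    print(f"Part:{part} Count:{mod_counts[part]}")
```

The total is still 200 rows, but the per-partition counts follow the key distribution rather than splitting evenly.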
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I suspect that the interviewers were seeking more in-depth knowledge of the DataStage API. In particular, the function DSGetLinkInfo() can retrieve the total row count for the link (DSJ.LINKROWCOUNT) or a list of row counts per instance (node) using DSJ.INSTROWCOUNT.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

The dsjob command would be the other way.
Mamu Kim