Any way to get the count of particular column

gowrishankar_h
Participant
Posts: 42
Joined: Wed Dec 26, 2012 1:13 pm


Post by gowrishankar_h »

Hi,


I have a requirement to read several datasets, filter particular columns on hard-coded values, and write the count of matching rows for each column to a separate dataset.


example:

I have 2 datasets:
1) dataset1 (col1,col2,col3,col4)
2) dataset2 (col21,col22,col33,col44)

I need to read those 2 datasets, filter col1 on a hard-coded value such as 'A', col2 on 'B', etc., get the count for each individual column, and write the counts to a third dataset as below.

dataset3(count_col1,count_col2,count_col3 etc)


Note: I can't use the Aggregator stage, because I have to read many datasets and count many columns, and it would affect performance. So first I filtered each individual column and wrote the result to a separate dataset, then used an External Source stage to count the number of records in each dataset. With this method, though, I need a separate dataset for every individual column. Is there any other way to reduce the number of datasets without affecting performance?


Thanks in advance
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You CAN use Aggregator stage. If you precede it by a Sort stage, the Aggregator stage will be fast.

You could perform the counts in a Transformer stage, but that would involve setting up a stage variable for each count, which you may regard as tedious.
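
To illustrate the counting logic only, here is a minimal Python sketch of the single-pass idea: one running counter per filtered column, incremented whenever that column matches its hard-coded value. This is not DataStage code; the `FILTERS` mapping, the `count_matches` function and the sample rows are hypothetical, and the column names and values are taken from the example in the question. In a Transformer stage each counter would presumably correspond to one stage variable.

```python
from typing import Dict, Iterable, Mapping

# Hypothetical filter rules: column name -> hard-coded value to match.
FILTERS: Dict[str, str] = {"col1": "A", "col2": "B"}


def count_matches(
    rows: Iterable[Mapping[str, str]], filters: Mapping[str, str]
) -> Dict[str, int]:
    """Count, in a single pass, the rows matching each column's hard-coded value."""
    # One counter per filtered column, playing the role of a stage variable.
    counts = {f"count_{col}": 0 for col in filters}
    for row in rows:
        for col, wanted in filters.items():
            if row.get(col) == wanted:
                counts[f"count_{col}"] += 1
    return counts


# Made-up sample rows shaped like dataset1 (col1, col2, col3, col4).
dataset1 = [
    {"col1": "A", "col2": "B", "col3": "x", "col4": "y"},
    {"col1": "A", "col2": "C", "col3": "x", "col4": "y"},
    {"col1": "Z", "col2": "B", "col3": "x", "col4": "y"},
]

result = count_matches(dataset1, FILTERS)
print(result)
```

Because every counter is updated in the same pass over the input, no intermediate per-column dataset is needed; the final counts form the single output row for dataset3.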
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.