How to find the most dominant valuex

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
rafidwh
Participant
Posts: 179
Joined: Mon Oct 10, 2005 11:30 pm

How to find the most dominant valuex

Post by rafidwh »

Hi All,

I have the data as shown below and I need to pull the most dominant value, pl give me the best logic applicable

ID DT
1 X
1 X
1 Y
1 Y
1 Y
1 Z

There are two fields primary and secondary and my data should look like

ID DT Primary Secondary

1 X Y X
1 Y Y X
1 Z Y X

Here Y is the most domianant and X is the second most dominant.

Thanks in advance.
jhmckeever
Premium Member
Premium Member
Posts: 301
Joined: Thu Jul 14, 2005 10:27 am
Location: Melbourne, Australia
Contact:

Post by jhmckeever »

Assuming ...
1. By 'dominant' you mean 'frequent'
2. Your input data is sorted (as it is in your example)

Without designing the job in its entirety for you, here are a few suggestions ...

- Use an aggregator to generate a count of each DT value within each ID group

- Sort and propagate the first two rows (ie the two most frequent) on individual links - 'primary' and 'secondary' as apppropriate. You may want to use a key-change column and transformer stage variables to identify these.

- Rejoin primary and secondary columns to the input using the ID

That should hopefully be enough to get you started!

John.
Last edited by jhmckeever on Mon Dec 14, 2009 1:28 am, edited 1 time in total.
<b>John McKeever</b>
Data Migrators
<b><a href="https://www.mettleci.com">MettleCI</a> - DevOps for DataStage</b>
<a href="http://www.datamigrators.com/"><img src="https://www.datamigrators.com/assets/im ... l.png"></a>
rafidwh
Participant
Posts: 179
Joined: Mon Oct 10, 2005 11:30 pm

Post by rafidwh »

Is there a way to check without using a trnasformer stage.
rafidwh
Participant
Posts: 179
Joined: Mon Oct 10, 2005 11:30 pm

Post by rafidwh »

The first point ie the aggregator is bit confusing, how can i get count on DT within group ID
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

rafidwh wrote:Is there a way to check without using a trnasformer stage.
Probably. But why?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jhmckeever
Premium Member
Premium Member
Posts: 301
Joined: Thu Jul 14, 2005 10:27 am
Location: Melbourne, Australia
Contact:

Post by jhmckeever »

1. What's the objection/restriction around using Transformer stages?

2. Specify 'ID' and 'DT' as grouping keys and use an aggregation type of 'Count Rows'. This will count unique ID/DT combinations. Define a 'Count Output Column' and add a descending sort by this column.
<b>John McKeever</b>
Data Migrators
<b><a href="https://www.mettleci.com">MettleCI</a> - DevOps for DataStage</b>
<a href="http://www.datamigrators.com/"><img src="https://www.datamigrators.com/assets/im ... l.png"></a>
rafidwh
Participant
Posts: 179
Joined: Mon Oct 10, 2005 11:30 pm

Post by rafidwh »

Hi Ray,

I am bit confused on how to use aggregator if I want to use the third dominant also apart from first and second dominant value.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Why are you asking me? John McKeever suggested it, and explained correctly how to use it.

Third level = third grouping key (imagine you're doing it with an SQL query).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply