Identifying different groups based on column values
Moderators: chulett, rschirm, roy
Hi All,
I have a requirement to identify groups that have particular values in a specific column.
Suppose we have data like the following:
G1 M1 ST
G1 M2 PB
G1 M3 PD
G2 M4 P1
G2 M5 P2
G2 M6 P3
Then I need to identify group G1, since it has the values ST and PB in column 3.
Similarly, I have been given a set of values for Col3; if Col3 contains any of those values, then I need to filter out that entire group.
Is there any good way to implement this logic in Parallel Extender?
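To make the requirement concrete, here is a minimal sketch of the logic in plain Python (not DataStage), using the sample data above. `TARGET_VALUES` is a hypothetical set standing in for the given Col3 values:

```python
# Values in column 3 that disqualify a group (hypothetical example set).
TARGET_VALUES = {"ST", "PB"}

rows = [
    ("G1", "M1", "ST"),
    ("G1", "M2", "PB"),
    ("G1", "M3", "PD"),
    ("G2", "M4", "P1"),
    ("G2", "M5", "P2"),
    ("G2", "M6", "P3"),
]

# First pass: find groups that contain any target value in column 3.
flagged = {group for group, _, col3 in rows if col3 in TARGET_VALUES}

# Second pass: keep only rows whose group was not flagged.
kept = [row for row in rows if row[0] not in flagged]
```

With the sample data, `flagged` contains only G1, so all G1 rows are dropped and the G2 rows pass through. Note the two passes over the data, which is exactly why stage variables alone are not enough.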
--
Swathi Ch
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Not sure it's possible with just stage variables, as you'll basically need to check the data twice: once to see if you find those values in a group, and a second time to pass the group only if the first check didn't find anything.
I've done this with a hashed file and a Server job; that is pretty straightforward. Not quite sure of a PX approach, although I'm sure others will have ideas. The words 'fork' and 'join' come to mind.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
You create a reference hashed file with a key for your 'group' column and a data column that is just an indicator - yes/no, true/false, whatever - to show whether that group includes the target values. Do a lookup for each record and then write the result back to the same uncached hashed file so it stays in sync.
Set the indicator to true when the lookup succeeds and the incoming value is one of your target values. If the lookup succeeds and the incoming value is not one of the target values, do not update the hashed file. On the first lookup miss for each group, set the data value to true/false accordingly. When you are done you will have a hashed file that records which groups should pass through (flag is false) or should be filtered out (flag is true).
Then a second job (or another section in the same job after a process break) would check each record against the hashed file and constrain accordingly.
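As a rough illustration of that two-job flow, here is the same logic sketched in Python, with a dict standing in for the hashed file keyed on the group column. `TARGET_VALUES` and the sample rows are assumptions for the example:

```python
# Hypothetical set of disqualifying Col3 values.
TARGET_VALUES = {"ST", "PB"}

rows = [("G1", "M1", "ST"), ("G1", "M2", "PB"), ("G1", "M3", "PD"),
        ("G2", "M4", "P1"), ("G2", "M5", "P2"), ("G2", "M6", "P3")]

hashed = {}  # stands in for the hashed file: group -> indicator

# "First job": build the per-group indicator.
for group, _, col3 in rows:
    if group not in hashed:            # lookup miss: seed the indicator
        hashed[group] = col3 in TARGET_VALUES
    elif col3 in TARGET_VALUES:        # lookup hit with a target value
        hashed[group] = True           # otherwise leave the flag untouched

# "Second job": constrain each record against the indicator.
passed = [row for row in rows if not hashed[row[0]]]
```

Groups whose flag ends up true (G1 here) are filtered out; groups whose flag stays false (G2) pass through.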
I'm sure an equivalent(ish) fork-join PX design could be made to work as well: set a flag to zero or one, take the Aggregator max of that flag per group, fork-join it back onto your original data stream, and constrain out the ones. Something like that.
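That fork/join idea can be sketched in Python as well: derive a 0/1 flag per row, take the max flag per group (the Aggregator step), then join the result back to the original stream and constrain out the ones. `TARGET_VALUES` and the data are again example assumptions:

```python
# Hypothetical set of disqualifying Col3 values.
TARGET_VALUES = {"ST", "PB"}

rows = [("G1", "M1", "ST"), ("G1", "M2", "PB"), ("G1", "M3", "PD"),
        ("G2", "M4", "P1"), ("G2", "M5", "P2"), ("G2", "M6", "P3")]

# Fork: flag each row, then aggregate max(flag) per group.
group_max = {}
for group, _, col3 in rows:
    flag = 1 if col3 in TARGET_VALUES else 0
    group_max[group] = max(group_max.get(group, 0), flag)

# Join back to the original stream; constrain out rows where max flag is 1.
passed = [row for row in rows if group_max[row[0]] == 0]
```

The max per group acts like the hashed-file indicator: any single target value anywhere in a group drives that group's flag to 1.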
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers