Grouping Logic - How to Re-group similar records

DSFreddie
Participant
Posts: 130
Joined: Wed Nov 25, 2009 2:16 pm

Post by DSFreddie »

Hi All,

I have a requirement in my project: I need to group records that belong to the same customer by assigning them a common GROUP_ID.

See the example below,

SSN   FIRST_NAME  MIDDLE_NAME  LAST_NAME  DOB
1234  DAVE        W            JONES      09/12/1970
0     DAVE                     JONES      09/12/1970
2345  JIM         C            NICHOLS    08/08/1967
2345  JIM         C            NICHOLS    NULL
4567  NULL        F            FARMS      12/12/1956
4567  KEEN        F            FARMS      12/12/1956

I am currently doing a tight grouping first (through the Sort stage's Key Change column option, Soundex, and Transformer stage variables) using the above key fields. The output looks as follows:

SSN   FIRST_NAME  MIDDLE_NAME  LAST_NAME  DOB         GROUP_ID
1234  DAVE        W            JONES      09/12/1970  G1
0     DAVE                     JONES      09/12/1970  G2
2345  JIM         C            NICHOLS    08/08/1967  G3
2345  JIM         C            NICHOLS    NULL        G4
4567  NULL        F            FARMS      12/12/1956  G5
4567  KEEN        F            FARMS      12/12/1956  G6

But the requirement is to group similar records together even if the above fields don't give a 100% match.

I am trying to re-group the already grouped records by changing the key fields in a second job (loose grouping), but I am losing the already assigned grouping.

Can you please help me resolve this scenario?

Thank You,
Freddie
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

This is an ideal situation for using QualityStage.
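As a rough illustration of the underlying idea only (this is plain Python, not QualityStage and not a DataStage job), a loose pass can merge the existing tight groups instead of re-assigning them, so the first-pass GROUP_ID is never lost. The sketch below uses a union-find structure over the tight GROUP_IDs. The loose key sets in LOOSE_KEYS, the FINAL_GROUP_ID column name, and the choice to treat SSN 0, NULL and blank as "missing" are all assumptions made for the example, not something stated in the thread.

```python
# Rows from the example, already carrying the tight-pass GROUP_ID.
# None stands for NULL or blank in the original data.
COLUMNS = ("SSN", "FIRST_NAME", "MIDDLE_NAME", "LAST_NAME", "DOB", "GROUP_ID")
ROWS = [dict(zip(COLUMNS, values)) for values in [
    ("1234", "DAVE", "W",  "JONES",   "09/12/1970", "G1"),
    ("0",    "DAVE", None, "JONES",   "09/12/1970", "G2"),
    ("2345", "JIM",  "C",  "NICHOLS", "08/08/1967", "G3"),
    ("2345", "JIM",  "C",  "NICHOLS", None,         "G4"),
    ("4567", None,   "F",  "FARMS",   "12/12/1956", "G5"),
    ("4567", "KEEN", "F",  "FARMS",   "12/12/1956", "G6"),
]]

# Hypothetical loose match keys: each tuple is one matching pass.
LOOSE_KEYS = [("SSN", "LAST_NAME"), ("FIRST_NAME", "LAST_NAME", "DOB")]


def find(parent, group):
    """Return the representative group, compressing the path as we go."""
    while parent[group] != group:
        parent[group] = parent[parent[group]]
        group = parent[group]
    return group


def union(parent, a, b):
    """Merge the groups containing a and b; the smaller id becomes the root."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        low, high = sorted((root_a, root_b))
        parent[high] = low


def regroup(rows, loose_keys, missing=(None, "", "0")):
    """Merge tight groups that agree on any loose key; keep the tight id too."""
    parent = {row["GROUP_ID"]: row["GROUP_ID"] for row in rows}
    for key in loose_keys:
        first_seen = {}  # key values -> first tight group seen with them
        for row in rows:
            values = tuple(row[col] for col in key)
            if any(v in missing for v in values):
                continue  # a missing value must never create a match
            if values in first_seen:
                union(parent, first_seen[values], row["GROUP_ID"])
            else:
                first_seen[values] = row["GROUP_ID"]
    return [dict(row, FINAL_GROUP_ID=find(parent, row["GROUP_ID"])) for row in rows]


if __name__ == "__main__":
    for row in regroup(ROWS, LOOSE_KEYS):
        print(row["GROUP_ID"], "->", row["FINAL_GROUP_ID"])
```

Because the merge works on group ids rather than on rows, each loose pass only ever joins groups together; the original GROUP_ID stays on every row and the merged id is carried in a separate column. Which columns make a safe loose key is a business decision, and overly loose keys will chain unrelated customers into one group.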
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.