how to group data and check some constraint

smitageorge
Participant
Posts: 37
Joined: Fri Sep 30, 2005 10:05 am
Location: va

how to group data and check some constraint

Post by smitageorge »

Hi,

Does anybody have an idea how to check a constraint across a group of rows?

My input is:

ID MO DA CA
1 0 M COOL
1 0 D COOL
1 0 R FOOL
1 1 M COOL
1 1 D COOL
1 1 R COOL
1 2 D COOL
1 2 R COOL
1 3 D COOL
2 0 M COOL
2 1 R COOL
3 2 M COOL
3 2 R FOOL


and so on.

I need to obtain this output:

ID MO DA CA
1 1 M COOL
1 1 D COOL
1 1 R COOL
1 2 D COOL
1 2 R COOL
1 3 D COOL
2 0 M COOL
2 1 R COOL

Here, within each ID, the rows that share the same MO form a group. If CA is the same for every row in that group, the group should pass to the output; otherwise the whole group should be rejected.

Can anybody shed some light on this? Any help is highly appreciated.

Thanks
Smita
vsurap
Participant
Posts: 6
Joined: Sun Apr 25, 2004 5:24 pm

Re: how to group data and check some constraint

Post by vsurap »

I guess the logic below should work. After the duplicates are removed, a row count of 1 for an (ID, MO) pair means that pair has only one distinct CA value.

Input ---> Remove Duplicates (key: ID, MO and CA) ---> Aggregator (key: ID, MO; count rows) --->
Transformer (constraint: row count = 1) ---> Lookup (reference stream)


Input ---> Lookup (key: ID and MO) ---> Output
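To make the two flows above concrete, here is a minimal sketch in plain Python rather than DataStage. It only mirrors the logic of the stages; the function names `consistent_keys` and `lookup_filter` are made up for illustration, and the rows are the sample data from the first post.

```python
from collections import Counter

# Sample rows from the original post, as (ID, MO, DA, CA) tuples.
ROWS = [
    (1, 0, "M", "COOL"), (1, 0, "D", "COOL"), (1, 0, "R", "FOOL"),
    (1, 1, "M", "COOL"), (1, 1, "D", "COOL"), (1, 1, "R", "COOL"),
    (1, 2, "D", "COOL"), (1, 2, "R", "COOL"),
    (1, 3, "D", "COOL"),
    (2, 0, "M", "COOL"), (2, 1, "R", "COOL"),
    (3, 2, "M", "COOL"), (3, 2, "R", "FOOL"),
]


def consistent_keys(rows):
    """Return the (ID, MO) pairs that carry exactly one distinct CA value."""
    # Remove Duplicates step: one entry per distinct (ID, MO, CA).
    distinct = {(id_, mo, ca) for id_, mo, _da, ca in rows}
    # Aggregator step: count the surviving entries per (ID, MO).
    counts = Counter((id_, mo) for id_, mo, _ca in distinct)
    # Transformer constraint: keep only the pairs whose count is 1.
    return {key for key, n in counts.items() if n == 1}


def lookup_filter(rows):
    """Second flow: keep input rows whose (ID, MO) is in the reference set."""
    reference = consistent_keys(rows)
    return [row for row in rows if (row[0], row[1]) in reference]


if __name__ == "__main__":
    for row in lookup_filter(ROWS):
        print(*row)
```

Rows that fail the lookup are simply dropped in this sketch; collecting them separately would give the rejected groups.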
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

What do you mean by 'MO is same for incoming rows'? Do you have any reference data?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Use stage variables to identify the group and the first value of CA in each group, and set a flag if a different CA value is found in the same group. Keep a list of groups and flags, then post-process the data so that groups can be kept or rejected.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
smitageorge
Participant
Posts: 37
Joined: Fri Sep 30, 2005 10:05 am
Location: va

Post by smitageorge »

ray.wurlod wrote:Use stage variables to identify the group and the first value of CA in each group, and set a flag if a different CA value is found in the same group. Keep a list of groups and flags, then post-process ...
Can you elaborate a little more?

Thanks
smita
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

A stage variable to record whether you're in a new group. A stage variable to "remember" the previous group. A stage variable to record the CA value on group change only. A stage variable to compare CA against that one (this is the "flag"). Output all input columns and flag. Run through Aggregator capturing last flag value. Switch stage to segregate based on flag.
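As a rough illustration of the stage-variable idea above, here is a sketch in plain Python, not Transformer derivation syntax. It assumes the group key is (ID, MO), that the input is sorted on that key, and that the flag stays set once a mismatch has been seen in the group, so that the last flag value per group is enough. The names `flag_rows` and `split_by_flag` are hypothetical.

```python
# Sample rows from the original post, as (ID, MO, DA, CA) tuples.
ROWS = [
    (1, 0, "M", "COOL"), (1, 0, "D", "COOL"), (1, 0, "R", "FOOL"),
    (1, 1, "M", "COOL"), (1, 1, "D", "COOL"), (1, 1, "R", "COOL"),
    (1, 2, "D", "COOL"), (1, 2, "R", "COOL"),
    (1, 3, "D", "COOL"),
    (2, 0, "M", "COOL"), (2, 1, "R", "COOL"),
    (3, 2, "M", "COOL"), (3, 2, "R", "FOOL"),
]


def flag_rows(sorted_rows):
    """Mimic the Transformer: output every input column plus a flag."""
    prev_group = None  # "remembers" the previous group key
    first_ca = None    # CA value recorded on group change only
    flag = 0           # becomes 1 once a CA differs from first_ca
    flagged = []
    for id_, mo, da, ca in sorted_rows:
        group = (id_, mo)
        is_new_group = group != prev_group
        if is_new_group:
            first_ca, flag = ca, 0   # reset at the start of each group
        elif ca != first_ca:
            flag = 1                 # assumption: the flag is sticky within a group
        prev_group = group
        flagged.append((id_, mo, da, ca, flag))
    return flagged


def split_by_flag(rows):
    """Aggregator (last flag per group) followed by a Switch on that flag."""
    # Sort so that each group is contiguous; sorted() keeps ties in input order.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    flagged = flag_rows(ordered)
    last_flag = {}
    for id_, mo, _da, _ca, flag in flagged:
        last_flag[(id_, mo)] = flag  # later rows overwrite earlier ones
    kept = [r[:4] for r in flagged if last_flag[(r[0], r[1])] == 0]
    rejected = [r[:4] for r in flagged if last_flag[(r[0], r[1])] == 1]
    return kept, rejected
```

If the flag were recomputed per row instead of being sticky, a group such as COOL, FOOL, COOL would end on a clean flag, so the Aggregator would need the maximum rather than the last value.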
smitageorge
Participant
Posts: 37
Joined: Fri Sep 30, 2005 10:05 am
Location: va

Post by smitageorge »

kumar_s wrote:What you mean if 'MO is same for incoming rows '... Do you have any reference data?
No, there is no reference data. I believe I did not describe the problem properly:

There is a source file with all these fields, and I need to load the transformed data into a sequential file.

I want the output in the given format.

1. Group the data by ID and MO.
2. For each group, check whether CA is the same across all the DA values present.

Possible DA combinations: M/D/R, M/D, M/R, D/R, M, D, R.
3. If CA is the same for the whole group, pass the rows to the sequential file; otherwise drop them into the reject file.

I need to implement the constraints in the Transformer.
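The three rules above can be written down directly as a small Python sketch, just to pin down the expected behaviour outside DataStage. The function name `split_groups` is made up for illustration, and the two returned lists stand in for the sequential file and the reject file.

```python
from collections import defaultdict

# Sample rows from the original post, as (ID, MO, DA, CA) tuples.
ROWS = [
    (1, 0, "M", "COOL"), (1, 0, "D", "COOL"), (1, 0, "R", "FOOL"),
    (1, 1, "M", "COOL"), (1, 1, "D", "COOL"), (1, 1, "R", "COOL"),
    (1, 2, "D", "COOL"), (1, 2, "R", "COOL"),
    (1, 3, "D", "COOL"),
    (2, 0, "M", "COOL"), (2, 1, "R", "COOL"),
    (3, 2, "M", "COOL"), (3, 2, "R", "FOOL"),
]


def split_groups(rows):
    """Return (output_rows, reject_rows) for groups keyed on (ID, MO)."""
    # Rule 1: group by ID and MO, collecting the distinct CA values per group.
    ca_values = defaultdict(set)
    for id_, mo, _da, ca in rows:
        ca_values[(id_, mo)].add(ca)
    # Rules 2 and 3: a group passes only if it has a single distinct CA,
    # whichever DA values (M, D, R) happen to be present.
    output, reject = [], []
    for row in rows:
        target = output if len(ca_values[(row[0], row[1])]) == 1 else reject
        target.append(row)
    return output, reject
```

With the sample data, the (1, 0) and (3, 2) groups go to the reject list and everything else goes to the output, which matches the output listed in the first post.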

Thanks
Smita
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

ray.wurlod wrote:A stage variable to record whether you're in a new group. A stage variable to "remember" the previous group. A stage variable to record the CA value on group change only. A stage variable to compare CA against that one (this is the "flag"). Output all input columns and flag. Run through Aggregator capturing last flag value. Switch stage to segregate based on flag.
Hi Ray, I have a similar requirement. I understand what you are saying here, but I am not very familiar with using stage variables. Could you please help me with the job design at your convenience?

Thanks
sam
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Given that I'm fully booked through to the end of June "at my convenience" may be a little long.
Why not play with them? It's a great way to learn. It's how I learn.
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

ray.wurlod wrote:Given that I'm fully booked through to the end of June "at my convenience" may be a little long.
Why not play with them? It's a great way to learn. It's how I learn.
:idea: