Aggregating Data

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ppp
Participant
Posts: 21
Joined: Mon Aug 31, 2009 11:53 am

Aggregating Data

Post by ppp »

I have 2 fields (id and name) such as

Id Name
1 abc
2 def
3 ghi
1 jkl
2 mno
3 pqr
1 stu
2 vwx
3 123
1 456
2 789
3 111

The above records should be grouped into multiple batches with max batch size 3 records. And the combination of id and batchno should be unique.
(Let's say I start my batch no.with 5 as below)
For example, my output should be as below.

Id Name BatchNo
1 abc 5
2 def 5
3 ghi 5
1 jkl 6
2 mno 6
3 pqr 6
1 stu 7
2 vwx 7
3 123 7
1 456 8
2 789 8
3 111 8

How do I achieve the batchno field result?

[/u]
    [/u]
    chulett
    Charter Member
    Charter Member
    Posts: 43085
    Joined: Tue Nov 12, 2002 4:34 pm
    Location: Denver, CO

    Post by chulett »

    Stage variables. Simplest answer would be a value that is incremented only when Id = 1. What happens when there are more than three ids in a 'batch'?
    -craig

    "You can never have too many knives" -- Logan Nine Fingers
    ppp
    Participant
    Posts: 21
    Joined: Mon Aug 31, 2009 11:53 am

    Post by ppp »

    There will not be more than 3 id's in a batch. A batch can have less than 3 ids but not more.
    ray.wurlod
    Participant
    Posts: 54607
    Joined: Wed Oct 23, 2002 10:52 pm
    Location: Sydney, Australia
    Contact:

    Post by ray.wurlod »

    Stage variables.

    Write out the logic in English, and the stage variable initialization and derivation expressions will magically appear in your brain.
    IBM Software Services Group
    Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
    Post Reply