How to Compare Current Record with Previous record

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
pavithravenky
Participant
Posts: 4
Joined: Fri Apr 10, 2009 2:34 am
Location: India

How to Compare Current Record with Previous record

Post by pavithravenky »

Hi,

I have a Key Columns that contains duplicates records. I need to generate Key whenever the value of the Key Column Changes. How to achive this using transformer stage and not surrogate key stage.

Example:
Id Addr
1 a1
1 a2
1 a3
2 b1

Result Expected:
Id Addr AId
1 a1 01
1 a2 01
1 a3 01
2 b1 02

R
Pav
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

The simplest method would be to generate a keyChange column in a sort stage preceding the transformer. keyChange = 1 when a record has a new key value, 0 when the same as the previous record.

When you see keyChange = 1, generate a new Key otherwise use the same key as for the previous record.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard.

There are over 250,000 answers to be found on DSXchange, including the answer to your question. In the menu at the top of the page is a link to the Search facility. (Tip: prefer Exact Search whenever possible.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
srds2
Premium Member
Premium Member
Posts: 66
Joined: Tue Nov 29, 2011 6:56 pm

Post by srds2 »

you can get expected output using transformer stage variables. create 3 stage variables.

var1: if ID = var2 then A else B
var2: ID
var3: if var1 = B then var3+1 else var3

assign default value for var3 as 0
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's an overly complex solution. It can be done readily with two stage variables. The only thing that is important is the order of execution (a) of the stage variables and (b) of the data (what does "previous" mean when the data are partitioned?).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Based on the example posted, it looks to me like you've already got your "AId" field in the "Id" field. Can we get a more realistic example, please? One that shows why you would need an additional field.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply