Page 1 of 1

Transformation Logic

Posted: Wed Jun 29, 2016 3:35 pm
by devsonali
Hello all,

I have a job that does some calculations based on row numbers , lets say Column A has these values , I want another column B which counts based on change in value of A

So Input say

Column A , Column B


1 , 1
1 , 2
2 , 1
2 , 2
2 , 3
55 , 1
55 , 2
95 , 1
98 , 1

Basically Column B is a counter counting Occurrences of Column A , i did some searches on Key words "loops" "counter" but unable to find similar logic in here.
Thank you for looking .

Posted: Wed Jun 29, 2016 4:07 pm
by rkashyap

Posted: Wed Jun 29, 2016 4:46 pm
by ray.wurlod
If the data are accurate and accurately sorted you could use a Remove Duplicates stage and keep just the Last value from Column B for each group of Column A values.

Posted: Thu Jun 30, 2016 8:09 am
by devsonali
Thank you for the pointers
However , the scenario (I think) is slightly different .

Col 1 values above are calculated based on row numbers (a different requirement) , for example lets say I am calculating the value of Col 1 as floor(row number/5) in a stage variable .
As a result of that first 5 records will have a certain value , next 5 another value and so on and so forth depending on how many input records are being processed ( I am ensuring that i running transformer in sequential mode )

For every incoming record , I would need Col A (calculated value) , Column B (counter based on Col A)

Lets say I calculate Col A in in stage variable 'svColA' -- > floor(@inrownum/5)
For col B
I declare stage variables svCurrentRow --> svPreviousRow
svPreviousRo --> Inputlink.svColA
svColAChangeCheck --> if svCurrentRow = svPreviousRow then 0 else 1
svCounter ---> if svColAChangeCheck = 1 then svCounter + 1 else svCounter

And I am trying to assign svCounter to Col B

**Correction - I think the above logic is flawed because as soon as Col A value changes the Counter should reset to 1

So probably
svCounter ---> if svColAChangeCheck = 0 then svCounter + 1 else 1

Am I coding this right ?

Posted: Thu Jun 30, 2016 10:09 am
by UCDI
you can do something like a previous & current SV pair. This is rough concept of it..

svars look like
result = calculation of a
counter = if result = prev, counter + 1, 1 (hopefully close on the syntax)
prev = result

...
derive
result | col a
counter | col b

and you might need a special handler for the first row but the idea should work .. it just resets the counter whenever your calculation is not the same as the last time around. The assumption here is that the data is sorted on the input to calculation of a & that the result of the calculation is unique. If not, more info is needed...

Posted: Thu Jun 30, 2016 10:16 am
by devsonali
Yeah , I think I realized the error (corrected the logic )
However something I am yet to understand is why do we
need to declare at the end
prev = result (according to your code )

instead of declaring it the first variable ?
svPreviousRo --> Inputlink.svColA

Posted: Thu Jun 30, 2016 1:43 pm
by UCDI
the stage variables look at the current record only. Its not looping here.
So in order to compare the previous to the current, you need to save a copy of the previous record. Then compare it. Then update it to current for the next record, and so on.

or visually, for records 1 and 2 and 3...

result = 1
counter = something
previous = 1

...
result = 2
compare previous (still set to 1!) against result (now 2!)
..
prev = 2
...

result = 3
compare 3 and 2
prev = 3
...

Or that was the idea anyway. Assuming I understood your data example and what you were trying to do.