Transformer

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

kittu.raja
Premium Member
Posts: 175
Joined: Tue Oct 14, 2008 1:48 pm

Post by kittu.raja »

Hi,

Can anybody tell me how to count the number of records in a Transformer?

For example, I am sending 100 rows into the Transformer and I want the output to be a single row holding the count, 100.

Thanks
Rajesh Kumar
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Why transformer stage? Much easier when you use the stage that is designed for the task (e.g. Aggregator Stage).

Mike
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The Transformer stage cannot be used for your intended purpose, because there is no way to detect that the last row is being read. As mentioned before, an Aggregator stage will do that.
dsuser_cai
Premium Member
Posts: 151
Joined: Fri Feb 13, 2009 4:19 pm

Post by dsuser_cai »

Yes, that's right: you can use an Aggregator stage to get the row count.

Use the following option in the Aggregator stage:

Aggregation type: CountRows

This should work.
Thanks
Karthick
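
For readers who think in code, here is a rough Python analogy of what a row-counting aggregation does. It is only a sketch of the idea, not DataStage code; the function name `count_rows` and the column name `row_count` are made up for illustration.

```python
from typing import Iterable, Iterator


def count_rows(rows: Iterable[dict]) -> Iterator[dict]:
    """Consume every input row, then emit one summary row (hypothetical helper)."""
    total = 0
    for _ in rows:
        total += 1
    # This line is reached only after the input is exhausted,
    # so the stage knows the end of the data has arrived.
    yield {"row_count": total}


rows = [{"id": i} for i in range(1, 101)]  # 100 input rows
result = list(count_rows(rows))
print(result)  # [{'row_count': 100}]
```

The point of the analogy is that the count is emitted after the whole input has been read, which is exactly what the 100-rows-in, 1-row-out requirement needs.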
kittu.raja
Premium Member
Posts: 175
Joined: Tue Oct 14, 2008 1:48 pm

Post by kittu.raja »

Mike wrote: Why transformer stage? Much easier when you use the sta...

Yes, we can do that in an Aggregator stage, but I already use some functions in the Transformer, and I want to implement the count in the same Transformer rather than add an extra stage.

Can we handle this in the Transformer?
Rajesh Kumar
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

For the second time in this thread, it cannot be done in a Transformer because you never know when the last row has arrived.
kittu.raja
Premium Member
Posts: 175
Joined: Tue Oct 14, 2008 1:48 pm

Post by kittu.raja »

ArndW wrote: For the second time in this thread, it cannot be done in a Transformer because you never know when the last row has arrived.
OK, thank you for your response.
Rajesh Kumar
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

And don't worry about extra stages. That's what pipeline parallelism is all about... and with operator combination, multiple stages may be combined into a single process anyway.

Mike
kittu.raja
Premium Member
Posts: 175
Joined: Tue Oct 14, 2008 1:48 pm

Post by kittu.raja »

Mike wrote: And don't worry about extra stages. That's what pipeline parallelism is all about...

Just for my own knowledge: can't we count the number of rows in a Transformer? I thought we could do any transformation or processing in the Transformer.

Thanks
Rajesh Kumar
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Yes, you can count in the Transformer; you just won't know the total count until you run it through the Aggregator... so it is effectively pointless.
BugFree
Participant
Posts: 82
Joined: Wed Dec 13, 2006 6:02 am

Post by BugFree »

kittu.raja wrote:
Mike wrote:And don't worry about extra stages. That's what pipeline parallelism is all about...

Just for my own knowledge: can't we count the number of rows in a Transformer? I thought we could do any transformation or processing in the Transformer.

Thanks
Yes, we can count the number of rows in a Transformer, but you will get 100 rows with counts running from 1 to 100.
The key point is your requirement to get only 1 row with a count of 100. That is where the problem is.
The only way to achieve this is through a Transformer constraint. But to derive that constraint we need to know the row count before the Transformer, either from an Aggregator or because the count is static, which is not true in your case.
Ping me if I am wrong...
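
To make the point above concrete, here is a small Python sketch of a row-at-a-time counter. It is an analogy only, not DataStage syntax: `running_count` stands in for a stage variable that increments per row, and `keep_when_count_is` stands in for a constraint. Both names, and the `expected_total` parameter, are hypothetical.

```python
from typing import Iterable, Iterator


def running_count(rows: Iterable[dict]) -> Iterator[dict]:
    """Emit one output row per input row, tagged with a running counter."""
    counter = 0  # plays the role of a stage variable
    for row in rows:
        counter += 1
        yield {**row, "count": counter}


def keep_when_count_is(rows: Iterable[dict], expected_total: int) -> Iterator[dict]:
    """Constraint-like filter: it works only if the total is known up front."""
    for row in rows:
        if row["count"] == expected_total:
            yield {"count": row["count"]}


source = [{"id": i} for i in range(1, 101)]

counted = list(running_count(source))
print(len(counted), counted[0]["count"], counted[-1]["count"])  # 100 1 100

# The filter needs the total (100) supplied from outside the per-row logic.
filtered = list(keep_when_count_is(running_count(source), expected_total=100))
print(filtered)  # [{'count': 100}]
```

The counter alone produces 100 rows numbered 1 to 100. Reducing that to one row requires `expected_total`, which is the value we were trying to compute in the first place.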
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

I don't know why I am writing this, but I think it is just to let you know how you can do it (is it for an interview?).

You need two Transformers running in sequential mode:

1. Using a stage variable, count the number of records and put the value in a derived column, say Count, or use @OUTROWNUM.
2. Sort the data before the second Transformer using a link sort on Count.
3. Put a constraint in the second Transformer to release only one record (the first) and pass only one column, called Count.

I don't think anyone will use this technique except to explain it in an interview, as it creates a slow-running job that ignores the advantages of PX.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped.
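
The three steps above can be mimicked in Python to see why the approach works and why it is slow. This is an analogy under assumptions, not DataStage code: the function names are hypothetical, and the sort is assumed to be descending on Count so that the largest value arrives first (the post does not state the sort direction).

```python
from typing import Iterable, Iterator


def add_running_count(rows: Iterable[dict]) -> Iterator[dict]:
    """Step 1: first Transformer, running counter in a derived column."""
    counter = 0
    for _ in rows:
        counter += 1
        yield {"Count": counter}


def first_row_only(rows: Iterable[dict]) -> Iterator[dict]:
    """Step 3: second Transformer, constraint that releases only the first row."""
    for index, row in enumerate(rows):
        if index == 0:
            yield row


source = [{"id": i} for i in range(1, 101)]

counted = add_running_count(source)
# Step 2: sort on Count, assumed descending. sorted() must buffer every row
# before it can release the first one, which is where the slowdown comes from.
sorted_rows = sorted(counted, key=lambda r: r["Count"], reverse=True)
final = list(first_row_only(sorted_rows))
print(final)  # [{'Count': 100}]
```

The sort is what stands in for "knowing the last row": it cannot emit anything until it has seen all the input, so the job gives up pipelining to get the same answer an Aggregator would produce directly.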