Output Record count in a transformer

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Output Record count in a transformer

Post by ady »

I am choosing some rows from a column in the transformer with a constraint and writing the output to a sequential file.


The constraint is l_to_Tform.MG_NO [1,1] = "#" ie: when a row starts with #... reject it.

I need to get the output row count. Is there a way to do that in the same transformer ?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

You should be using "<>" instead of "=" to reject those records. You do have a stage mnemonic called @OUTROWNUM that you can use in the transform.
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

The constraint was a TYPO. i am using "<>"

I just need the highest value of the row count . ie: if there are 12345667 rows in teh column then the value returned should be "12345667". How can i use @OUTROWNUM to get this ?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

beaditya - let's take a step back. Where do you want this row value returned? A transformer gets executed for each incoming row. You could do an after-stage subroutine call and pass it the value of @OUTROWNUM. But all this depends what you want to do with the maximum row number.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Add an two extra columns to your output. One with derivation @OUTROWNUM and specify any dummy key on the second. Let that go down to the aggregator, group on the dummy key and choose max or last for the column which had derivation of @OUTROWNUM.
Also, if you want all rows that start with '#', dont you want '=' in the constraint instead of '<>' :?
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
baglasumit21
Participant
Posts: 132
Joined: Wed Mar 01, 2006 11:12 pm
Location: Pune

Re: Output Record count in a transformer

Post by baglasumit21 »

beaditya wrote:I am choosing some rows from a column in the transformer with a constraint and writing the output to a sequential file.


The constraint is l_to_Tform.MG_NO [1,1] = "#" ie: when a row starts with #... reject it.

I need to get the output row count. Is there a way to do that in the same transformer ?
Add a dummy column of type integer to the output link with a derivation as 1 and then introduce a aggregator after the transformer that will give you the sum of the dummy column. this sum will be the row count
SMB
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Using the aggregator stage in this case will not really be necessary (or efficient). The method will depend on what the original poster needs this for. If this number of rows is needed from another job or sequence, then a call to DSGetLinkInfo() will get that value. But that method doesn't make sense if the value is needed in a subsequent stage in the same job. So we need to wait for an explanation and then come up with a simple and functional solution.
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

I am using this to get the link row count


DSGetLinkInfo (DSJ.ME, "Transformer_56", "l_to_sq", DSJ.LINKROWCOUNT)


But in one job it gives the output as -9. I really dont get why it happens ?
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

You cannot do that while the link is open and data is being processed. You need to do that in the after job routine or something.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
thebird
Participant
Posts: 254
Joined: Thu Jan 06, 2005 12:11 am
Location: India
Contact:

Post by thebird »

You would need to give more input regarding your requirement as Arnd has said - so that someone here can actually help out ..

But it seems like you are trying to get the output row count from a Transformer. If so why not just set @OUTROWNUM in the Transformer and find the MAX() in an aggregator stage?
aakashahuja
Premium Member
Premium Member
Posts: 210
Joined: Wed Feb 16, 2005 7:17 am

Post by aakashahuja »

Or you can use a Stage Variable (starting friom 1 and incremented by 1 every time) and then you can check if the input link is finished then you can pass the value of that stage variable to the output whereever required.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not sure why you would do that when it is exactly the same as the provided System Variable @OUTROWNUM. :?

As noted, we'd really need to know the 'why' behind the 'what' - why are they doing this and what do they plan doing with the number? That knowledge will help get them a proper answer.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

@Chulett


@OUTROWNUM gives all the row numbers as a seperate column, but I just need the total number of rows in a column. Then again I have to plugin a AGGREGATOR to take the last value of that column ( and this does not work with the constraints I am giving).

@DSguru2B

You cannot do that while the link is open and data is being processed. You need to do that in the after job routine or something.

I have this problem with one job where the transformer I have given DSGetLinkInfo (DSJ.ME, "Transformer_56", "l_to_sq", DSJ.LINKROWCOUNT) is run before the stage "Transformer_56" or the link "l_to_sq".


So it does give the value. Can I in anyway delay the the execution of this command so that the Transfor_56 is executed first and i can get the row count ?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

beaditya wrote:@OUTROWNUM gives all the row numbers as a seperate column, but I just need the total number of rows in a column. Then again I have to plugin a AGGREGATOR to take the last value of that column ( and this does not work with the constraints I am giving).
I'm sorry, but this makes absolutely no sense to me. I don't see any difference between what you say @OUTROWNUM 'gives' and what you 'just need'. And I use Aggregators to do stuff like this 'all the time'. :?

I guess I'll just back away and let others dance this dance...
-craig

"You can never have too many knives" -- Logan Nine Fingers
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

@Chulett

Sorry for the confusion but...

As far as my knowledge goes when we give @ROWNUM in the stage variables and map it to a column we get the output values in that column as 1,2,3,4,5,6,7,8,9.........14567. ( The number of that particular row )

I just need the value 14567 ( the number of the last row).


When I use an aggregator to count the rows I give Count(IN_Row) as the derivation (this works pretty well for me) but I cannot give a aggregator in my current job because then I have a problem of columns coming from various stages ( I have to write them together in a Oracle table, so i'll have a new problem to join them ) .... and thats why I am using DSGetLinkInfo (DSJ.ME, "Transformer_56", "l_to_sq", DSJ.LINKROWCOUNT).


Hope I could communicate well this time ... :?
Post Reply