Page 1 of 2

Problem with Aggregator

Posted: Mon Feb 06, 2006 3:30 pm
by somu_june
Hi,

Please help me in getting out from this error . My input has a validfromdate columm with SQL Type is char and length of that field is 8.

When Iam using an aggregator Stage for calculatng Min(validfromdate). Iam getting an error

Aggr_Min_VFD: Error when checking operator: It is not legal to set a reducer (ValidFromDate) to be the same as a key field,
when the type of that field (string) is not a double.


How to over come this. Please help me . Iam using Auto partition and Groupby ValidFromDate
Aggregation Type : calculation and

another question is there any way so that I can remove Groupby and Ineed only Aggregation like Min(validfromDate)


Thanks,
Somaraju

Re: Problem with Aggregator

Posted: Mon Feb 06, 2006 4:27 pm
by aramachandra
The aggregator stage in the parellel engine does not work with the date and char datatype as far as i know. It works well with decimals and numbers only...

I have had that same issue as well, where i was trying to get the max date for a given record set..

In our environment folks have used the sort stage to achieve the same effect

Re: Problem with Aggregator

Posted: Mon Feb 06, 2006 4:43 pm
by kwwilliams
You should create a static column, something like DUMMY. Give all of the rows for this column the same value. Sort by Dummy and your date column, with the date being a ascending order. Stream this output to a remove duplicate stage. Specify the key as dummy and to only retain the first one. Then you would have your min date. There may be other ways to do it, but that is the method I have used when all I really want is a min or max column.

Re: Problem with Aggregator

Posted: Mon Feb 06, 2006 5:40 pm
by somu_june
Hi armachandra,

Can you explain in detail how to sort and get min Validfromdate.




Thanks,
Somaraju

Re: Problem with Aggregator

Posted: Mon Feb 06, 2006 5:51 pm
by somu_june
Hi Williams,

Thanks for reply, Let's see if I've got this right. Please correct me if Iam wrong. I have to create a static columm dummy in aggregator stage output and I should assaign all the input rows validfromdate columm to dummy columm. Then I have to sort in aggregator with dummy and validfromdate columm. Then I have to connect the aggregator output to remove duplicate stage and I have to specify key as dummy then I will have my min Validfromdate


Thanks,
somaraju

Re: Problem with Aggregator

Posted: Mon Feb 06, 2006 6:03 pm
by Ultramundane
You could also get your dsadm to download patch 61040 from Ascential which allows you to perform min and max on char and varchar columns. If you get the patch and apply it a new option called preserveType will become available when choosing the columns to aggregate. Select this property and set it to yes to perform a min or max on a char or varchar column.

Posted: Mon Feb 06, 2006 9:11 pm
by kumar_s
Hi Somuraj,

But the actuall compliant of the Aggregator stage is due to the use of field
ValidFromDate as grouping key as well as value field.
You are not allowd to do this. If you just want to just remove the duplicates of the date filed and retain the first record, you can sort using sort stage and make use of the Allow duplicated option to set it to False so that it retains the first value.

-Kumar

Re: Problem with Aggregator

Posted: Mon Feb 06, 2006 9:12 pm
by kwwilliams
I think you should forget the aggregator. Create a column in the modify stage (should be less overhead than a transofrmer), and populate this column with a string, you could use the jobname if you want. Your modify stage should have two output columns, stream these to a sort stage and sort by the column you created and the date you want the min on in ascending order. Output the sort to a remove dupicate stage. In the properties set the key equal to the column you created (not the column properties) and tell it to retain the first row. Because the data is presorted, it will take the first row and output it. You should be able to handle it from there. Let me know if you have any problems with it.

Posted: Tue Feb 07, 2006 10:38 am
by somu_june
Hi Kumar,

My Actuall requirement is to get min value from (ValidFromdatecolumm)

Ex I have dates like 19942008
19942004
19942004
19800205
I want 19800205 as my output which is minum value and I used sort stage and remove duplicate stage but Iam getting min values for each duplicate values. I want min value from entire input. Is there any stage so that I can get min val with out an aggregator stage.

Thanks,
Somaraju

Posted: Tue Feb 07, 2006 11:14 am
by kwwilliams
Did you create the dummy value??? Without it, you have no way to remove your duplicate. You need to create a static column so that your data would look like this:

Dummy Date
------------------------------------------
DUMMY 19942008
DUMMY 19942004
DUMMY 19942004
DUMMY 19800205

In the sort stage sort by Dummy and Date. Then remove duplicates based upon dummy and you will get 19942008. You need to have a value that you can use as a false key. You tried using date as the key, remove duplicates removed all of the duplicate dates, which is not what you want. Aggregator needs a column to group by as well. Because you should have all ready sorted before the data set before the aggregator, I said don't use an aggregator. the remove duplicate option will be more efficient.

Posted: Tue Feb 07, 2006 1:40 pm
by MTA
Somu jume,
Be sure to keep the value in the dummy field as a constant so that when you specify group by then all values in the vertual dataset are considered ( assuming you have removed duplicated in the sort stage or remove duplicate stage) then utilise the inbuilt function in Agg EE stage to get the minimum value of your input.

Posted: Tue Feb 07, 2006 2:19 pm
by kwwilliams
Removing the duplicate will produce the minimum date you shouldn't an aggregator it would just be an extra step. The remove duplicate stage will keep the very first record it finds with DUMMY in the dummy column, which by having sorted the date in asending order will give you the min date.

Posted: Tue Feb 07, 2006 2:43 pm
by somu_june
Hi williams,

your logic worked thanks for your help in getting out from this problem




Thanks,
srinuraju

Posted: Tue Feb 07, 2006 2:45 pm
by kwwilliams
Not a problem. Thanks for posting the solution works so next time someone else can search for minimum field and get an answer to their question.

Posted: Tue Feb 07, 2006 3:34 pm
by Ultramundane
As was mentioned earlier there are multiple solutions. I am just offering another solution (on Unix) for those that might be interested.

Use an external filter stage with the below code for MIN or MAX:

Code: Select all

## Author: Ryan Putnam
## Intent: Get MAX
awk 'BEGIN {getline; MAX=$0;} {if (( $0 > MAX )) { MAX=$0; } } END{print MAX;}'

Code: Select all

## Author: Ryan Putnam
## Intent: Get MIN
awk 'BEGIN {getline; MIN=$0;} {if (( $0 < MIN )) { MIN=$0; } } END{print MIN;}'