Aggregator stage

Posted: Tue Apr 27, 2004 2:19 pm
by iamrajy
I am aggregating/grouping on the first 4 keys to avoid duplicate values,
but for some reason I am still getting duplicate rows. Some rows are eliminated and some are still there.

In the Output tab I have specified grouping on the first 4 columns.

Duplicate records:
1026|1024|1000|1010|82|2004-04-27 12:48:33
1026|1024|1000|1010|82|2004-04-27 12:48:33
1026|1024|1000|1010|82|2004-04-27 12:48:33


1006|1022|1000|1010|0|2004-04-27 16:01:27
1006|1022|1000|1010|0|2004-04-27 16:01:27

Can somebody give me some advice?

-Thanks

Posted: Wed Apr 28, 2004 2:06 am
by roy
Hi,
check whether any other column in your table definition also has the Group By box checked.

IHTH,

Posted: Wed Apr 28, 2004 3:52 am
by ray.wurlod
Have you specified grouping on just these four columns, or others as well?

Posted: Wed Apr 28, 2004 8:18 am
by iamrajy
Ray,

I have grouping on only those 4 columns, and they are the primary key of the table I am trying to insert into through SQL*Loader.


Roy,
I am using SQL*Loader to load the data into the database, but in the Aggregator I load the column information from the table definition. I have verified that the table definition is in sync with the Aggregator.

The reason I have this Aggregator is to avoid duplicate values. If there is some other way to achieve this, please let me know.

Thanks for your help.

Please advise.

Posted: Wed Apr 28, 2004 4:19 pm
by ray.wurlod
If the input data are sorted on the four grouping columns, you can use stage variables to detect change. This is a much more efficient mechanism than the Aggregator stage. Search the Forum for examples.
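The stage-variable idea can be sketched in Python (DataStage server jobs implement this with BASIC stage variables in a Transformer, but the logic is the same; the function name here is illustrative):

```python
# Keep the previous row's key in a variable and emit a row only when the
# key changes. This assumes the input is already sorted on the key columns,
# just as the stage-variable technique requires.

def dedupe_sorted(rows, key_cols=4):
    """Yield the first row of each group of the first `key_cols` fields."""
    prev_key = None
    for row in rows:
        key = tuple(row[:key_cols])
        if key != prev_key:  # key changed: first row of a new group
            yield row
        prev_key = key

rows = [
    ["1026", "1024", "1000", "1010", "82", "2004-04-27 12:48:33"],
    ["1026", "1024", "1000", "1010", "82", "2004-04-27 12:48:33"],
    ["1006", "1022", "1000", "1010", "0",  "2004-04-27 16:01:27"],
]
print(list(dedupe_sorted(rows)))  # two distinct rows survive
```

Because only one previous key is held in memory, this scales to any input size, which is why it beats the Aggregator stage for plain de-duplication.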
Another approach is to use a hashed file keyed on these four columns. All writes to hashed files are destructive overwrites (based on key) so it's a fairly slick means of de-duplication.
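The hashed-file behaviour can be sketched the same way, using a Python dict to stand in for the hashed file (again, an illustration, not DataStage code):

```python
# Writes keyed on the first four columns destructively overwrite earlier
# writes with the same key, so exactly one row per key survives. Unlike
# the sorted approach, the input need not be sorted, and the *last* row
# per key wins.

def dedupe_hashed(rows, key_cols=4):
    hashed = {}
    for row in rows:  # later writes overwrite earlier ones for the same key
        hashed[tuple(row[:key_cols])] = row
    return list(hashed.values())

rows = [
    ["1026", "1024", "1000", "1010", "82", "2004-04-27 12:48:33"],
    ["1026", "1024", "1000", "1010", "82", "2004-04-27 12:48:33"],
    ["1006", "1022", "1000", "1010", "0",  "2004-04-27 16:01:27"],
]
print(dedupe_hashed(rows))  # two distinct rows survive
```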

Posted: Thu Apr 29, 2004 8:48 am
by iamrajy
Thanks Ray. I am using a hashed file and the process is working fine.