Aggregator problem!!
Moderators: chulett, rschirm, roy
Aggregator problem!!
In my aggregator, I have 10 columns coming in. Under the Grouping Keys
section, I have:
Group = Col1
Group = Col2
Group = Col3
Group = Col4
Group = Col5
Under Aggregations I have:
Column For Calculation = Col6
Sum Output Column = OutputCol6
Column For Calculation = Col7
Sum Output Column = OutputCol7
Column For Calculation = Col8
Sum Output Column = OutputCol8
Column For Calculation = Col9
Sum Output Column = OutputCol9
Column For Calculation = Dummy
Non-Missing Values Count Output Column = tot_cnt
Column For Calculation = Col10
Sum Output Column = OutputCol10
Col1 through Col5 are also keys. I was forced to create a Dummy column because I could not include Col1 in the Group and also for 'Column For Calculation'. I was getting the error:
"It is not legal to set a reducer (Col1) to be the same as a key field"
Now, my problem is, the aggregator is producing 5 rows and I want it to produce 1 row. How do I produce 1 row from the aggregator? Also, I would like to use Col1 to do a 'Non-missing values count output column'. I cannot do it because it is in the group. How do I do the count on the Col1 column? The Dummy column has the exact same value as Col1 when it comes from the transformer. I tried do it on the Dummy column but the result is the same. Once interesting thing, the number of rows out of the aggregator is sometimes different and sometimes same.
Thanks to anyone who responds.
section, I have:
Group = Col1
Group = Col2
Group = Col3
Group = Col4
Group = Col5
Under Aggregations I have:
Column For Calculation = Col6
Sum Output Column = OutputCol6
Column For Calculation = Col7
Sum Output Column = OutputCol7
Column For Calculation = Col8
Sum Output Column = OutputCol8
Column For Calculation = Col9
Sum Output Column = OutputCol9
Column For Calculation = Dummy
Non-Missing Values Count Output Column = tot_cnt
Column For Calculation = Col10
Sum Output Column = OutputCol10
Col1 through Col5 are also keys. I was forced to create a Dummy column because I could not include Col1 in the Group and also for 'Column For Calculation'. I was getting the error:
"It is not legal to set a reducer (Col1) to be the same as a key field"
Now, my problem is, the aggregator is producing 5 rows and I want it to produce 1 row. How do I produce 1 row from the aggregator? Also, I would like to use Col1 to do a 'Non-missing values count output column'. I cannot do it because it is in the group. How do I do the count on the Col1 column? The Dummy column has the exact same value as Col1 when it comes from the transformer. I tried do it on the Dummy column but the result is the same. Once interesting thing, the number of rows out of the aggregator is sometimes different and sometimes same.
Thanks to anyone who responds.
I would try with creating dummyKeys as dKey1=col1, dKey2=col2 etc.Col1 through Col5 are also keys. I was forced to create a Dummy column because I could not include Col1 in the Group and also for 'Column For Calculation'. I was getting the error:
"It is not legal to set a reducer (Col1) to be the same as a key field"
Now, my problem is, the aggregator is producing 5 rows and I want it to produce 1 row
thank you
- prasad
- prasad
Yes, I did create only one column named Dummy. Can the partioning mode of the previous stage have anything to do with it? My previous stage to the aggregator is a Peek stage. After running, the link between the Peek and Aggregator stage shows a count of like this:
10000, 2000 rows/sec
and this causes 5 rows to be output. Sometime it is:
10000, 1666 rows/sec
and this causes 6 rows to be output. This is pretty strange!!
10000, 2000 rows/sec
and this causes 5 rows to be output. Sometime it is:
10000, 1666 rows/sec
and this causes 6 rows to be output. This is pretty strange!!
What's interesting is, I added one more aggregator stage after the first aggregator. In the 2nd one, I put all 9 columns in the group accept the total_count. I did a sum on the total_count. I still get the same number of rows, i.e., multiple rows. I checked sort. Still the same problem.
I am posting first 5 rows of the data.
---------------------------------------------------------------------------------
OHCLARK 010-00-00000-000-032 PENN CENTRAL TRANS CORP CO 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 32 BETHEL TOWNSHIP PUBLIC UTILITY REAL PROPERTY U 9216 000000000 39023052005 0000000000000000000000000000000 10 N N N
OHCLARK 010-00-00000-000-041 COLUMBIA GAS OF OHIO INC CO TAX DEPT 200 CIVIC CENTER DR 200 CIVIC CENTER DR COLUMBUS OH432154138 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 41 BETHEL TOWNSHIP PUBLIC UTILITY PERS PROPERTY P 0010 000000000 COLUMBUS, OH 43215 39023052005 0000000000000000000000000000000 10 N N Y C001
OHCLARK 010-00-00000-000-045 COLUMBIA GAS TRANSMISSION TAX DEPT 200 CIVIC CENTER DR 200 CIVIC CENTER DR COLUMBUS OH432154138 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 45 BETHEL TOWNSHIP PUBLIC UTILITY PERS PROPERTY P 0010 000000000 COLUMBUS, OH 43215 39023052005 0000000000000000000000000000000 10 N N Y C001
OHCLARK 010-00-00000-000-053 DAYTON POWER & LIGHT TAX DEPT PO BOX 8825 DAYTON OH45401 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 53 BETHEL TOWNSHIP PUBLIC UTILITY PERS PROPERTY P 0010 000000000 DAYTON, OH 45401 39023052005 0000000000000000000000000000000 10 N N N
OHCLARK 010-00-00000-000-073 OHIO BELL TELEPHONE SBC COMMUNICATIONS INC TAX DPT 1 BELL CTR 1 BELL CTR SAINT LOUIS MO631013004 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 73 BETHEL TOWNSHIP PUBLIC UTILITY PERS PROPERTY P 0010 000000000 ST LOUIS, MO 63101 39023052005 0000000000000000000000000000000 10 N N YRM 36-M-01 C074
---------------------------------------------------------------------------------
Thank you very much for your response.
I am posting first 5 rows of the data.
---------------------------------------------------------------------------------
OHCLARK 010-00-00000-000-032 PENN CENTRAL TRANS CORP CO 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 32 BETHEL TOWNSHIP PUBLIC UTILITY REAL PROPERTY U 9216 000000000 39023052005 0000000000000000000000000000000 10 N N N
OHCLARK 010-00-00000-000-041 COLUMBIA GAS OF OHIO INC CO TAX DEPT 200 CIVIC CENTER DR 200 CIVIC CENTER DR COLUMBUS OH432154138 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 41 BETHEL TOWNSHIP PUBLIC UTILITY PERS PROPERTY P 0010 000000000 COLUMBUS, OH 43215 39023052005 0000000000000000000000000000000 10 N N Y C001
OHCLARK 010-00-00000-000-045 COLUMBIA GAS TRANSMISSION TAX DEPT 200 CIVIC CENTER DR 200 CIVIC CENTER DR COLUMBUS OH432154138 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 45 BETHEL TOWNSHIP PUBLIC UTILITY PERS PROPERTY P 0010 000000000 COLUMBUS, OH 43215 39023052005 0000000000000000000000000000000 10 N N Y C001
OHCLARK 010-00-00000-000-053 DAYTON POWER & LIGHT TAX DEPT PO BOX 8825 DAYTON OH45401 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 53 BETHEL TOWNSHIP PUBLIC UTILITY PERS PROPERTY P 0010 000000000 DAYTON, OH 45401 39023052005 0000000000000000000000000000000 10 N N N
OHCLARK 010-00-00000-000-073 OHIO BELL TELEPHONE SBC COMMUNICATIONS INC TAX DPT 1 BELL CTR 1 BELL CTR SAINT LOUIS MO631013004 00000000000000000000000000000002004 10 0000000000 0000000000 0000000000 73 BETHEL TOWNSHIP PUBLIC UTILITY PERS PROPERTY P 0010 000000000 ST LOUIS, MO 63101 39023052005 0000000000000000000000000000000 10 N N YRM 36-M-01 C074
---------------------------------------------------------------------------------
Thank you very much for your response.
Here is the output from the aggregator:
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:39 ","PX_Extract_IDMA_Common ","023","0.0","1.66230000000000000E+04"
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:37 ","PX_Extract_IDMA_Common ","023","0.0","1.65800000000000000E+04"
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:38 ","PX_Extract_IDMA_Common ","023","0.0","1.66020000000000000E+04"
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:36 ","PX_Extract_IDMA_Common ","023","0.0","1.26510000000000000E+04"
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:40 ","PX_Extract_IDMA_Common ","023","0.0","4.94300000000000000E+03"
All lines start with "IDMA". Hope this helps. Thank you very much.
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:39 ","PX_Extract_IDMA_Common ","023","0.0","1.66230000000000000E+04"
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:37 ","PX_Extract_IDMA_Common ","023","0.0","1.65800000000000000E+04"
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:38 ","PX_Extract_IDMA_Common ","023","0.0","1.66020000000000000E+04"
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:36 ","PX_Extract_IDMA_Common ","023","0.0","1.26510000000000000E+04"
"IDMA ","OH","0.0","0.0","0.0","0.0","2006-08-29 11:15:40 ","PX_Extract_IDMA_Common ","023","0.0","4.94300000000000000E+03"
All lines start with "IDMA". Hope this helps. Thank you very much.
-
- Charter Member
- Posts: 822
- Joined: Sat Sep 17, 2005 5:25 pm
- Location: USA
I am grouping on the date column. When I removed it, number of rows went down to 2. It is still 2 rows. I am not doing a sum on the timestamp column. The keys I am grouping by are SrcTag, State, CtyFips, JobName. The interesting thing is, the exact same job on the Server version allows grouping on timestamp and still returns 1 row only. There must be something different on the PX side.
Actually, I executed this job in parallel and it is working fine. us1aslam1us 's point was right. The 5 columns were coming due to grouping on date. Now, I am getting 1 column. The only problem, I need to show the timestamp in the output as well. I am using 'Minimum value output column' but that is just displaying the year. I would like to display the minimum of the 5 unique timstamps.
-
- Charter Member
- Posts: 822
- Joined: Sat Sep 17, 2005 5:25 pm
- Location: USA