Selecting Max(date) column with multiple rows

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

adityavarma
Premium Member
Premium Member
Posts: 104
Joined: Thu Jul 12, 2007 11:32 pm
Location: Canada

Selecting Max(date) column with multiple rows

Post by adityavarma »

Hi,

I have a requirement to select only records with latest date for a code.
Below is the example:

code seq_no ref_date
------ -------- ----------
101 100 20100803
101 100 20100804
100 101 20100803
200 100 20100805
201 100 20100804

The ouput should be like below:

100 101 20100803
101 100 20100804
200 100 20100805
201 100 20100804

Can you please help with this?
satyanarayana
Participant
Posts: 13
Joined: Fri Jul 15, 2005 12:01 am

Re: Selecting Max(date) column with multiple rows

Post by satyanarayana »

SELECT code, seq_no, ref_date from <TABLE NAME> OUTT
WHERE ref_date =(SELECT MAX( ref_date ) from <TABLE NAME> INN
WHERE INN.code=OUTT.code )
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Re: Selecting Max(date) column with multiple rows

Post by mouthou »

If you meant in Datastage and your source is a file, it can be done using Aggregator stage by taking of the max(ref_date) and grouping by code & seq_no
adityavarma
Premium Member
Premium Member
Posts: 104
Joined: Thu Jul 12, 2007 11:32 pm
Location: Canada

Re: Selecting Max(date) column with multiple rows

Post by adityavarma »

mouthou wrote:If you meant in Datastage and your source is a file, it can be done using Aggregator stage by taking of the max(ref_date) and grouping by code & seq_no
Thanks for your reply.

Yes, I meant in Datastage and source is a file. I have tried with Aggregator Stage, but got errors.

Aggregator_22: Error when checking operator: It is not legal to set a reducer (ref_date) to be the same as a key field,
when the type of that field (string) is not a double.
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Re: Selecting Max(date) column with multiple rows

Post by mouthou »

What are the key columns you defined in the column definition. I guess you kept ref_date also a key. You cant keep ref_date since it is part of your aggregation function (max())
adityavarma
Premium Member
Premium Member
Posts: 104
Joined: Thu Jul 12, 2007 11:32 pm
Location: Canada

Re: Selecting Max(date) column with multiple rows

Post by adityavarma »

No, I have not kept ref_date as Key column. Only code is the key column.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

An exact search for "It is not legal to set a reducer" should answer your question.
-craig

"You can never have too many knives" -- Logan Nine Fingers
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Post by mouthou »

hello chulett,

The search didnt yield much of a resolution to this issue. One post talked about using the key which has been clarified by adityavarma as the column 'ref_date' is not defined as the key.

The other post talked about a workaround of using Sort and De-dup etc. Do you think it is the only solution. :roll:

thanks,
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

It would be nice if you give some details on what you have done to achive this.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

A "reducer" is a function which causes fewer rows to exit than entered, sum() for example. The error should mean you are trying to sum on a grouping (key) field.

That's my understanding anyway.
-craig

"You can never have too many knives" -- Logan Nine Fingers
allavivek
Premium Member
Premium Member
Posts: 211
Joined: Sat May 01, 2010 5:07 pm

Post by allavivek »

Hi ,

As per my knowledge the output of Aggregate stage will be double ...

As the PO is using max for date field , should we enable preseve type to true in aggr stage, so that the out put will be date data type...

.....
PhilHibbs
Premium Member
Premium Member
Posts: 1044
Joined: Wed Sep 29, 2004 3:30 am
Location: Nottingham, UK
Contact:

Post by PhilHibbs »

allavivek wrote:As the PO is using max for date field , should we enable preseve type to true in aggr stage, so that the out put will be date data type...
What's this about a "preserve type" option? I'm trying to do exactly this - get the max value of a date over a set of data - and I'm getting a Double field as the result. Any tips on how I get this back into a date? Should I do it the longhand way, by converting the date to a number (days since somewhen) and then back into a date afterwards?
Phil Hibbs | Capgemini
Technical Consultant
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Output of min() and max() functions are by default dfloat (double) and hence if it needs to be done of columns of different datatype, preserve datatype flag is used. The flag is just to make datastage preserve the datatype instead of defaulting it to dfloat.

It can be found in sub property of 'Column for calculation'.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You should be able to find that option buried in the stage somewhere, Phil, I wish I could be more specific and tell you exactly where. Another option for your max date would be to convert it to external ISO format, a string in YYYY-MM-DD format for example and then max() that in the stage. You could always convert it back afterwards.
-craig

"You can never have too many knives" -- Logan Nine Fingers
PhilHibbs
Premium Member
Premium Member
Posts: 1044
Joined: Wed Sep 29, 2004 3:30 am
Location: Nottingham, UK
Contact:

Post by PhilHibbs »

priyadarshikunal wrote:It can be found in sub property of 'Column for calculation'.
Nope, no "Preserve type" attribute...
Image
Phil Hibbs | Capgemini
Technical Consultant
Post Reply