Logic to get the ID with last date

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
taral
Participant
Posts: 16
Joined: Fri Mar 26, 2010 1:41 am

Logic to get the ID with last date

Post by taral »

Have a sequential file
Columns:
EMP_ID EMP_SAL EMP_DATE
1 100 01/01/2009
4 200 01/01/2005
2 300 01/01/2011
2 400 01/01/2009
1 100 01/01/2010
3 200 01/01/2009
4 100 01/01/2007
3 300 01/01/2006
.
.
.
Output should contain the the emp_ID who has the latest date
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Sort by date and capture the last row (sort and tail commands).
Sort by date in descending order and capture the first row (sort and head commands).
Because this is a parallel job, you do have MKS Toolkit installed and can therefore use UNIX commands such as sort, tail and head.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
taral
Participant
Posts: 16
Joined: Fri Mar 26, 2010 1:41 am

Post by taral »

The out should contain Emp_id which contains latest date.
And what is head/tail commands?
devesh_ssingh
Participant
Posts: 148
Joined: Thu Apr 10, 2008 12:47 am

Post by devesh_ssingh »

Head and tail are unix commands.

head -n 1 filename--display 1st line of file
tail -n 1 filename----display last line of file
taral
Participant
Posts: 16
Joined: Fri Mar 26, 2010 1:41 am

Post by taral »

The input sequential file contains same emp_id but can have different date. The output should have single row containing emp_id, sal , date which is latest.
Looking at the example it should have output dataset as:
EMP_ID EMP_SAL EMP_DATE
1 100 01/01/2010
2 300 01/01/2011
3 200 01/01/2009
4 100 01/01/2007
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Please re-read my earlier post. It contains two solutions.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
taral
Participant
Posts: 16
Joined: Fri Mar 26, 2010 1:41 am

Post by taral »

did a sorting on both the fields emp_id and emp_date(secondary sort key)
then after use a remove duplicate stage (emp_id as a key column) to remove the duplicate.
Post Reply