Page 1 of 1

Logic help

Posted: Fri Mar 30, 2012 11:32 am
by vamsi.4a6
The source is oracle

id name St_Dt ed_Dt
-- --- ---- -----
1 a 10-jan-10 10-aug-11
1 a 11-jan-10 15-aug-11
2 b 11-jan-11 Null
2 b 12-feb-11 Null

O/p
--

id name St_Dt ed_Dt
-- --- ---- -----
1 a 10-jan-10 15-aug-11
2 b 12-feb-11 Null



For a particular id i need to fetch the record from source which is having maximum ed_dt,if it is null for all the records for particular key then i need to fetch the record which is having maximum St_Dt for that key

Posted: Fri Mar 30, 2012 11:35 am
by qt_ky
Try using Oracle's COALESCE() function in your SELECT statement.

Posted: Fri Mar 30, 2012 11:44 am
by vamsi.4a6
I want implement through Datastage

1)extracted fields from source
2)In sort stage
Hashparition on id with both sorting and partition
ed_Dt with only sorting in desc
key column for stage is ed_Dt with desc

If it is null means then i need to fetch based upon St_Dt.How to achieve this logic through Datastage?

Posted: Fri Mar 30, 2012 12:09 pm
by Jboyd
Could you put through a Remove Duplicates stage. Sort and partition on The key field (ID). Then Sort by End_Dt Desc then by Start_Date Desc, then select to retain the first record.

I believe that should work.

Posted: Fri Mar 30, 2012 2:41 pm
by chulett
Perhaps an Aggregator with min(start dt) and max(end date).

Re: Logic help

Posted: Fri Mar 30, 2012 6:14 pm
by qt_ky
vamsi.4a6 wrote:The source is oracle
I guess the Oracle tidbit is irrelevant.

Not sure if this helps or not.... In sorting you can set the Nulls position so nulls will sort first or last.

Posted: Sat Mar 31, 2012 10:14 am
by mobashshar
Do this.
1. Read Oracle source.
2. Use Copy stage to split the rows in Two links as St_Dt and Ed_Dt.
3. Use Remove Duplicate Stage. Sort and Partition on ID and Sort only on St_Dt with Ascending and keep First Row.
4. Use Remove Duplicate Stage. Sort and Partition on ID and Sort only on Ed_Dt with Ascending and keep Last Row.
5. Use Join Stage to join the rows from St_Dt and Ed_Dt link.
You will get the desired result.
Hope this helps.

Posted: Sat Mar 31, 2012 10:23 am
by srinivas.g
seperate the data using filter stage. first link is having ed_date not null records and second link is having ed_date null records after that use remove duplicate stage(first link) sort desc based on ed_date and keep first and same do it for second link sort based on st_date then use funnel to combine the both the links.
hope this will help you.

Posted: Sat Mar 31, 2012 2:46 pm
by ray.wurlod
Craig's solution is the most elegant offered. Group by id, min start_date, max end_date.

Re: Logic help

Posted: Sun Apr 01, 2012 4:59 pm
by Kryt0n
vamsi.4a6 wrote:The source is oracle

id name St_Dt ed_Dt
-- --- ---- -----
1 a 10-jan-10 10-aug-11
1 b 11-jan-10 15-aug-11
2 a 11-jan-11 Null
2 b 12-feb-11 Null

O/p
--

id name St_Dt ed_Dt
-- --- ---- -----
1 a 10-jan-10 15-aug-11
2 b 12-feb-11 Null



For a particular id i need to fetch the record from source which is having maximum ed_dt,if it is null then i need to fetch the record based on St_Dt
Your output doesn't match your rule, if you wanted the record the with max end date, you would have

Code: Select all

1   b     11-jan-10  15-aug-11
and not

Code: Select all

1   a     10-jan-10  15-aug-11
Sorting by end date desc with nulls first and start date desc then remove duplicates, keeping first would meet the rule you stated. Keeping as per your output example would require a combination of aggregator and something that determines which "name" you keep

Posted: Sun Apr 01, 2012 9:32 pm
by kandyshandy
Good catch.. the requirement itself is not clear ;)

Re: Logic help

Posted: Sun Apr 01, 2012 11:09 pm
by vamsi.4a6
Thanks for pointing the issue and please check the input and output.Now i hope it is correct.Can you please tell me know how to implement the above requirement

Posted: Mon Apr 02, 2012 6:51 am
by chulett
All you changed were the 'a' and 'b', so (to me anyway) that doesn't really change anything as I assumed they were wrong in the first place. However, you really haven't done anything to clarify your requirement.

"if it is null then i need to fetch the record based on St_Dt"

No idea what that means. We can see that for a 'normal' set you want the earliest start date with the latest end date, correct? However, when the end date is null it looks like you want the latest start date for that group, is that true?

Posted: Tue Apr 03, 2012 3:42 am
by vamsi.4a6
For a particulat key(id and name) i need to select the record which is having maximum start date and if it(start date) is null for all the records for that key i need to select based on end date

For better understanding i am explaining my requirement with two following examples

For ex-1

In this case St_Dt is not all for all the records for the key(1 and a)
id name St_Dt ed_Dt
-- --- ---- -----
1 a 10-jan-10 10-aug-11
1 a 11-jan-10 15-aug-11
1 a null 10-aug-11

o/p
--
id name St_Dt ed_Dt
-- --- ---- -----
1 a 11-jan-10 15-aug-11

For ex-2

id name St_Dt ed_Dt
-- --- ---- -----
2 b null 10-aug-11
2 b null 15-aug-11
2 b null 11-aug-11

here (start date) is null for all the records for the key ( 2 and b)so i need the fetch the record which is having maximum ed_dt

o/p
--
id name St_Dt ed_Dt
-- --- ---- -----
2 b null 15-aug-11

Posted: Tue Apr 03, 2012 5:33 am
by chulett
vamsi.4a6 wrote:For a particulat key(id and name) i need to select the record which is having maximum start date and if it(start date) is null for all the records for that key i need to select based on end date
Suddenly, the direction & nullability of the dates have reversed. :?

Now we're just at an Aggregator with max() on each date.