Combine Dates
Moderators: chulett, rschirm, roy
Combine Dates
I have an input file: Key, EffDt, ExpDt
The data looks like
1111 2002-01-01 2002-07-01
1111 2002-07-01 2003-01-01
1111 2003-01-01 2004-01-01
2222 2002-01-01 2003-01-01
2222 2003-01-01 2004-01-01
2222 2006-01-01 2007-01-01
I need the final result looks like
1111 2002-01-01 2004-01-01
2222 2002-01-01 2004-01-01
2222 2006-01-01 2007-01-01
I hashed input file by Key and sorted it by Key and EffDt. After that I used transform. The output looks good, but last records were not written (or there were written with the Key = " ".
Is it any other technique appropriate for this process? Please help!!!
The data looks like
1111 2002-01-01 2002-07-01
1111 2002-07-01 2003-01-01
1111 2003-01-01 2004-01-01
2222 2002-01-01 2003-01-01
2222 2003-01-01 2004-01-01
2222 2006-01-01 2007-01-01
I need the final result looks like
1111 2002-01-01 2004-01-01
2222 2002-01-01 2004-01-01
2222 2006-01-01 2007-01-01
I hashed input file by Key and sorted it by Key and EffDt. After that I used transform. The output looks good, but last records were not written (or there were written with the Key = " ".
Is it any other technique appropriate for this process? Please help!!!
Sort your input by all three columns in the order they appear. Create a transform stage which stores the lastKey, MinimumEffDt and MaximumExpDt as stage variables. Put a constraint to output a row only when the Key changes and output the stored values.
N.B. You will need to add a dummy row to the input stream in order to ensure that the last value is output as well.
N.B. You will need to add a dummy row to the input stream in order to ensure that the last value is output as well.
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
You only need to partition on Key. Including EffDt in the partitioning may lead to inconsistent results. Though you still may want to sort on EffDt.
Could you, instead, use an Aggregator stage, grouping on Key and generating Min(EffDt) and Max(ExpDt) ?
The same partitioning considerations as above apply.
Could you, instead, use an Aggregator stage, grouping on Key and generating Min(EffDt) and Max(ExpDt) ?
The same partitioning considerations as above apply.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Re: Combine Dates
hello dear, i am little doubtful about your output data, if you could be little precise on output data, we can find some proper solution.
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
Re: Combine Dates
You can get desired out put if you use a combination of Sort stage, Transformer stage and Remove Duplicate stage. In sort stage hash partition and sort the incoming records by Key. In sort properties, specify 'key' as previously sorted and specify EffDt and ExpDt to be sorted in asc order. Also set cluster key change column as true and map all these columns to the output. In the transformer assign stage variables:
1) SvEffDt --> if In.clusterKeyChange = 1 then In.EffDt Else SvEffDt
2) SvPrevSvEffDt --> SvCurrEffDt
3) SvCurrEffDt --> In.EffDt
4) SvPrevSvExpDt --> SvCurrExpDt
5) SvCurrExpDt --> In.ExpDt
6) SvNewRowInd --> if In.clusterKeyChange = 1 then SvNewRowInd+1 Else if CurrStDt <> PrevEndDt Then SvNewRowInd+1 Else SvNewRowInd
In Transformer main derivation part:
In.Key --> Out.Key
SvEffDt --> Out.EffDt
In.ExpDt --> Out.ExpDt
RowNum --> SvNewRowInd
Now in remove duplicate stage specify key as RowNum and retain last.
1) SvEffDt --> if In.clusterKeyChange = 1 then In.EffDt Else SvEffDt
2) SvPrevSvEffDt --> SvCurrEffDt
3) SvCurrEffDt --> In.EffDt
4) SvPrevSvExpDt --> SvCurrExpDt
5) SvCurrExpDt --> In.ExpDt
6) SvNewRowInd --> if In.clusterKeyChange = 1 then SvNewRowInd+1 Else if CurrStDt <> PrevEndDt Then SvNewRowInd+1 Else SvNewRowInd
In Transformer main derivation part:
In.Key --> Out.Key
SvEffDt --> Out.EffDt
In.ExpDt --> Out.ExpDt
RowNum --> SvNewRowInd
Now in remove duplicate stage specify key as RowNum and retain last.
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
That would be :
1111 2002-01-01 2004-01-01
2222 2002-01-01 2007-01-01
What OP posted in requirement is :
1111 2002-01-01 2004-01-01
2222 2002-01-01 2004-01-01
2222 2006-01-01 2007-01-01
1111 2002-01-01 2004-01-01
2222 2002-01-01 2007-01-01
What OP posted in requirement is :
1111 2002-01-01 2004-01-01
2222 2002-01-01 2004-01-01
2222 2006-01-01 2007-01-01
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
-
- Premium Member
- Posts: 783
- Joined: Mon Jan 16, 2006 10:17 pm
- Location: Sydney, Australia
a simple solution as suggested by ArndW
Sort your input by all three columns in the order they appear
and then have this stage variables:
1) svNewEffDt---> If PrevExpDt<=CurrExpDt Then In.EffDt Else svNewEffDt
2) PrevExpDt=CurrExpDt
3) CurrEffDt = In.EffDt
4) CurrExpDt =In.ExpDt
5) Prevkey=CurrKey
6) CurrKey = In.key
in contraint, condition is
(PrevExpDt<=CurrExpDt) or ( Prevkey=CurrKey )
Sort your input by all three columns in the order they appear
and then have this stage variables:
1) svNewEffDt---> If PrevExpDt<=CurrExpDt Then In.EffDt Else svNewEffDt
2) PrevExpDt=CurrExpDt
3) CurrEffDt = In.EffDt
4) CurrExpDt =In.ExpDt
5) Prevkey=CurrKey
6) CurrKey = In.key
in contraint, condition is
(PrevExpDt<=CurrExpDt) or ( Prevkey=CurrKey )
Re: Combine Dates
Thank you Josh. It works.JoshGeorge wrote:You can get desired out put if you use a combination of Sort stage, Transformer stage and Remove Duplicate stage. In sort stage hash partition and sort the incoming records by Key. In sort properties, specify 'key' as previously sorted and specify EffDt and ExpDt to be sorted in asc order. Also set cluster key change column as true and map all these columns to the output. In the transformer assign stage variables:
1) SvEffDt --> if In.clusterKeyChange = 1 then In.EffDt Else SvEffDt
2) SvPrevSvEffDt --> SvCurrEffDt
3) SvCurrEffDt --> In.EffDt
4) SvPrevSvExpDt --> SvCurrExpDt
5) SvCurrExpDt --> In.ExpDt
6) SvNewRowInd --> if In.clusterKeyChange = 1 then SvNewRowInd+1 Else if CurrStDt <> PrevEndDt Then SvNewRowInd+1 Else SvNewRowInd
In Transformer main derivation part:
In.Key --> Out.Key
SvEffDt --> Out.EffDt
In.ExpDt --> Out.ExpDt
RowNum --> SvNewRowInd
Now in remove duplicate stage specify key as RowNum and retain last.
the only thing i changed was:
If SvCurrEffDt <> SvPrevSvExpDt Then SvCurrEffDt Else SvEffDt --> Out.EffDt
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
keshav0307's solution won't work because
(PrevExpDt<=CurrExpDt) or ( Prevkey=CurrKey ) won't restrict records to pass the way you want. You will get all the first 5 records.
(PrevExpDt<=CurrExpDt) or ( Prevkey=CurrKey ) won't restrict records to pass the way you want. You will get all the first 5 records.
keshav0307 wrote:a simple solution as suggested by ArndW
Sort your input by all three columns in the order they appear
and then have this stage variables:
1) svNewEffDt---> If PrevExpDt<=CurrExpDt Then In.EffDt Else svNewEffDt
2) PrevExpDt=CurrExpDt
3) CurrEffDt = In.EffDt
4) CurrExpDt =In.ExpDt
5) Prevkey=CurrKey
6) CurrKey = In.key
in contraint, condition is
(PrevExpDt<=CurrExpDt) or ( Prevkey=CurrKey )
Last edited by JoshGeorge on Fri Jun 15, 2007 8:32 pm, edited 1 time in total.
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
Replied / Edited inline Pointing out the obvious from the code is not indicting.
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>