Date Format - Fatal error

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ds_search2008
Participant
Posts: 75
Joined: Thu Nov 27, 2008 10:12 am

Date Format - Fatal error

Post by ds_search2008 »

Dear experts,

The date value from source is expected to be in "yyyy-mm-dd" format. But one of the source files has different formats "yyyy-mm-dd", "dd-mm-yyyy", "mm-dd-yyyy".(Mixture of correct and incorrect formats). I'm trying to eliminate the incorrect formats using the below derivation

Source field - CRTDST_DATE - VarChar (10)
Target field - CRTDST_DATE - Date(4)

Code: Select all

If Len (TransRec.CRTDST_DATE) < 10 OR Not(IsValid('Date',TransRec.CRTDST_DATE))  Then SetNull() Else  StringToDate(TransRec.CRTDST_DATE,"%yyyy-%mm-%dd")

Code: Select all

If Len (TransRec.CRTDED_DATE) < 10 OR Not(IsValid('Date',TransRec.CRTDED_DATE))  Then SetNull() Else  StringToDate(TransRec.CRTDED_DATE,"%yyyy-%mm-%dd")
The log is showing the following fatal error message

Transformer23,0: Data string '05-05-2007' does not match format '%yyyy-%mm-%dd': the value for tag %yyyy has fewer characters than expected.
Data string '05-31-2008' does not match format '%yyyy-%mm-%dd': the value for tag %yyyy has fewer characters than expected.

Any suggestions please.

Many Thanks & Kind Regards
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

You need to use the isvalid function with a date and a format string. Note that you cannot detected whether 2009-03-01 is YYYY-MM-DD or YYYY-DD-MM format.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You can try to locate the position of '-' to decide the format.

But how will you know whether '03-04-2009' is 3rd April 2009 or 4th March 2009?
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You can try to locate the position of '-' to decide the format.

But how will you know whether '03-04-2009' is 3rd April 2009 or 4th March 2009?
ds_search2008
Participant
Posts: 75
Joined: Thu Nov 27, 2008 10:12 am

Post by ds_search2008 »

Thanks a ton for your time and replies.

Sainath, our requirement is, any format other than "yyyy-mm-dd" is invalid.

To locate the position of '-' I can think of the following:

using 3 stage variables

Code: Select all

svDate=Left(TransRec.CRTDST_DATE,1,3) 
svMonth=Left(TransRec.CRTDST_DATE,4,6) 
svYear=Left(TransRec.CRTDST_DATE,7,10)
svConcat=svDate:svMonth:svYear
But I'm not sure how to check the correct format using this and reject the invalid date values. Kindly help me in this regard.

Thanks a lot & Regards
jcthornton
Premium Member
Premium Member
Posts: 79
Joined: Thu Mar 22, 2007 4:58 pm
Location: USA

Post by jcthornton »

Is this a free-text field that you are trying to convert (manual entry)? At first, I thought you were indicating that it had different date formats for different fields - but it sounds like this is all in one field.

As others have indicated, you are not going to always be able to tell what the correct conversion should be. Every date with the day part <= 12 is going to be suspect.

Without knowing what you are using this date for, it is hard to tell the impact of this on the quality of your results (and if this lack of quality is not material, why is the field being populated...). Is there any way you can get it from source in a consistent format? I am guessing not.

This leaves you stuck - you need to handle all cases in your conversion logic - which means it is going to have to be more complex than you have provided.

What I have done below is broken up the logic into a couple of steps within a transformer. I checked this and it works for dates in any of the dd-mm-yyyy, mm-dd-yyyy, and yyyy-mm-dd formats.

Stage Variables

Code: Select all

svYMD    if IsValid("date",DATE_IN) then 'Y' else 'N'
svDMY   if IsValid("date",DATE_IN[7,4]:'-':DATE_IN[4,2]:'-':DATE_IN[1,2]) then 'Y' else 'N'
sv MDY   if IsValie("date",DATE_IN[7,4]:'-':DATE_IN[1,2]:'-':DATE_IN[4,2]) then 'Y' else 'N'
Then my target field derivation is:

Code: Select all

if svYMD = 'Y' then StringToDate(DATE_IN,"%yyyy-%mm-%dd") else
if svDMY = 'Y' then StringToDate(DATE_IN,"%dd-%mm-%yyyy") else
if svMDY = 'Y' then StringToDate(DATE_IN,"%mm-%dd-%yyyy") else
setnull()
In ambiguous date cases, the day is assumed to be first instead of the month. Rearranging the order of the if-else clauses in the target field derivation can change this. Additional date formats can also be added with additional stage variables and a longer if-else statement.
Jack Thornton
----------------
Spectacular achievement is always preceded by spectacular preparation - Robert H. Schuller
ds_search2008
Participant
Posts: 75
Joined: Thu Nov 27, 2008 10:12 am

Post by ds_search2008 »

Hi Jack,

Thanks a lot for your kind help and time.
I tried this logic. But as my data is really bad I was not able to get the desired output. I tried this way:

Stage Variable : svValidYMD

Stage Variable Derivation

Code: Select all

if IsValid("date", TransRec.CRTDST_DATE[1,4]:'-':TransRec.CRTDST_DATE[6,2]:'-':TransRec.CRTDST_DATE[9,2]) 
then 'Y' else 'N'
Stage Variable: svChkNull
Stage Variable Derivation

Code: Select all

if IsNull(TransRec.CRTDST_DATE) Then 0 Else 1
Constraint - Valid path:

Code: Select all

svValidYMD ='Y' AND svChkNull=1
Constraint - Reject path:

Code: Select all

OTHERWISE
I'm able to reject the bad data and pass the valid records to target. However, the log still shows the following fatal error.

Code: Select all

Transformer23,0: Data string '05-0--2-07' does not match format '%yyyy-%mm-%dd': the value for tag %yyyy has fewer characters than expected.

Transformer23,0: Data string '--' does not match format '%yyyy-%mm-%dd': an integer was expected to match tag %yyyy.
Now the only concern is the warning messages. Could you please suggest me to suppress these warning message. Kindly may I request dear experts to guide me if I have made any mistake in the code.

Thanks a ton & Kind Regards
DWH-M
Premium Member
Premium Member
Posts: 46
Joined: Thu Sep 06, 2007 5:26 am

Post by DWH-M »

Can you try this? Hope this will avoid the warning.

Stage Variable : svValidYMD

if TransRec.CRTDST_DATE[3,1] <> '-' then if IsValid("date", TransRec.CRTDST_DATE[1,4]:'-':TransRec.CRTDST_DATE[6,2]:'-':TransRec.CRTDST_DATE[9,2])
then 'Y' else 'N' else 'N'


Stage Variable: svChkNull

if IsNull(TransRec.CRTDST_DATE) Then 0 Else 1


Constraint - Valid path:

svValidYMD ='Y' AND svChkNull=1
ds_search2008
Participant
Posts: 75
Joined: Thu Nov 27, 2008 10:12 am

Post by ds_search2008 »

Dear all,

I have resolved this issue. :D

I did the following:

Step:1 Included two transformers

Source column
CRTDST_DATE - Varchar(10) Nullable YES

TRANSFORMER1
-------------------

Stage Variable:

Definition: svValidYMD char(10)

Stage Variable Derivation:

Code: Select all

(Trans1Rec.CRTDST_DATE[1,4]:'-':Trans1Rec.CRTDST_DATE[6,2]:'-':Trans1Rec.CRTDST_DATE[9,2])
In column derivation

Code: Select all

If svValidYMD =Trans1Rec.CRTDST_DATE Then Trans1Rec.CRTDST_DATE Else SetNull()
Target column
CRTDST_DATE - Varchar(10) Nullable YEs


****************
TRANSFORMER2
-------------------
Source column
CRTDST_DATE - Varchar(10) Nullable YES


Stage Variable:

svChkNull - Integer 1

Stage Variable Derivation

Code: Select all

svChkNull - If IsNull(Trans2Rec.CRTDST_DATE) Then 0 Else 1
Contraint - Valid path
-----------

Code: Select all

IsNotNull(Trans2Rec.CRTDST_DATE) 
and set the otherwise/log option checked.
Also set the link to the down stream (Reject path) to Reject link.

Column derivation in against Trans2Rec.CRTDST_DATE - valid path

Code: Select all

StringToDate(Trans2Rec.CRTDST_DATE,"%yyyy-%mm-%dd")
Target column
CRTDST_DATE - DATE(10) Nullable NO

Job is working fine and the log is green.

Thanks a ton for all the valuable suggestions and replies.

Kind Regards
Post Reply