Date Conversion

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Date Conversion

Post by abhilashnair »

I have a column in source DB2 table which is defined as Varchar, It contains Date of Birth in the format '1st January 2011' (without quotes)

My target DB2 table has the same column but data type is Date. I need the above source data in this column in the same format as above..How can I do this ?
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

A date column in DB2 will have an internal numeric format representing the date, not a varchar such as your source column. It is when the date is extracted and converted to character that the visible format is set, whether it be YYYYMMDD, YYYYDDD, MMDDYYYY, DD MMM YYYY or whatever.

You need to clarify whether you are required to:

a) Simply load the source value (varchar) into the target column (date), converting datatypes in the process, or
b) Provide the date in the format you mention

If a: you can use a transformer to convert the 1st, 2nd, 12th, 23rd and so on to 1, 2, 12 and 23, followed by a call to StringToDate() with the appropriate format string (see the Parallel Job Developer Guide). Alternate: Split the day into a separate column and use a lookup table to do the conversion, followed by a transformer for the conversion to a Date datatype.

If b: You can't load a date datatype with the format you mention. You must either convert the source to a date datatype OR load it into a varchar column as it came in.

If the client requires that specific format in a downstream process, then they can extract the date from the database in that format using SQL. The DBDs and DBAs should know how to do that.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

My requirement is exactly the same as mentioned by you in point a).

That is Simply load the source value (varchar) into the target column (date), converting datatypes in the process

The downstream processes can extract the required format using SQL


But the issue here is that the source table has all columns defined as a varchar since it is a staging table used only for DataStage. The target table is the one used for downstream processes. So how do I convert data such as 1st January 2011 to date data type. The main issue issue here is length is variable as in if the date is single digit then it is like 1st January but if it is 2 digit it is like 23rd January...so I cannot use the substring function to elimimate the st, rd ,th and filter out digits

will iconv() oconv() work in this case ?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

abhilashnair wrote:will iconv() oconv() work in this case ?
These functions belong in server jobs or BASIC Transformer stage. They are extremely clever at handling date formats, but are not available in the parallel Transformer stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

abhilashnair wrote:I cannot use the substring function to elimimate the st, rd ,th and filter out digits
Sure you can... it won't be fun but it can be done. You can use Field() to pull the day and then month/year out of the date and then you'll know the last two characters of the day can always go. Pad out the remainder to two digits if one and then build up a 'more proper' date format from the pieces, one you can mask properly in a StringToDate() function.
-craig

"You can never have too many knives" -- Logan Nine Fingers
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

I used the Field Function and did this but now I am facing a rather strange problem. My source column data is like 23rd February 2011


I used the following stage variable. Pasted the derivation below
Note that the delimiter in Field is 1 space.

StringToDate(Field(COLNAME," ",3):'-':If Field(COLNAME," ",2)='January' Then '01' Else If Field(COLNAME," ",2)='February' Then '02'
Else If Field(COLNAME," ",2)='March' Then '03'
Else If Field(COLNAME," ",2)='April' Then '04'
Else If Field(COLNAME," ",2)='May' Then '05'
Else If Field(COLNAME," ",2)='June' Then '06'
Else If Field(COLNAME," ",2)='July' Then '07'
Else If Field(COLNAME," ",2)='August' Then '08'
Else If Field(COLNAME," ",2)='September' Then '09'
Else If Field(COLNAME," ",2)='October' Then '10'
Else If Field(COLNAME," ",2)='November' Then '11'
Else If Field(COLNAME," ",2)='December' Then '12'
Else '00':'-':If Len(Field(COLNAME," ",1)) = 3 Then '0':Field(COLNAME," ",1)[1,1] Else Field(COLNAME," ",1)[1,2],"%yyyy-%mm-%dd")



i am getting error

Data string '2011-02' does not match format '%yyyy-%mm-%dd: the data string has fewer characters than expected

Somehow it is not concatenating 23 at the end..So StringToDate is not functioning. But using a peek stage I can see the data as 2011-02-23..So the derivation is working fine...Not sure why string to date is not taking it Please advise
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Data string '2011-02' does not match format '%yyyy-%mm-%dd

Try using '%yyyy-%mm' instead.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

ray.wurlod wrote:Data string '2011-02' does not match format '%yyyy-%mm-%dd

Try using '%yyyy-%mm' instead. ...
Hi Ray,

Actually I need yyyy-mm-dd. The formula inside StringToDate is just giving yyyy-mm and passing it on to StringToDate as argument

The output of the derivation should be 2011-02-23....somehow it is giving 2011-02..this is causing that error...Now my question is why is it truncating 23..?
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

How about something a little simpler (and probably more efficient CPU-wise as well), like this:

Code: Select all

svDay:  Convert("abcdefghijklmnopqrstuvwxyz","",Field(COLNAME," ",1))
svDate: svDay : " " : Field(COLNAME," ",2) : " " : Field(COLNAME," ",3)
svCvDate: StringToDate(svDate,"%d %mmmm %yyyy")
where svDay and svDate are VarChar and svCvDate is a Date. svCvDate is the final result which you would send to the output. I broke it into three variables to make the logic more readable, but you can combine them.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

jwiles wrote:How about something a little simpler (and probably more efficient CPU-wise as well), like this:

Code: Select all

svDay:  Convert("abcdefghijklmnopqrstuvwxyz","",Field(COLNAME," ",1))
svDate: svDay : " " : Field(COLNAME," ",2) : " " : Field(COLNAME," ",3)
svCvDate: StringToDate(svDate,"%d %mmmm %yyyy")
where svDay and svDate are VarChar and svCvDate is a Date. svCvDate is the final result which you would send to the output. I broke it into three variables to make the logic more readable, but you can combine them.

Regards,
One question though..I have 5 such source columns. So will I need 15 stage variables ?
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

If broken apart as in my example, yes.

As I said, you can combine the logic into fewer derivations if you wish. You don't HAVE to perform it using three stage variables per source column...you don't even need stage variables for this. How I provided my example was primarily for readability to show the individual pieces of the logic more clearly.

The concatenation logic for svDate is only one way of putting the pieces back together. Using Len(), Field(), Right() and a little math would be just as effective.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

Additionally, breaking apart the derivations during development can help debug problems, such as the one you are experiencing (2011-12 instead of 2011-12-31).

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

One more, I promise!

In the derivation you were experiencing problems with, enclose the month If-then-else construct in parenthesis (..........). The compiler is including the day logic as part of the final Else expression of the month because you don't have the parenthesis around that to logically separate it.

The only way you would've seen the day is if the source month string was invalid.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
abhilashnair
Participant
Posts: 284
Joined: Fri Oct 13, 2006 4:31 am

Post by abhilashnair »

Thanks jwiles. It is working now. I did not use the stage variables. I just used the earlier concatenation formula this time enclosing the Ifs and Else in brackets.
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

Glad it now works! Keep in mind that that particular logic is not as efficient, nor considered best practice coding, largely due to calling Field() multiple times--as many as 15 or 16--per date column per record. You might consider using stage variables to store the result of Field(COLNAME," ",2). My preference is to allow StringToDate to handle the month name as it should be more efficient than if-then-else logic in the derivation.

That's my performance tuning side sneaking out...too many years of keeping mainframe cpu usage by my programs to a minimum!

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
Post Reply