Page 1 of 1

Sequential File - redefining a Varchar as Time

Posted: Thu Jul 29, 2010 3:19 am
by PhilHibbs
Initially, all my Sequential File fields were defined as Varchar and I converted them to proper typed fields in a Transformer with StringToDate and StringToTimestamp.

Now I want to redefine them as Date and Time fields in the source, and drop the Transformer stage.

I set the Type defaults / Time / Format string = %hh:%nn but I'm getting the following warning and no rows:

Field "StartTime" has import error and no default value; data: {2 3 : 0 0}, at offset: 27

What's wrong with 23:00 as a Time value in the format %hh:%nn? Here's my full row of data:
"Run Date","Start Date","Start Time","End Date","End Time"
"2010-07-28","2010-07-27","23:00","2010-08-08","23:00"

When I had a Transformer converting the date and time to a Timestamp with StringToTimestamp( in.StartDate : ' ' : in.StartTime, "%yyyy-%mm-%dd %hh:%nn" ) it worked fine.

I've searched and read most of the results, but those are all about nulls or incorrect contents e.g. strings in numerics or trailing delimiters.

One thing I'm not sure I've done right is to do with the Length - I've kept them as they were for the Varchar fields i.e. the StartTime is a Time field with a Length of 5. Is this correct? *update*: it doesn't seem to make any difference if I remove the Length setting.

*Update*: Or, is it just a bad idea to use data types to specify a CSV, should I just make them all Varchar and then use a Modify stage?

*update*: Back to VarChar with a Transformer, couldn't get a Modify stage to recognize "StringToDate" function.

Posted: Thu Jul 29, 2010 3:23 pm
by ray.wurlod
Modify stage functions tend to be expressed as "from", for example date_from_string().

I'm surprised about %hh:%nn not working against 23:00. This might be a bug in the Sequential File stage parser. Are you able to modify the source data so that the time has seconds (23:00:00) as a test?

Posted: Thu Jul 29, 2010 7:20 pm
by kris007
I had trouble reading the data with datatype set to Time and format set to %hh:%nn as well. However, when I changed the data to 23:00:00 I was able to read the record successfully. As Ray pointed out, could be a bug.

Posted: Fri Jul 30, 2010 2:28 am
by PhilHibbs
ray.wurlod wrote:I'm surprised about %hh:%nn not working against 23:00. This might be a bug in the Sequential File stage parser. Are you able to modify the source data so that the time has seconds (23:00:00) as a test?
Yes, that works.

Posted: Fri Jul 30, 2010 6:19 am
by PhilHibbs
ray.wurlod wrote:Modify stage functions tend to be expressed as "from", for example date_from_string().
Oh yes, I see, it has its own set of functions.

Posted: Tue Aug 03, 2010 10:22 am
by PhilHibbs
ray.wurlod wrote:I'm surprised about %hh:%nn not working against 23:00. This might be a bug in the Sequential File stage parser.
It's not just the SF parser - '%hh:%nn' doesn't work in a Transformer either. However, there you have the luxury of being able to write StringToTime( in.field : ':00', '%hh:%nn:%ss ) whereas you can't do that in a Sequential File format. So if your input format is immutable at '23:00' then you have to define the SF field as Char or VarChar and then use a Modify or Transformer to convert it. Sub-optimal, IMHO.

Posted: Tue Aug 03, 2010 3:44 pm
by ray.wurlod
Curiously, possibly not the case.

:idea: One of the tuning tips mentioned at IOD 2009 is to read with a Sequential File stage (sequentially) and parse with a Transformer or Column Import stage (in parallel) - this combination often runs faster than parsing in the Sequential File stage.