Column Import stage - Implicit conversion

Amedhyaz
Participant
Posts: 6
Joined: Fri Mar 30, 2012 10:52 am

Column Import stage - Implicit conversion

Post by Amedhyaz »

A parallel job parses a source file using a schema file and a Column Import stage, and delivers the result to a dataset.

There are 62,815 records involved, with 4 timestamp columns to be delivered.

The job warns about a broken pipe and ends up aborting with a SIGSEGV.

Could the 62,815 * 4 implicit varchar-to-timestamp conversions be an issue?
Amedhyaz
DataStage & Metadata Workbench Developer
Information Server Administrator
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Yes - try using the StringToTimestamp function with an appropriate mask.
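
For example, in a Transformer derivation it would look something like this (the link and column names are placeholders, and the mask must match whatever format your source data actually uses):

    StringToTimestamp(lnk_in.EVENT_TS, "%yyyy-%mm-%dd %hh:%nn:%ss")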
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
Amedhyaz
Participant
Posts: 6
Joined: Fri Mar 30, 2012 10:52 am

Re: Column Import stage - Implicit conversion

Post by Amedhyaz »

Unfortunately, an explicit conversion won't help, as far as my understanding goes.

The whole purpose of using a "Column Import" stage is to parse the source text file, which is read as a single LongVarChar; break it down into a collection of records by means of the Orchestrate schema provided; and deliver the records to a dataset after successfully completing all needed conversions, again by referring to the hints in the Orchestrate schema file. That is to say, the record structure is implicit at design time and only known at run time.

My question is, therefore, as follows: from your experience, do you think the "Column Import" stage may have a hard time handling about 250,000 implicit conversions to timestamps?

A subsidiary question would be: Is it possible to give a mask to a timestamp column in an Orchestrate schema file?
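
What I have in mind would be a per-field format property along these lines (the timestamp_format property name and the mask are my reading of the import/export formatting options, and the column names are just placeholders):

    record {final_delim=end, delim='|', quote=none}
    (
      CUST_ID: int32;
      CREATED_TS: timestamp {timestamp_format='%yyyy-%mm-%dd %hh:%nn:%ss'};
    )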
Amedhyaz
DataStage & Metadata Workbench Developer
Information Server Administrator
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Sorry, missed that distinction. I don't think it would have a problem with that many records. However, you can confirm whether it's a size problem by breaking the source file into pieces and processing them individually. If they all process as smaller chunks, then it is some sort of buffering / size issue. If not, then it is a data problem.
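
For instance, on Unix something like this would give you chunks of roughly a quarter of the file each (the file name and line count are placeholders):

    split -l 16000 source_file.txt source_chunk_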

How certain are you that all the data matches the schema? I've had SIGSEGVs when an invalid timestamp (all zeros or all blanks) was in the data.

Can you check the relevant DSD.RUN file in the &PH& sub-directory of the project for your last run? Sometimes that has more detail about what caused the SIGSEGV.
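
From the project directory, something along these lines usually gets you to the newest one (the install path here is just an example, and the exact file naming can vary by release):

    cd /opt/IBM/InformationServer/Server/Projects/YourProject/'&PH&'
    ls -t DSD.RUN_* | head -1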
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020