Issues with metadata copied from server job

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ukyrvd
Premium Member
Posts: 73
Joined: Thu Feb 10, 2005 10:59 am

Issues with metadata copied from server job

Post by ukyrvd »

Hi there,

I have a server job that reads from a sequential file and processes the data. Recently we got EE at our site, and we are now trying to convert this particular job to a parallel job (for performance reasons).

I saved the metadata/schema definition from the server job and used the same in the parallel job. When I try to view the data (or run the job), it gives errors like "delimiter not seen, at offset XX".

This is a delimited file and I have all the settings correct. We have many Char fields in this file, and even though I have specified it as a DELIMITED file, whenever it encounters a Char field it reads the full declared length of characters and only then looks for the delimiter.

I have changed the definitions to VarChar and it started behaving properly, but I don't think that is a real solution, right?
This is also happening for decimal fields.

Is there any setting I am missing?
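
The difference shows up directly in the record schema the parallel engine uses for the import. A rough sketch of a delimited record with a fixed-length Char next to a bounded VarChar (column names invented for illustration) might look like this:

    record {final_delim=end, delim=','}
    (
        cust_code:  string[10];      // Char(10): exactly 10 characters expected before the delimiter
        cust_city:  string[max=30];  // VarChar(30): read up to the delimiter, at most 30 characters
        amount:     decimal[5,2];    // Decimal with precision 5, scale 2
    )

With string[10] the reader consumes the declared length before looking for the delimiter, which is exactly the "delimiter not seen" behaviour described above.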


thanks in advance
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Do you have Char fields that are empty? Even though the file is delimited, the parallel job may be expecting some type of blank padding character. Parallel jobs are more particular about metadata than server jobs. It sounds like changing the fields to VarChar is the way to go.
ukyrvd
Premium Member
Posts: 73
Joined: Thu Feb 10, 2005 10:59 am

Post by ukyrvd »

Yes, we do have Char fields with no values. Even for non-empty values, if the length of the value doesn't match the length specified, we get errors.

We can safely change Char to VarChar .. but what about the decimal fields? They are giving similar errors :?

I am not sure if anybody else has run into this situation..

thank you
ukyrvd
Premium Member
Posts: 73
Joined: Thu Feb 10, 2005 10:59 am

Post by ukyrvd »

**bump**

:oops: just bumping this up .. sorry about that
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Server jobs have no data types.
Parallel jobs are strongly typed. As already noted, a Char(x) column (string[x] in a schema) must have precisely x characters.
A decimal value must have no more than the correct number of scale digits (to the right of the decimal point).
All numeric data types must be in range. For example, int8 must be between -128 and +127, and uint8 must be between 0 and 255.
And so on. What exactly is the error for mismatched decimals?
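
As a rough guide (a sketch, not an exhaustive list), the column definitions on the Columns tab map to schema types like this:

    Char(10)            ->  string[10]       fixed length, exactly 10 characters
    VarChar(10)         ->  string[max=10]   variable length, up to 10 characters
    Decimal(5,2)        ->  decimal[5,2]     5 digits in total, 2 after the decimal point
    TinyInt             ->  int8             range -128 to +127
    TinyInt (unsigned)  ->  uint8            range 0 to 255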
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ukyrvd
Premium Member
Posts: 73
Joined: Thu Feb 10, 2005 10:59 am

Post by ukyrvd »

Thank you Ray

For the decimals too it is looking for exactly x digits, where x is the length specified. Changing the definition to Real works .. but I have just checked the documentation, and it says Real is an "IEEE single-precision (32-bit) floating point value" .. so I don't think it's a generic solution :?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Decimal and Numeric have a fixed number of decimal places and can therefore be represented exactly in the machine using scaling (within the bounds of machine precision).

Real and Double have an arbitrary number of decimal places and no guarantee of accurate storage at the limits of precision can be made. That's why your "solution" worked.

Better would be to force your data to x decimal places early in the job, perhaps using a Modify stage with decimal_from_dfloat (or decimal_from_decimal) transformation.
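
A minimal sketch of such a Modify stage specification, assuming the incoming value arrives as a dfloat column named amount_in (the column names are made up for illustration):

    amount:decimal[5,2] = decimal_from_dfloat(amount_in)

This forces the value to a fixed scale as it passes through the stage; decimal_from_decimal works the same way when the source column is already a decimal of a different precision or scale.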
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ukyrvd
Premium Member
Posts: 73
Joined: Thu Feb 10, 2005 10:59 am

Post by ukyrvd »

Thanks a lot for the info Ray.

I am reading from a delimited sequential file that has decimal numbers.
I have the length defined as 5, but as I can't control what I get in the input file, if a value has fewer than 5 digits it starts complaining .. what would be the best way to deal with this case?


I will check if I can read that value as VarChar and use a Modify stage to convert it to decimal in the stream. What do you think of this approach?
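
For what it's worth, a rough sketch of that approach (AMT is an invented column name, and the exact conversion names may vary by version) would be to read the field as a variable-length string in the Sequential File stage and convert it in a Modify stage, for instance with the decimal_from_string conversion:

    // in the import schema
    AMT: string[max=10];

    // in the Modify stage specification
    AMT:decimal[5,2] = decimal_from_string(AMT)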

thanks again!!
elavenil
Premium Member
Posts: 467
Joined: Thu Jan 31, 2002 10:20 pm
Location: Singapore

Post by elavenil »

We faced a similar issue when our project migrated from Server to EE. What we did was specify a default value for all columns that were expected to have NULL values, and then transform them back to the original values before writing the output.
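
One way to implement that idea is with the Modify stage's null handling (the column name and token value here are invented for illustration). On the way in, replace NULLs with a placeholder value; before writing the output, turn the placeholder back into NULL:

    code_col = handle_null(code_col, 'XXXX')
    code_col = make_null(code_col, 'XXXX')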

HTWH.

Regards
Saravanan