Say you have a sequential file as follows:
Col1 << name of column
ABC
DEF
GHI
The column is specified as char(6) in the columns tab
Say you have a database table "Lookup" as follows:
Col1 <<< name of column
ABC
GHI
JKL
The column is specified as char(6) in the columns tab
My job is using table Lookup as a reference lookup to validate the data in the sequential file. The transform code looks like:
StageVar1 = If IsNull(Lookup.Col1) Then "Invalid Col1" Else ""
Audit = StageVar1:StageVar2:...:StageVarN
InvalidRecord = Len(Audit) ne 0
So in the example above I would accept row 1 & 3 and reject row 2 from the sequential file.
My problem is that the data at runtime is:
Seq.Col1 = "ABC"
and
Lookup.Col1 = "ABC "
DS treats trailing spaces as significant (I sure wish I could set some option to force trailing spaces to be insignificant), so the lookups always fail.
Now, I know I can fiddle with TrimB to get this to work, but 1) it takes more coding to set this up, and 2) would likely affect performance.
I specified that Col1 was char(6) in the columns tab for the sequential file. So what magic incantation do I need in the Format tab to actually make the data comply with my datatype, i.e. be padded with spaces for the width of the column?
Please address answers to the question asked, not "use TrimB" or "recast your data to VarChar". If there's no way to get the sequential stage to pad the data with spaces, just let me know that.
Thanks,
Scott
Padding sequential file data with trailing spaces
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The Sequential File stage can not read spaces that aren't there in the data. Simply specifying Char(6) in the metadata does not magically make it so. If you want the trailing spaces you need to add them either in the file itself or in a Transformer stage within your job - a Fmt() function with "6L" as the second argument.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hi Ray,ray.wurlod wrote:The Sequential File stage can not read spaces that aren't there in the data. Simply specifying Char(6) in the metadata does not magically make it so.
Thanks for the reply. I'm not wanting DS to "magically make it so." However, I was hoping it would honour the datatype and length settings I'd set in the sequential file stage.
In another ETL product I've used, defining the metadata characteristics for a sequential file determine the internal memory buffers that get defined at runtime. So, if I read "ABC" into a numeric buffer, I'll get a null value and error message (unless I suppress it). If I read "ABC" into a char(6) buffer, I'll get "ABC^^^". Since a sequential file is essentially just text without datatype, the datatype is determined by the metadata you associate with the columns. This strict data typing makes my life as the programmer easier, as I don't have to write code (i.e. Fmt(blah,"6L") ) to enforce data typing, I just define the correct (or desired) metadata and the product "does my work for me."
I did some experimentation/hacking, and determined that, for sequential files, the metadata is useless. I imported a combination of char, int, and float data into any type of columns - num, int, char, varchar, binary. I even defined length = 0. Nothing made a difference. Same thing on output to sequential file.
I did brief testing on output to a DRS stage (SQL Server), and there DS did a better job on enforcing data type based on the metadata. For example, I would get warning messages if I tried to write char data to an int output, and the message was definitely coming from DS and not SQL Server.
It seems the only thing that matters for a sequential file is the column name, and that's really just a convenience to map your column position to a "friendly name" for mapping.
I wish DS enforced stricter data typing for sequential files, as I feel it would make my job as a programmer easier. And, if nothing else, DS should just gray out the columns in the column definition window (i.e. key, SQL type, Length, etc) for the sequential stage if they are just ignored anyway.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: