Page 1 of 1

padding space to fixed width file

Posted: Tue May 26, 2009 2:34 pm
by peddidsx
hello All,

I am facing one problem while reading data from sequential file.
Here is what i am doing
I am reading data from fixed with file using sequential file stage. The file is having huge number of records but 2000+ records are getting rejected because the length is not coming as expected that means the value is not coming for some columns.
Now i have to populate the records to target table even though the lenght is shorted by padding spaces.
I tried APT_STRING_PADCHAR with 0x0 and 0x20 but still the records are getting rejected.
Any soultion?? Kindly let me know

Posted: Tue May 26, 2009 3:19 pm
by nagarjuna
Could you please provide an example ? i want to know what type of records are rejected ?

Posted: Tue May 26, 2009 5:40 pm
by Kryt0n
Is your file also delimited? If so, try setting the "Null field value" under Format->Field defaults to '' (two single quotes). If no delimiters, then how does datastage know which field is missing? Get source to fix the file. If it is fixed width it should be fixed width throughout not when they fancy.

Posted: Tue May 26, 2009 8:27 pm
by peddidsx
Kryt0n wrote:Is your file also delimited? If so, try setting the "Null field value" under Format->Field defaults to '' (two single quotes). If no delimiters, then how does datastage know which field is missing? Get source to fix the file. If it is fixed width it should be fixed width throughout not when they fancy.
Hey,
it is not a delimited file, it is fixed with file. the record length should be 2648 but for some records i am getting only 2642 and these records are getting rejected while reading data from file itself. is there any way to read these rejected records like padding the spaces, kindly let me know.
Thanks
Madhu

Posted: Tue May 26, 2009 9:41 pm
by nagarjuna
You need to convert your all records to some fixed lenghth.Try reading the entire record as varchar and put your target as char whose length equal to 2648 .within the job initialise APT_STRING_PADCHAR to space ....so now in the target file you will be having all records of fixed length =2648..from this file do whatever operations you want ...

Posted: Tue May 26, 2009 9:45 pm
by Kryt0n
nagarjuna wrote:You need to convert your all records to some fixed lenghth.Try reading the entire record as varchar and put your target as char whose length equal to 2648 .within the job initialise APT_STRING_PADCHAR to space ....so now in the target file you will be having all records of fixed length =2648..from this file do whatever operations you want ...
Does not answer which field is missing. What if it is the first column missing, you are going to assign values incorrectly. As per my first post, source needs to fix

Posted: Tue May 26, 2009 10:01 pm
by nagarjuna
opps..here i thought that the last 6 characters are missing ...

Kryt0n wrote:
nagarjuna wrote:You need to convert your all records to some fixed lenghth.Try reading the entire record as varchar and put your target as char whose length equal to 2648 .within the job initialise APT_STRING_PADCHAR to space ....so now in the target file you will be having all records of fixed length =2648..from this file do whatever operations you want ...
Does not answer which field is missing. What if it is the first column missing, you are going to assign values incorrectly. As per my first post, source needs to fix

Posted: Fri Jun 05, 2009 4:18 am
by babbu
HI,

I faced a similar scenario.

I put all my source columns as char in the seq file and it worked.

records were not rejected

Posted: Fri Jun 05, 2009 4:35 am
by miwinter
On the seq file stage, under format/record-level, try adding the fill char property with a value of space

Posted: Fri Jun 05, 2009 6:37 am
by sjfearnside
Why not process your correct records and save the rejects to a file and determine what fields are missing? This could be done by having the source file owner review/correct the records in error. If you alter the job to accept bad data what will be the downstream affects of the bad data to the system?

Posted: Fri Jun 05, 2009 9:42 am
by anu123
sjfearnside wrote:Why not process your correct records and save the rejects to a file and determine what fields are missing? This could be done by having the source file owner review/correct the records in error. If you alter the job to accept bad data what will be the downstream affects of the bad data to the system?

I think source system should correct the file. All records must be of lenght 2648 to read and map to right target column. Padding with 'spaces' will juble the positions and downstream systems if any will receive incrrect data. I have worked with similar issues but worked with surce systems to agree and stick to a file layout.

-thanks

Posted: Fri Jun 05, 2009 9:47 am
by miwinter
Maybe so... but source systems get things wrong... ETL's purpose is/can be to validate the data for its quality and either provide handling for scenarios where mistakes are made or standards aren't met, or to report out errors. "Getting the source system to sort it" isn't always feasible in the real world.

Posted: Fri Jun 05, 2009 9:55 am
by anu123
miwinter wrote:Maybe so... but source systems get things wrong... ETL's purpose is/can be to validate the data for its quality and either provide handling for scenarios where mistakes are made or standards aren't met, or to report out errors. "Getting the source system to sort it" isn't always feasible in the real world.

Agreed.
If it is not feasible then we can tap all the records which are short f 2648 bytes and handle them seperately or report them.

padding space to fixed width file

Posted: Thu Oct 21, 2010 3:12 pm
by dganeshm
awk '{printf "%-<your desired length>s\n",$0}' input_file_name > output_file_name

Example : awk '{printf "%-203s\n",$0}' OA001.M02.TERA2 > OA001.M02.TERA3


Use the above command and make the file fixed width of your desired length then you need not worry about a few records being short. This will add only spaces.