padding space to fixed width file

peddidsx · Post by **peddidsx** » Tue May 26, 2009 2:34 pm

hello All,

I am facing one problem while reading data from sequential file.
Here is what i am doing
I am reading data from fixed with file using sequential file stage. The file is having huge number of records but 2000+ records are getting rejected because the length is not coming as expected that means the value is not coming for some columns.
Now i have to populate the records to target table even though the lenght is shorted by padding spaces.
I tried APT_STRING_PADCHAR with 0x0 and 0x20 but still the records are getting rejected.
Any soultion?? Kindly let me know

nagarjuna · Post by **nagarjuna** » Tue May 26, 2009 3:19 pm

Could you please provide an example ? i want to know what type of records are rejected ?

Kryt0n · Post by **Kryt0n** » Tue May 26, 2009 5:40 pm

Is your file also delimited? If so, try setting the "Null field value" under Format->Field defaults to '' (two single quotes). If no delimiters, then how does datastage know which field is missing? Get source to fix the file. If it is fixed width it should be fixed width throughout not when they fancy.

peddidsx · Post by **peddidsx** » Tue May 26, 2009 8:27 pm

Kryt0n wrote:Is your file also delimited? If so, try setting the "Null field value" under Format->Field defaults to '' (two single quotes). If no delimiters, then how does datastage know which field is missing? Get source to fix the file. If it is fixed width it should be fixed width throughout not when they fancy.

Hey,
it is not a delimited file, it is fixed with file. the record length should be 2648 but for some records i am getting only 2642 and these records are getting rejected while reading data from file itself. is there any way to read these rejected records like padding the spaces, kindly let me know.
Thanks
Madhu

nagarjuna · Post by **nagarjuna** » Tue May 26, 2009 9:41 pm

You need to convert your all records to some fixed lenghth.Try reading the entire record as varchar and put your target as char whose length equal to 2648 .within the job initialise APT_STRING_PADCHAR to space ....so now in the target file you will be having all records of fixed length =2648..from this file do whatever operations you want ...

Kryt0n · Post by **Kryt0n** » Tue May 26, 2009 9:45 pm

nagarjuna wrote:You need to convert your all records to some fixed lenghth.Try reading the entire record as varchar and put your target as char whose length equal to 2648 .within the job initialise APT_STRING_PADCHAR to space ....so now in the target file you will be having all records of fixed length =2648..from this file do whatever operations you want ...

Does not answer which field is missing. What if it is the first column missing, you are going to assign values incorrectly. As per my first post, source needs to fix

nagarjuna · Post by **nagarjuna** » Tue May 26, 2009 10:01 pm

opps..here i thought that the last 6 characters are missing ...

Kryt0n wrote:
nagarjuna wrote:You need to convert your all records to some fixed lenghth.Try reading the entire record as varchar and put your target as char whose length equal to 2648 .within the job initialise APT_STRING_PADCHAR to space ....so now in the target file you will be having all records of fixed length =2648..from this file do whatever operations you want ...
Does not answer which field is missing. What if it is the first column missing, you are going to assign values incorrectly. As per my first post, source needs to fix

babbu · Post by **babbu** » Fri Jun 05, 2009 4:18 am

HI,

I faced a similar scenario.

I put all my source columns as char in the seq file and it worked.

records were not rejected

miwinter · Post by **miwinter** » Fri Jun 05, 2009 4:35 am

On the seq file stage, under format/record-level, try adding the fill char property with a value of space

sjfearnside · Post by **sjfearnside** » Fri Jun 05, 2009 6:37 am

Why not process your correct records and save the rejects to a file and determine what fields are missing? This could be done by having the source file owner review/correct the records in error. If you alter the job to accept bad data what will be the downstream affects of the bad data to the system?

anu123 · Post by **anu123** » Fri Jun 05, 2009 9:42 am

sjfearnside wrote:Why not process your correct records and save the rejects to a file and determine what fields are missing? This could be done by having the source file owner review/correct the records in error. If you alter the job to accept bad data what will be the downstream affects of the bad data to the system?

I think source system should correct the file. All records must be of lenght 2648 to read and map to right target column. Padding with 'spaces' will juble the positions and downstream systems if any will receive incrrect data. I have worked with similar issues but worked with surce systems to agree and stick to a file layout.

-thanks

miwinter · Post by **miwinter** » Fri Jun 05, 2009 9:47 am

Maybe so... but source systems get things wrong... ETL's purpose is/can be to validate the data for its quality and either provide handling for scenarios where mistakes are made or standards aren't met, or to report out errors. "Getting the source system to sort it" isn't always feasible in the real world.

anu123 · Post by **anu123** » Fri Jun 05, 2009 9:55 am

miwinter wrote:Maybe so... but source systems get things wrong... ETL's purpose is/can be to validate the data for its quality and either provide handling for scenarios where mistakes are made or standards aren't met, or to report out errors. "Getting the source system to sort it" isn't always feasible in the real world.

Agreed.
If it is not feasible then we can tap all the records which are short f 2648 bytes and handle them seperately or report them.

dganeshm · Post by **dganeshm** » Thu Oct 21, 2010 3:12 pm

awk '{printf "%-<your desired length>s\n",$0}' input_file_name > output_file_name

Example : awk '{printf "%-203s\n",$0}' OA001.M02.TERA2 > OA001.M02.TERA3

Use the above command and make the file fixed width of your desired length then you need not worry about a few records being short. This will add only spaces.