
How to handle a newline character when reading from a sequential file

Posted: Thu Nov 17, 2011 2:30 am
by kirankumarreddydesireddy
Hi

We need to handle a newline character inside one of the fields of a text file while reading it with the Sequential File stage in server jobs (version 7.5) on Windows.

Can anyone let me know whether this can be handled in DataStage jobs, or do I need to use a batch script for it?

I'd appreciate your response.



Note:
We have already handled a newline character in a field when both the source and the target are databases. While reading from the source database we used a replace function to swap the newline for a dummy value and loaded the result into a sequential file; when loading into the target, we replaced the dummy value from the sequential file with the newline character again.
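For readers who want to see the dummy-value round trip spelled out, here is a minimal Python sketch of the idea, not the actual job logic. The `<NL>` token and both function names are assumptions made up for illustration, and the sketch assumes the embedded newline is a plain LF and that the token never occurs in real data.

```python
# Hypothetical placeholder token; it must never appear in the real data.
PLACEHOLDER = "<NL>"

def encode_for_flat_file(value: str) -> str:
    """Swap embedded newlines for the placeholder before writing the flat file."""
    return value.replace("\n", PLACEHOLDER)

def decode_for_target(value: str) -> str:
    """Restore the newlines when loading from the flat file into the target."""
    return value.replace(PLACEHOLDER, "\n")

original = "hi this\nis rakesh"
staged = encode_for_flat_file(original)   # safe to store on a single line
restored = decode_for_target(staged)      # identical to the original value
```

The staged value contains no line terminator, so each record stays on one physical line in the intermediate file.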

Posted: Thu Nov 17, 2011 7:26 am
by chulett
Try sliding the Columns tab over to the right to expose the "Contains Terminators" property. Enable it for that field.

Posted: Thu Nov 17, 2011 2:36 pm
by ray.wurlod
To clarify Craig's answer, there are more columns in the Columns grid. Scroll the Columns grid to the right to find the "Contains terminators" property.

Note that, for this to work properly, character strings containing terminators must be quoted.
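The quoting requirement is not specific to DataStage. As a rough illustration, Python's standard `csv` reader behaves the same way: an embedded CR/LF is kept as data only when the field is quoted. The sample records below are invented for the example.

```python
import csv
import io

def read_rows(text: str) -> list:
    # newline="" stops StringIO from translating terminators, so the csv
    # reader sees the raw CR/LF and can keep it inside a quoted field.
    return list(csv.reader(io.StringIO(text, newline="")))

# Hypothetical data: the same record, once quoted and once unquoted.
quoted = 'id,comment\r\n1,"hi this\r\nis rakesh"\r\n'
unquoted = 'id,comment\r\n1,hi this\r\nis rakesh\r\n'

quoted_rows = read_rows(quoted)      # header plus ONE data record
unquoted_rows = read_rows(unquoted)  # header plus TWO broken records
```

With quotes the second field comes back whole, terminator included; without them the parser has no way to tell the embedded terminator from the end of the record.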

Posted: Fri Nov 18, 2011 5:32 am
by kirankumarreddydesireddy
Thanks Chulett and Ray.

If I set "Contains Terminators" to Yes, I can read the value of the field up to the newline character. However, the value after the newline character is ignored.

Is there any way to replace that newline character with a space so that I can read the value of the entire field?


Example: If the field contains "hi this
is rakesh"
and I set "Contains Terminators" to Yes, the field is read as "hi this" and the remaining text "is rakesh" is ignored.


Thanks
Kiran

Posted: Fri Nov 18, 2011 8:29 am
by chulett
Is this field the last one in your record? I'm not sure what "ignored" might mean; it really should be there as data, but your tool of choice may not be showing it. Try this in a transformer after the read and see if the entire value appears:

Code:

Convert(CHAR(13):CHAR(10),"",YourField)
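
Because the replacement list in that expression is empty, every CR and LF is deleted from the field rather than swapped for something else. A rough Python equivalent for checking results outside DataStage, with a space-substituting variant alongside it (both function names are made up for this sketch):

```python
CR, LF = chr(13), chr(10)  # same code points as CHAR(13) and CHAR(10)

def strip_terminators(value: str) -> str:
    """Delete every CR and LF, like Convert with an empty replacement list."""
    return value.translate(str.maketrans("", "", CR + LF))

def terminators_to_space(value: str) -> str:
    """Replace each CR/LF pair, or a lone CR or LF, with a single space."""
    return value.replace(CR + LF, " ").replace(CR, " ").replace(LF, " ")

sample = "hi this" + CR + LF + "is rakesh"
print(repr(strip_terminators(sample)))     # 'hi thisis rakesh'
print(repr(terminators_to_space(sample)))  # 'hi this is rakesh'
```

Note that plain deletion glues the words on either side of the terminator together, which is why a space may be the better substitute when the text is meant to be read.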

Posted: Fri Nov 18, 2011 3:58 pm
by arunkumarmm
kirankumarreddydesireddy wrote:If I enable the "Contains Terminators" to yes,I am able to read the value of this field as "hi this",the remaining text "is rakesh" is ignored.
I believe that might be the problem with your viewer. Try double clicking the column value and check if you see more values.

Posted: Wed Dec 28, 2011 8:40 am
by kirankumarreddydesireddy
Hi Chulett,

Sorry for the late update on this. It actually worked for us, and I thought I would post an update so that it is useful for others.

Setting "Contains Terminators" to Yes and applying Convert(CHAR(13):CHAR(10),"",YourField) in the transformer handled the newline character.

However, we found one drawback: we were not able to handle the newline character if the field is the last one in the record. Is there any way to handle this as well?


Thanks
Kiran

Posted: Wed Dec 28, 2011 9:50 am
by pandeesh
Hi,

Can you please elaborate?
As you said, the newline character is in the last field. Then what is the record delimiter?
Can you please post your actual input data and state your required output?

Thanks

Posted: Wed Dec 28, 2011 1:52 pm
by dsusersaj
The newline at the end of the line can be handled by the 'Final Delimiter' property in the Format tab. Did you try that?

Posted: Wed Dec 28, 2011 2:57 pm
by ray.wurlod
There is no Final Delimiter property in server jobs.
dsusersaj wrote:The new line at the end of the line can be handled by the property 'Final Delimiter' in the format tab. Did you try that?

Posted: Thu Dec 29, 2011 12:28 am
by chandra.shekhar@tcs.com
@Craig
The ASCII value of the newline character is 10, but you have mentioned 13 as well. When I checked, I found that "char(13)" is also described as a newline character.
Can you tell me how the newline character can have two different ASCII values?

Posted: Thu Dec 29, 2011 12:52 am
by pandeesh
Carriage return is char(13) and line feed is char(10).

Posted: Thu Dec 29, 2011 2:07 am
by chandra.shekhar@tcs.com
Hi Pandeesh,
Can you tell me the difference between the two?
I searched in wiki, but couldn't understand. :roll:

Posted: Thu Dec 29, 2011 8:00 am
by chulett
If you've read the wiki, you should have found that they relate back to two operations performed by the original manual typewriter - a carriage return and a line feed. The former moved the print position back to the beginning of the 'line' and the latter moved the print line down one. We tend to call that combination a 'newline' nowadays and it is used to mark the end of a record in sequential files. In the electronic world a CR has a decimal value of 13 while a LF has a decimal value of 10... or 0x0D v. 0x0A in hex.

The biggest thing to understand is the difference between a newline on Windows versus UNIX, which is usually what trips people up when transferring files back and forth. UNIX uses a single character (LF) while Windows (DOS) uses two characters: a CR/LF pair.

Typical transfer mechanisms know how to convert one form to the other so you don't get any 'extra' characters, but there are ways to transfer files where that doesn't happen. What you then end up with (as one example) are DOS files on a UNIX system where the CR shows up as data. You'll also see it as a "control M" or "^M" depending on your viewer.
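
If you want to check which style a file actually uses, looking at the raw bytes settles it. A small Python sketch, offered as one way to do it rather than anything DataStage-specific; the function names and the sample bytes are invented for the example.

```python
def describe_line_endings(data: bytes) -> str:
    """Report which terminator style appears in the raw bytes."""
    if b"\r\n" in data:
        return "CR/LF (Windows/DOS)"
    if b"\n" in data:
        return "LF (UNIX)"
    if b"\r" in data:
        return "CR only"
    return "no line terminator"

def dos_to_unix(data: bytes) -> bytes:
    """Turn each CR/LF pair into a single LF, as a converting transfer would."""
    return data.replace(b"\r\n", b"\n")

# Hypothetical DOS-style content that landed on a UNIX box unconverted.
dos_bytes = b"rec1\r\nrec2\r\n"
print(describe_line_endings(dos_bytes))               # CR/LF (Windows/DOS)
print(describe_line_endings(dos_to_unix(dos_bytes)))  # LF (UNIX)
```

The stray CR in the unconverted file is the byte 0x0D, which is exactly what a viewer renders as "^M".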

Hopefully that makes some sense and helps explain what you are seeing.

Posted: Fri Dec 30, 2011 1:33 am
by chandra.shekhar@tcs.com
Thanks Craig (as usual :D )