How to handle a newline character when reading from a sequential file

kirankumarreddydesireddy · Thu Nov 17, 2011 2:30 am

Hi

We need to handle a newline character inside one of the fields of a text file while reading it with the Sequential File stage in Server jobs (version 7.5) on Windows.

Can anyone let me know whether this can be handled within DataStage jobs, or do I need to use a batch script to handle it?

Appreciate your response.

Note:
We have already handled this newline character in a field when both the source and target are databases. While reading from the database, we used a replace function to substitute a dummy value for the newline and loaded the result into a sequential file; when loading into the target, we replaced the dummy value from the sequential file with the newline character again.
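The dummy-value round trip described in the note can be sketched outside DataStage as follows. This is only an illustration in Python, not the poster's actual job logic: the token `<NL>` and both function names are hypothetical, and the token must be something that can never occur in the real data.

```python
# Hypothetical placeholder token; choose one that cannot appear in real data.
PLACEHOLDER = "<NL>"


def encode_newlines(value: str) -> str:
    """Replace embedded line breaks with the placeholder before writing the flat file."""
    # Handle the Windows CR/LF pair first so it becomes one placeholder, not two.
    return value.replace("\r\n", PLACEHOLDER).replace("\n", PLACEHOLDER)


def decode_newlines(value: str, newline: str = "\n") -> str:
    """Restore the line breaks when loading the target."""
    return value.replace(PLACEHOLDER, newline)


original = "hi this\nis rakesh"
stored = encode_newlines(original)    # single-line value, safe in a flat file
restored = decode_newlines(stored)    # line break restored for the target
```

The stored value stays on one physical line, so the record delimiter in the intermediate file is never ambiguous.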

chulett · Thu Nov 17, 2011 7:26 am

Try sliding the Columns tab over to the right to expose the "Contains Terminators" property. Enable it for that field.

ray.wurlod · Thu Nov 17, 2011 2:36 pm

To clarify Craig's answer, there are more columns in the Columns grid. Scroll the Columns grid to the right to find the "Contains terminators" property.

Note that, for this to work properly, character strings containing terminators must be quoted.
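Why quoting matters can be seen with any reader that understands quoted fields. The sketch below uses Python's standard `csv` module rather than DataStage, and the sample data is made up for illustration: the quoted field keeps its embedded CR/LF, and the reader still returns one record for it.

```python
import csv
import io

# Made-up sample: the second record has a quoted field containing a CR/LF pair.
raw = 'id,comment,status\r\n1,"hi this\r\nis rakesh",ok\r\n2,"single line",ok\r\n'

# newline="" stops Python from translating line endings, so the csv module
# sees the terminators exactly as they are in the data.
rows = list(csv.reader(io.StringIO(raw, newline="")))

for row in rows:
    print(row)
```

Without the quotes, the reader would have no way to tell the embedded terminator from a real end of record.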

kirankumarreddydesireddy · Fri Nov 18, 2011 5:32 am

Thanks Chulett and Ray.

If I set "Contains Terminators" to Yes, I am able to read the value of the field before the newline character. However, the value after the newline character is ignored.

Is there any way to replace that newline character with a space so that I can read the value of the entire field?

Example: the field contains "hi this
is rakesh"
If I set "Contains Terminators" to Yes, I am able to read the value of this field as "hi this"; the remaining text "is rakesh" is ignored.

Thanks
Kiran

chulett · Fri Nov 18, 2011 8:29 am

Is this field the last one in your record? I'm not sure what "ignored" means here; the rest of the value really should be there as data, but your tool of choice may not be showing it. Try this in a transformer after the read and see if the entire value appears:

Convert(CHAR(13):CHAR(10),"",YourField)
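For readers who want to see what that expression does to the data: with an empty replacement list, Convert deletes every CR and LF from the field. The Python sketch below mirrors that behaviour and also shows a variant that substitutes a space, which is what was asked for earlier in the thread. Both function names are hypothetical and this is not DataStage code.

```python
def strip_line_breaks(value: str) -> str:
    """Delete every CR (13) and LF (10), like Convert with an empty replacement list."""
    return value.translate({13: None, 10: None})


def line_breaks_to_space(value: str) -> str:
    """Replace each line break with a single space instead of deleting it."""
    # Treat the CR/LF pair as one break so it yields one space, not two.
    return value.replace("\r\n", " ").replace("\r", " ").replace("\n", " ")


field = "hi this\r\nis rakesh"
removed = strip_line_breaks(field)      # words on either side run together
spaced = line_breaks_to_space(field)    # words stay separated
```

Deleting the characters joins the words on either side of the break, so the space variant is usually the better fit when the text must stay readable.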

arunkumarmm · Fri Nov 18, 2011 3:58 pm

kirankumarreddydesireddy wrote: If I enable the "Contains Terminators" to yes, I am able to read the value of this field as "hi this"; the remaining text "is rakesh" is ignored.

I believe that might be a problem with your viewer. Try double-clicking the column value and check whether you see the rest of the value.

kirankumarreddydesireddy · Wed Dec 28, 2011 8:40 am

Hi Chulett,

Sorry for the late update on this. It actually worked for us, and I thought I would post an update so that it will be useful for others.

After setting "Contains Terminators" to Yes and applying Convert(CHAR(13):CHAR(10),"",YourField) in the transformer, the newline character was handled.

But we found one drawback: we were not able to handle the newline character if the field is the last one in the record. Is there any way to handle this as well?

Thanks
Kiran

pandeesh · Wed Dec 28, 2011 9:50 am

Hi,

Can you please elaborate?
As you said, the newline character is in the last field. Then what about the record delimiter?
Can you please post your actual input data and describe your required output?

Thanks

dsusersaj · Wed Dec 28, 2011 1:52 pm

The newline at the end of the line can be handled by the 'Final Delimiter' property on the Format tab. Did you try that?

ray.wurlod · Wed Dec 28, 2011 2:57 pm

dsusersaj wrote: The new line at the end of the line can be handled by the property 'Final Delimiter' in the format tab. Did you try that?

There is no Final Delimiter property in server jobs.

chandra.shekhar@tcs.com · Thu Dec 29, 2011 12:28 am

@Craig
The ASCII value of the newline character is 10, but you have mentioned 13 as well. When I checked, I found that "char(13)" is also described as a newline character.
Can you tell me how the newline character can have two different ASCII values?

pandeesh · Thu Dec 29, 2011 12:52 am

Carriage return is char(13) and line feed is char(10).

chandra.shekhar@tcs.com · Thu Dec 29, 2011 2:07 am

Hi Pandeesh,
Can you tell me the difference between the two?
I searched the wiki, but couldn't understand it.

chulett · Thu Dec 29, 2011 8:00 am

If you've read the wiki, you should have found that they relate back to two operations performed by the original manual typewriter - a carriage return and a line feed. The former moved the print position back to the beginning of the 'line' and the latter moved the print line down one. We tend to call that combination a 'newline' nowadays and it is used to mark the end of a record in sequential files. In the electronic world a CR has a decimal value of 13 while a LF has a decimal value of 10... or "0D" vs. "0A" in hex.

The biggest thing to understand is the difference between a newline on Windows versus UNIX, which is usually what trips people up when transferring files back and forth. UNIX uses a single character (LF) while Windows (DOS) uses two characters: a CR/LF pair.

Typical transfer mechanisms know how to convert one form to the other so you don't get any 'extra' characters, but there are ways to transfer files where that doesn't happen. What you then end up with (as one example) are DOS files on a UNIX system where the CR shows up as data. You'll also see it as a "control M" or "^M" depending on your viewer.
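The stray CR described above is easy to check for at the byte level. The following Python sketch is only an illustration with made-up sample bytes and hypothetical function names: it counts each kind of line ending and shows the CR left behind as data when a DOS file is split on LF alone.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR/LF pairs, bare LFs, and bare CRs in a block of bytes."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "lf": data.count(b"\n") - crlf,   # LFs that are not part of a pair
        "cr": data.count(b"\r") - crlf,   # CRs that are not part of a pair
    }


def dos_to_unix(data: bytes) -> bytes:
    """Convert Windows (CR/LF) line endings to UNIX (LF) line endings."""
    return data.replace(b"\r\n", b"\n")


dos_file = b"row1\r\nrow2\r\n"              # made-up DOS-style content
as_seen_on_unix = dos_file.split(b"\n")    # LF-only split leaves the CR as data
counts = count_line_endings(dos_file)
converted = dos_to_unix(dos_file)
```

Each record in `as_seen_on_unix` ends with byte 13, which is the "^M" a UNIX viewer would display.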

Hopefully that makes some sense and helps explain what you are seeing.

chandra.shekhar@tcs.com · Fri Dec 30, 2011 1:33 am

Thanks Craig (as usual :D)