how to handle newline character reading from sequential file

kirankumarreddydesireddy
Participant
Posts: 110
Joined: Mon Jan 11, 2010 4:22 am

Post by kirankumarreddydesireddy »

Hi

We need to handle a newline character inside one of the fields of a text file that we read with the Sequential File stage in Server 7.5 jobs on Windows.

Can anyone tell me whether this can be handled in DataStage jobs, or do I need to use a batch script?

I appreciate your response.

Note:
We have already handled newline characters in a field when the source and target are both databases. While reading from the database we used a replace function to swap the newline for a dummy value and loaded the result into a sequential file; when loading into the target, we replaced the dummy value from the sequential file with the newline character again.
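
The round trip described in the note can be sketched outside DataStage as follows. This is only an illustration in Python, not DataStage code; the placeholder token `<NL>` and the function names are assumptions made up for the example, and the token must be something that never occurs in the real data.

```python
# Hypothetical placeholder; it must never occur in the real data.
PLACEHOLDER = "<NL>"


def encode_newlines(value: str) -> str:
    """Swap embedded line breaks for the placeholder before writing the flat file."""
    # Normalise Windows (CR/LF) and bare CR breaks to LF first, then substitute.
    normalised = value.replace("\r\n", "\n").replace("\r", "\n")
    return normalised.replace("\n", PLACEHOLDER)


def decode_newlines(value: str, newline: str = "\r\n") -> str:
    """Restore the line breaks when loading the target."""
    return value.replace(PLACEHOLDER, newline)


original = "hi this\r\nis rakesh"
flat = encode_newlines(original)   # now safe to store on a single line
restored = decode_newlines(flat)   # line break is back for the target load
```

Note that the restore step cannot know which line ending the source used, so the sketch assumes CR/LF by default.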
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Try sliding the Columns tab over to the right to expose the "Contains Terminators" property. Enable it for that field.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

To clarify Craig's answer, there are more columns in the Columns grid. Scroll the Columns grid to the right to find the "Contains terminators" property.

Note that, for this to work properly, character strings containing terminators must be quoted.
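
Why the quotes matter can be illustrated outside DataStage. The sketch below uses Python's standard `csv` module, not the Sequential File stage, and the sample data is made up. It shows only the general rule that a parser can tell an embedded line break from an end-of-record break when the field is quoted.

```python
import csv
import io

# Two data records; the comment of the first one holds an embedded CR/LF.
quoted = 'id,comment\r\n1,"hi this\r\nis rakesh"\r\n2,"plain"\r\n'

# newline="" hands the line breaks to the csv module untouched.
rows = list(csv.reader(io.StringIO(quoted, newline="")))

# Without quotes the same break is indistinguishable from a record terminator,
# so the first data record is split in two.
unquoted = 'id,comment\r\n1,hi this\r\nis rakesh\r\n2,plain\r\n'
broken_rows = list(csv.reader(io.StringIO(unquoted, newline="")))
```

With the quoted input the parser returns a header plus two records; with the unquoted input it returns a header plus three.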
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kirankumarreddydesireddy
Participant
Posts: 110
Joined: Mon Jan 11, 2010 4:22 am

Post by kirankumarreddydesireddy »

Thanks Chulett and Ray.

If I set "Contains Terminators" to Yes, I am able to read the value of the field before the newline character; however, the value after the newline character is ignored.

Is there any way to replace that newline character with a space so that I can read the value of the entire field?


Example: The field contains "hi this
is rakesh"
With "Contains Terminators" set to Yes, I am able to read the value of this field as "hi this"; the remaining text "is rakesh" is ignored.


Thanks
Kiran
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Is this field the last one in your record? I'm not sure what "ignored" might mean; it really should be there as data, but your tool of choice may not be showing it. Try this in a transformer after the read and see if the entire value appears:

Convert(CHAR(13):CHAR(10),"",YourField)
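
For readers who do not have a job handy, here is a rough Python model of what that expression does. It is an approximation written for illustration, not DataStage code: `convert` is a hypothetical helper that mimics the BASIC Convert function, where each character in the first list is replaced by the character at the same position in the second list and is deleted when the second list is too short to supply one.

```python
def convert(from_chars: str, to_chars: str, text: str) -> str:
    """Approximate DS BASIC Convert(): map characters one-for-one by position.

    Characters in from_chars with no counterpart in to_chars are deleted.
    """
    table = {}
    for i, ch in enumerate(from_chars):
        if ord(ch) in table:
            continue  # assumption: the first occurrence of a character wins
        table[ord(ch)] = to_chars[i] if i < len(to_chars) else None
    return text.translate(table)


CR, LF = chr(13), chr(10)
field = "hi this" + CR + LF + "is rakesh"

stripped = convert(CR + LF, "", field)   # both characters deleted
spaced = convert(CR + LF, " ", field)    # CR becomes a space, LF is deleted
```

The second call models the variant asked about earlier in the thread: with a one-character replacement list, a CR/LF pair collapses to a single space instead of disappearing.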
arunkumarmm
Participant
Posts: 246
Joined: Mon Jun 30, 2008 3:22 am
Location: New York

Post by arunkumarmm »

kirankumarreddydesireddy wrote: If I enable the "Contains Terminators" to yes, I am able to read the value of this field as "hi this", the remaining text "is rakesh" is ignored.
I believe that might be a problem with your viewer. Try double-clicking the column value and check whether you can see the rest of the value.
Arun
kirankumarreddydesireddy
Participant
Posts: 110
Joined: Mon Jan 11, 2010 4:22 am

Post by kirankumarreddydesireddy »

Hi Chulett,

Sorry for the late update on this. It actually worked for us, and I thought I would post an update so that it will be useful for others.

Setting "Contains Terminators" to Yes and applying Convert(CHAR(13):CHAR(10),"",YourField) in the transformer handled the newline character.

But we found one drawback: we were not able to handle the newline character if the field is the last one in the record. Is there any way to handle this as well?


Thanks
Kiran
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Hi,

Can you please elaborate?
As you said, the newline character is in the last field. Then what about the record delimiter?
Can you please post your actual input data and describe your required output?

Thanks
pandeeswaran
dsusersaj
Premium Member
Posts: 160
Joined: Mon Dec 17, 2007 10:44 am

Post by dsusersaj »

The newline at the end of the line can be handled by the 'Final Delimiter' property on the Format tab. Did you try that?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There is no Final Delimiter property in a server job.
dsusersaj wrote: The new line at the end of the line can be handled by the property 'Final Delimiter' in the format tab. Did you try that?
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

@Craig
The ASCII value of the newline character is 10, but you have also mentioned 13. When I checked, I found that "char(13)" also means a newline character.
Can you tell me how the newline character can have two different ASCII values?
Thanx and Regards,
ETL User
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

They are two different characters: char(13) is the carriage return and char(10) is the line feed.
pandeeswaran
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

Hi Pandeesh,
Can you tell me the difference between the two?
I searched the wiki, but couldn't understand it. :roll:
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If you've read the wiki, you should have found that they relate back to two operations performed by the original manual typewriter: a carriage return and a line feed. The former moved the print position back to the beginning of the 'line' and the latter moved the print line down one. We tend to call that combination a 'newline' nowadays and it is used to mark the end of a record in sequential files. In the electronic world a CR has a decimal value of 13 while a LF has a decimal value of 10, or 0D versus 0A in hex.

The biggest thing to understand is the difference between a newline on Windows versus UNIX, which is usually what trips people up when transferring files back and forth. UNIX uses a single character (LF) while Windows (DOS) uses two characters: a CR/LF pair.

Typical transfer mechanisms know how to convert one form to the other so you don't get any 'extra' characters, but there are ways to transfer files where that doesn't happen. What you then end up with (as one example) are DOS files on a UNIX system where the CR shows up as data. You'll also see it as a "control M" or "^M" depending on your viewer.

Hopefully that makes some sense and helps explain what you are seeing.
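
A few lines of Python make the same points concrete. This is a sketch for illustration only, and the sample records are made up.

```python
CR, LF = "\r", "\n"

# Decimal and hexadecimal code points of the two control characters.
codes = {name: (ord(ch), hex(ord(ch))) for name, ch in (("CR", CR), ("LF", LF))}

# The same two records written with Windows and with UNIX line endings.
dos_file = b"first\r\nsecond\r\n"
unix_file = b"first\nsecond\n"

# Splitting a DOS file on LF alone, as a UNIX tool would, leaves the CR
# attached to each record as data (often displayed as ^M).
as_seen_on_unix = dos_file.split(b"\n")[:-1]

# Stripping the trailing CR recovers the intended records.
cleaned = [line.rstrip(b"\r") for line in as_seen_on_unix]
```

After the split, each record in `as_seen_on_unix` still ends in a CR byte, which is the stray character that shows up as data when a DOS file lands on UNIX unconverted.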
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

Thanks, Craig (as usual :D)