CONVERT Function not performing as expected...
Moderators: chulett, rschirm, roy
CONVERT Function not performing as expected...
Hi All,
I have a simple test job set up with:
SF ===> TFM ===> SF
The Sequential Files have a single column (BigCol) of VarChar(25).
The input file is:
Line One
Line Two
Line Three
And the Hex representation is:
In the TransForMer I have:
Convert(CHAR(10),"",DSLink3.BigCol)
When I run the job, I was expecting that the 0A line feed characters would be replaced with blanks. Alas, they are still there in the output.
Any assistance much appreciated.
Thanks.
S
I have a simple test job set up with:
SF ===> TFM ===> SF
The Sequential Files have a single column (BigCol) of VarChar(25).
The input file is:
Line One
Line Two
Line Three
And the Hex representation is:
In the TransForMer I have:
Convert(CHAR(10),"",DSLink3.BigCol)
When I run the job, I was expecting that the 0A line feed characters would be replaced with blanks. Alas, they are still there in the output.
Any assistance much appreciated.
Thanks.
S
The 0x0a characters are the line separators used by your files, so they wouldn't show up in the data.
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
Thanks.
Is there an easy way of removing these?
The actual problem I have is that I am trying to parse an XML file. One of the elements contains lots of text that is formatted on the file as:
When I run this through the XMLInput stage, it parses ok, but the output (sequential) file shows the rows:
What I want is:
So... my idea was to use convert to remove the control characters from the input XML...
Thanks again.
Edit to add:
When I run the convert on the small test file, but use CHAR(13) rather than CHAR(10), it does remove the 0D CR characters..
Is there an easy way of removing these?
The actual problem I have is that I am trying to parse an XML file. One of the elements contains lots of text that is formatted on the file as:
Code: Select all
<textElement>
Lots of text starts here and goes on and on.
Then a new bit starts here, causing problems. The lines have
0A control characters for line feed.
</textElement>
Code: Select all
Row1:Lots of text starts here and goes on and on.
Row2:
Row3:Then a new bit starts here, causing problems. The lines have
Row4:0A control characters for line feed.
Code: Select all
Row1: Lots of text starts here and goes on and on. Then a new bit starts here, causing problems. The lines have 0A control characters for line feed.
Thanks again.
Edit to add:
When I run the convert on the small test file, but use CHAR(13) rather than CHAR(10), it does remove the 0D CR characters..
Ok, that can be done. Keep your input file definitions as they are. If I recall correctly, 0x0a is "whitespace" in XML, so you could add a transform stage:
this would only output a new data line when a ">" character is found, which usually would mean the end of an end tag. This simple logic would remove most carriage returns.
Basically you have a problem with VarChar() lines. If you were to declare a fixed line length of 100 and no terminators, the job would read 100 characters including the 0x0a and you could use your convert to remove them. The last line in the file would cause a warning, since it will probably not be 100 characters exactly. The question arises as to whether you need any 0x0a characters at all, if you declare your data string big enough, you could read the whole XML file as one string and then proceed to replace the linefeeds.
Code: Select all
SvarNewLine = IF INDEX(In.BigColumn,'>',1) > 0 THEN 1 ELSE 0
SvarOutString = IF SvarNewLine=1 THEN SvarDataString:In.BigColumn ELSE ''
SvarDataString = IF SvarNewLine=1 THEN '' ELSE SvarDataString:In.BigColumn
Constraint: SvarNewLine=1
Basically you have a problem with VarChar() lines. If you were to declare a fixed line length of 100 and no terminators, the job would read 100 characters including the 0x0a and you could use your convert to remove them. The last line in the file would cause a warning, since it will probably not be 100 characters exactly. The question arises as to whether you need any 0x0a characters at all, if you declare your data string big enough, you could read the whole XML file as one string and then proceed to replace the linefeeds.
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Thanks Ray.ray.wurlod wrote:Change the format to "no line terminators" then your Convert() function will work. Of course you will then only have one line of data.
I have changed the Final Delimiter on the input file to 'none', but this made no difference. Is this what you meant by "no line terminators", or have I misunderstood?
Cheers,
S
If you do as Ray suggests you will need to increase BigCol to hold more than 25 charactes.
The "Final Delimiter" is not the same as the "Line delimiter"
The "Final Delimiter" is not the same as the "Line delimiter"
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
I tried this by setting Record Delimiter to 'null' - this produced one row of output in DataStage (appears as one row, seperated with square box (representation of the control chars)), but when I view the file in a HEX editor, it still has the 0D0A characters.ArndW wrote:If you do as Ray suggests you will need to increase BigCol to hold more than 25 charactes.
The "Final Delimiter" is not the same as the "Line delimiter"
I am confused though as to why using CHAR(13) it clears the CR, but using CHAR(10) missed the LF. Any ideas?
Thanks again.
Just delete the "Record Delimiter" completely. You are seeing the difference between a UNIX and DOS line termination, <CR>LF> versus <LF>.
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>