Supress extra Carriage returns

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Supress extra Carriage returns

Post by kris007 »

Hi All,

I am extracting data from SQL server into a sequential file and then loading that into an Oracle table. My problem is, for some of the rows, the data contains an extra carriage return. Can anyone guide me on how to supress the extra carriage return and also how to supress an extra pair of CR-LF from the data. I tried to use EREPLACE and replaced the character with space and then used TRIM function. But, that doesn't solve my purpose. Any thoughts?? :?
Kris

Where's the "Any" key?-Homer Simpson
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

I was able to resolve the Issue with the following function

Code: Select all

Trim(Ereplace(input.Link, Char(13): Char(10) ,"",-1,0))
I didn't include char(10) earlier and hence the problem.
Kris

Where's the "Any" key?-Homer Simpson
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You could also handle it via the 'Contains Terminators' column in the Sequential File stage if you wanted to keep them in the data.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

Craig,

I have already set that Option to Yes. Still, an extra CR is found in my data and that was what I was worried about. I had to use the above code mentioned to get rid of them.
Kris

Where's the "Any" key?-Homer Simpson
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ah... an extra CR, gotcha. Reading comprehension can be so elusive some days. :wink:

On the UNIX side, it's those extra LFs that can be pesky.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

I am back to where I was :(

I need to supress or trim only the extra carriage returns.But, what seems to happen is all the terminators within the data are being converted into a space and hence I have data which is supposed to be in three lines smushed into a single line. It would be great if anyone could throw some light on how to trim only the extra CR-LF at the end of the sentence or data and retain the one's which are in between the data.
Kris

Where's the "Any" key?-Homer Simpson
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

You could always look at the last two characters and remove them if they are <CR><LF>. You could write a DS function that repeatedly strips off the last two characters of a string until the last two characters are not <CR><LF>. Something like (off the top of my head, please test):

Code: Select all

Text=Arg1
Done = @FALSE
Loop UNTIL Done Do
   If RIGHT(Text,2)=CHAR(13):CHAR(10) Then Text=TRIM(Text[1,LEN(Text)-2]) Else Done = @TRUE
Repeat
Ans = Text
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
bchau
Charter Member
Charter Member
Posts: 46
Joined: Tue Jan 18, 2005 7:39 am
Location: Japan
Contact:

Post by bchau »

I'm not sure what your data volume is, but if there aren't many records, you could read the data as one big column. In our Sequential File Stage under Stage tab. There is a Line Termination selection. Select "None" and read all lines into 1 column.

In your transformer you can use the convert function as follows

Convert(char(13):char(10):char(13):char(10),char(13):char(10),input column)
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Code: Select all

Ereplace(InLink.TheString,svCRLF,svLF,1,Count(InLink.TheString,svCRLF))
where stage variable svLF is initialized to Char(10) and stage variable svCRLF is initialized to Char(13):Char(10).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

Ray,

I get the following error while running the Job after I used the derivation

Code: Select all

DataStage Job 185 Phantom 3412
Program "JOB.47473586.DT.1398128902.TRANS1": Line 362, Improper data type.
Attempting to Cleanup after ABORT raised in stage 
The lines around line number 362 in the RT_BP185 folder go like this

Code: Select all

V0S1.Stagevar%%22 = Char(13) : Char(10)
	V0S1.Stagevar%%23 = Char(10)
IF STAGECOM.TRACE.STATS THEN CALL $PERF.END(-2)

	STAGECOM.PINNO = 2
	    IF STAGECOM.TRACE.STATS THEN CALL $PERF.BEGIN(-3)
	    IF V0S1.Stagevar%%21 = 0 THEN
		* Column derivation code for pin out_address
		Pin%%V0S1P2.Column%%5 = (Ereplace(Pin%%V0S1P1.Column%%5, V0S1.Stagevar%%22, V0S1.Stagevar%%23, 1, Count(Pin%%V0S1P1.Column%%5, V0S1.Stagevar%%22)))
		Pin%%V0S1P2.Column%%9 = (Trim(Pin%%V0S1P1.Column%%9))
		Pin%%V0S1P2.Column%%11 = (Trim(Ereplace(Pin%%V0S1P1.Column%%11, Char(13) : Char(10), "", -1, 0)))
		Pin%%V0S1P2.Column%%17 = (Field(Pin%%V0S1P1.Column%%17, ".", 1, 1))
		Pin%%V0S1P2.Column%%19 = (Field(Pin%%V0S1P1.Column%%19, ".", 1, 1))
		Pin%%V0S1P2.Column%%23 = (Field(OCONV(DATE(), "D4-YMD[4,2,2]") : " " : OCONV(TIME(), "MTS:") : ".000", ".", 1, 1))
		Pin%%V0S1P2.Column%%24 = ('ADS')
		Pin%%V0S1P2.REJECTED = @FALSE
	    IF STAGECOM.TRACE.STATS THEN CALL $PERF.END(-3)


The field I am using the derivation is an Address field and of datatype Varchar. I don't get it why the Job is aborting. Any insight would be helpful.
Kris

Where's the "Any" key?-Homer Simpson
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Kris,

I can't see which line in the code you posted might be causing the invalid datatype error. The line number that the compiler thinks is at fault is often not correct, this is due to a limitation in the compiler (this problem has been around since over 20 years!).

If you've done this much work already and seeing the generated BASIC code didn't frighten you off I'll suggest one step further.

go into TCL in your project and the your Job's JobNo if you don't know if already. You can do this with a LIST DS_JOBS {JobName} JOBNO and then issue the command VLIST RT_BP{JobNo} JOB.47473586.DT.1398128902.TRANS1
This will invoke listing your job with the opcodes as well as the line numbers that the compiler is using. Page through until you get to your error line 362 and see if you can spot the potential problem there, or post that line.
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

ArndW

Looks like the Count function in the Derivation

Code: Select all

Ereplace(InLink.TheString,svCRLF,svLF,1,Count(InLink.TheString,svCRLF))
is causing the problem.

Here are the lines when I run the statement from TCL in my Project

Code: Select all


00362:  Pin%%V0S2P2.Column%5 = ( Ereplace(Pin%%V0S2P1.Column%%5, V0S2.Stagevar%%1, V0S2.Stagevar%2, 1, Count(Pin%%V0S2P1.Column%%5, V0S2.Stagevar%%1)))

00362 00234 : 0E6 matrix    STAGECOM.ARR  7 1  =>$R7
00362 0023E : 0E6 matrix    STAGECOM.ARR  1 1  =>$R8
00362 00248 : 0E6 matrix    STAGECOM.ARR  2 1  =>$R9
00362 00252 : 0E6 matrix    STAGECOM.ARR  7 1  =>$R10
00362 0025C : 0E6 matrix    STAGECOM.ARR  1 1  =>$R11
00362 00266 : 040 count    $R10 $R11  =>$R12
00362 0026E : 24A ereplace    $R7 $R8 $R9 $R12  =>$R13
00362 0027C : 0FC move_matrix    $R13  27 1  =>STAGECOM.ARR
I am not sure what is causing the problem here. Any insights would be helpful.

Thanks
Kris

Where's the "Any" key?-Homer Simpson
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

There is no way you can say that the Count() function is causing the problem.
Element 1 of STAGECOM.ARR contains InLink.TheString (by now a variable named Pin%%V0S1P1.Column%%5); this is moved into register #11 (named $R11) for the Count() function. The value of the stage variable svCRLF (by now a variable named V0S1.Stagevar%%22) is mapped into the second element of STAGECOM.ARR and is moved into register #12 (named $R12) for the Count() function. $R11 and $R12 are used as the arguments to the Count() function, the result of which is loaded into register #13 (named $R13) which is subsequently used in the Ereplace() function.

To answer the original question, if the additional CRLF is at the end of the data then you can simply return a shorter string:

Code: Select all

Left(InLink.TheString, Len(InLink.TheString-2))
This is unrelated to "invalid data type"; this usually arises from invalid use of file or subroutine variables. You would need to post the entire VLIST report for us to comment further.

BTW the expression Field(OCONV(DATE(), "D4-YMD[4,2,2]") : " " : OCONV(TIME(), "MTS:") : ".000", ".", 1, 1) is doing too much work. There is no "." in the first argument of the Field() function. Oconv(Date(),"D-YMD[4,2,2]") : " " : Oconv(Time(), "MTS:") would suffice. If you require the date and time to be unchanging, prefer @DATE to Date() and @TIME to Time() and, for maximum efficiency, calculate it once only, in initializing a stage variable, perhaps called svNowAsTimestamp.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

If anyone cares, the original problem was MULTIPLE CR/LF at the END of the string, so you need to remove as many as you can from the end, preserving the rest of the formatting. My original answer still looks to be the only one that solves that issue.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

Yes Ken. I am also of the same opinion. That looks like the solution as of now.

Even Ray's solution looked good ( in fact perfect) to me until I ran the job. but I am not sure why it isn't working. I don't have access to Datastage now. I shall post the whole VLIST and see if it tells us anything ASAP.

And Ray,

Code: Select all

Left(InLink.TheString, Len(InLink.TheString-2))
doesn't fullfill my requirment as not all the rows contain extra CRLF's. It's only a few of them.
Kris

Where's the "Any" key?-Homer Simpson
Post Reply