Page 1 of 3
String Parsing
Posted: Mon May 02, 2005 2:28 pm
by varshanswamy
hi,
I have an input string of the format
Field1,Field2
--------------------------
1,A=B&C=D&E=F&G=H
2,L=M&G=H
I need to write a job in datastage
to split this string which would be dynamic
and have it of the form.
Field1, Field2
---------------------
1,A=B
1,C=D
1,E=F
1,G=H
2,L=M
2,G=H
regards,
varsha
Posted: Tue May 03, 2005 1:06 am
by ArndW
Varsha,
Use a transform stage with variables:
KeyNumber FIELD(In.String,",",1)
RestOfString EREPLACE(In.String[3,9999],"&",CHAR(13):CHAR(15):KeyNumber:",")
The RestOfString variable now is formatted as you wanted, if you write to a sequential file you will have the format with <cr><lf> separation.
Posted: Tue May 03, 2005 1:19 am
by ArndW
Varsha,
I just saw that this was posted in the Parallel Extender forum, the solution I posted is for a Server job - doing the same logic in Px is somewhat more complicated, mainly because the EREPLACE function doesn't exist.
With a Px job I would use a pivot stage - but I am not on a system to check that right now.
Posted: Tue May 03, 2005 11:12 pm
by sudharsanan
Hi Arndw,
I tried the option that you gave .. i tried it in Datastage 6.0 Server Job...
Used a transform stage with variables:
KeyNumber FIELD(In.String,",",1)
RestOfString EREPLACE(In.String[3,9999],"&",CHAR(13):CHAR(15):KeyNumber:",")
But the result that i got was only 2 rows...
Field1 Field2
1 B
2 M
Not the excepted one.. can you tell me how to solve.. My job design is like this..
I/P(SequentialFile) ---> Xfm ---> O/P(Sequential File)
should i need to use a Pivot stage to convert the row to column...
Please advise me on how to approach..
Thanks
Sudharsanan
String Parsing
Posted: Wed May 04, 2005 12:04 am
by Sunshine2323
Hi Sudharshan,
I have tried Arnd's method in a sample job and it has worked for me.
Have u created the stage variables?
You need to use the value of the KeyNumber stage variable in the RestOfString stage variable
Also after you run the job open the sequential file on the server to see the Output as in my case the the Sequential file stage was not able to read the complete output data as 1 column from the input was converted into 2 columns
![Smile :)](./images/smilies/icon_smile.gif)
Posted: Wed May 04, 2005 12:17 am
by sudharsanan
Hi,
I used two stage variables in my job as told by Arnd...
You are correct in the Sequential File Stage i could see only two values(Don't know why).. But when i saw the actual file the result was like this:
Field1,"Field2"
1,"B
1,C=D
1,E=F
1,G=H"
2,"M
2,G=H"
There spaces in between the records... the starting values for 1 and 2 are missing... That is reason for two records only in the Squential file stage.. should we have change the formatting of the file in the output...
What was the result you got?...was any mistake with my job design.. please advice..
Thanks
Sudharsanan
String Parsing
Posted: Wed May 04, 2005 12:32 am
by Sunshine2323
Hi Sudharshan,
Remove
only char(15) from the derivation for the Stage Variable RestofString also replace the Quote Character to 000 in the Output Format Tab and try and let us know.
You will not be able to see the Output File as the Output file is defined for 1 column and the derivation converts it 2 columns.
If you try reading the output data using a sequential file that has 2 output columns defined you will have no problems.
Do you have any other suggestions?
Posted: Wed May 04, 2005 12:42 am
by ArndW
sudharsanan wrote:Hi Arndw,
I tried the option that you gave .. i tried it in Datastage 6.0 Server Job...
Used a transform stage with variables:
KeyNumber FIELD(In.String,",",1)
RestOfString EREPLACE(In.String[3,9999],"&",CHAR(13):CHAR(15):KeyNumber:",")
...
Thanks
Sudharsanan
I just saw that I used CHAR(13):CHAR(15); that is not correct
![Rolling Eyes :roll:](./images/smilies/icon_rolleyes.gif)
All you need is a <lf> so just use a CHAR(10). Mea culpa.
Posted: Wed May 04, 2005 12:46 am
by sudharsanan
Hi,
Thanks Sunshine for your input ....
I tried the option you told... the result is
Result Got
Field1,"Field2"
Field1,Field2
1,B
1,C=D
1,E=F
1,G=H
2,M
2,G=H
Actual Result
Field1, Field2
---------------------
1,A=B
1,C=D
1,E=F
1,G=H
2,L=M
2,G=H
The First record and Fifth records the first value is missing, what i have to change to get these values.. sorry for disturbing... please advice..
Thanks
Sudharsanan
Posted: Wed May 04, 2005 1:01 am
by ArndW
Sudhar,
please look at the logic of the program to find the error. The transform does nothing but replace the ampersand in the string with the number and line terminators. If you were to replace the CHAR(nn) with a simple "x" your line structure would not be changed and you could see what is happening in your code.
It looks to me like you are having an issue with the quoting and column separators - please remove any quoting and separators from the output and also look at the file from outside DS as well as from the "view-data".
String Parsing
Posted: Wed May 04, 2005 1:01 am
by Sunshine2323
Hi Sudharshan,
Can you paste a copy your derivation for the stage variables.
As I have achieved the desired result with the derivation specified.
Posted: Wed May 04, 2005 1:16 am
by sudharsanan
Hi All,
Thanks for the inputs...but the problem still resides in my code..
Let me put the code..
Stage Variables:
StageVar=FIELD(DSLink3.Field1,",",1)
StageVar1=EREPLACE(DSLink3.Field2[3,9999],"&",CHAR(10):StageVar:",")
In the output I am just using the StageVariables
Field1 = StageVar
Field2 = StageVar1
The Result i got is
Field1,"Field2"
1,"B
1,C=D
1,E=F
1,G=H"
2,"M
2,G=H"
I tried to look at the Data from the VIEW Data option where i am getting only
Field1 Field2
1 B
2 M
I couldn't find the Reason for that...Sorry for disturbing.. please help me..
Thanks
Sudharsanan
Posted: Wed May 04, 2005 1:19 am
by ArndW
Please replace the CHAR(10) with a "*", remove the delimiter and quotes from the output file and post your results.
String Parsing
Posted: Wed May 04, 2005 1:26 am
by Sunshine2323
Sudharshan,
Stage Variables:
StageVar=FIELD(DSLink3.Field1,",",1)
StageVar1=EREPLACE(DSLink3.Field2[3,9999],"&",CHAR(10):StageVar:",")
You have 2 fields in the input
Try reading both the fields in a single column by specifying a delimiter like '|' that is not present in your input data then cahnge the derivation as follows
StageVar=FIELD(
DSLink3.Field1,",",1)
StageVar1=EREPLACE(
DSLink3.Field1,"&",CHAR(10):StageVar:",")
Then let us know your result
Posted: Wed May 04, 2005 1:27 am
by sudharsanan
Hi Arndw,
Which value i should give for the delimiter and quotes ?..As i cannot leave that field blank...
Thanks
Sudharsanan