String Parsing

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

varshanswamy
Participant
Posts: 48
Joined: Thu Mar 11, 2004 10:32 pm

String Parsing

Post by varshanswamy »

hi,

I have an input string of the format
Field1,Field2
--------------------------
1,A=B&C=D&E=F&G=H
2,L=M&G=H

I need to write a job in datastage
to split this string which would be dynamic
and have it of the form.

Field1, Field2
---------------------
1,A=B
1,C=D
1,E=F
1,G=H
2,L=M
2,G=H


regards,
varsha
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Varsha,

Use a transform stage with variables:

KeyNumber FIELD(In.String,",",1)
RestOfString EREPLACE(In.String[3,9999],"&",CHAR(13):CHAR(15):KeyNumber:",")

The RestOfString variable now is formatted as you wanted, if you write to a sequential file you will have the format with <cr><lf> separation.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Varsha,

I just saw that this was posted in the Parallel Extender forum, the solution I posted is for a Server job - doing the same logic in Px is somewhat more complicated, mainly because the EREPLACE function doesn't exist.

With a Px job I would use a pivot stage - but I am not on a system to check that right now.
sudharsanan
Participant
Posts: 64
Joined: Fri Jul 16, 2004 7:53 am

Post by sudharsanan »

Hi Arndw,

I tried the option that you gave .. i tried it in Datastage 6.0 Server Job...

Used a transform stage with variables:

KeyNumber FIELD(In.String,",",1)
RestOfString EREPLACE(In.String[3,9999],"&",CHAR(13):CHAR(15):KeyNumber:",")

But the result that i got was only 2 rows...

Field1 Field2
1 B
2 M

Not the excepted one.. can you tell me how to solve.. My job design is like this..

I/P(SequentialFile) ---> Xfm ---> O/P(Sequential File)

should i need to use a Pivot stage to convert the row to column...

Please advise me on how to approach..

Thanks
Sudharsanan
Sunshine2323
Charter Member
Charter Member
Posts: 130
Joined: Mon Sep 06, 2004 3:05 am
Location: Dubai,UAE

String Parsing

Post by Sunshine2323 »

Hi Sudharshan,

I have tried Arnd's method in a sample job and it has worked for me.

Have u created the stage variables?

You need to use the value of the KeyNumber stage variable in the RestOfString stage variable

Also after you run the job open the sequential file on the server to see the Output as in my case the the Sequential file stage was not able to read the complete output data as 1 column from the input was converted into 2 columns :)
Warm Regards,
Amruta Bandekar

<b>If A equals success, then the formula is: A = X + Y + Z, X is work. Y is play. Z is keep your mouth shut. </b>
--Albert Einstein
sudharsanan
Participant
Posts: 64
Joined: Fri Jul 16, 2004 7:53 am

Post by sudharsanan »

Hi,

I used two stage variables in my job as told by Arnd...

You are correct in the Sequential File Stage i could see only two values(Don't know why).. But when i saw the actual file the result was like this:

Field1,"Field2"

1,"B
1,C=D
1,E=F
1,G=H"

2,"M
2,G=H"

There spaces in between the records... the starting values for 1 and 2 are missing... That is reason for two records only in the Squential file stage.. should we have change the formatting of the file in the output...

What was the result you got?...was any mistake with my job design.. please advice..

Thanks
Sudharsanan
Sunshine2323
Charter Member
Charter Member
Posts: 130
Joined: Mon Sep 06, 2004 3:05 am
Location: Dubai,UAE

String Parsing

Post by Sunshine2323 »

Hi Sudharshan, :)

Remove only char(15) from the derivation for the Stage Variable RestofString also replace the Quote Character to 000 in the Output Format Tab and try and let us know.

You will not be able to see the Output File as the Output file is defined for 1 column and the derivation converts it 2 columns.

If you try reading the output data using a sequential file that has 2 output columns defined you will have no problems.

Do you have any other suggestions?
Warm Regards,
Amruta Bandekar

<b>If A equals success, then the formula is: A = X + Y + Z, X is work. Y is play. Z is keep your mouth shut. </b>
--Albert Einstein
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

sudharsanan wrote:Hi Arndw,

I tried the option that you gave .. i tried it in Datastage 6.0 Server Job...

Used a transform stage with variables:

KeyNumber FIELD(In.String,",",1)
RestOfString EREPLACE(In.String[3,9999],"&",CHAR(13):CHAR(15):KeyNumber:",")
...

Thanks
Sudharsanan
I just saw that I used CHAR(13):CHAR(15); that is not correct :roll: All you need is a <lf> so just use a CHAR(10). Mea culpa.
sudharsanan
Participant
Posts: 64
Joined: Fri Jul 16, 2004 7:53 am

Post by sudharsanan »

Hi,

Thanks Sunshine for your input ....



I tried the option you told... the result is
Result Got

Field1,"Field2"

Field1,Field2

1,B
1,C=D
1,E=F
1,G=H

2,M
2,G=H




Actual Result

Field1, Field2
---------------------
1,A=B
1,C=D
1,E=F
1,G=H
2,L=M
2,G=H

The First record and Fifth records the first value is missing, what i have to change to get these values.. sorry for disturbing... please advice..


Thanks
Sudharsanan
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Sudhar,

please look at the logic of the program to find the error. The transform does nothing but replace the ampersand in the string with the number and line terminators. If you were to replace the CHAR(nn) with a simple "x" your line structure would not be changed and you could see what is happening in your code.
It looks to me like you are having an issue with the quoting and column separators - please remove any quoting and separators from the output and also look at the file from outside DS as well as from the "view-data".
Sunshine2323
Charter Member
Charter Member
Posts: 130
Joined: Mon Sep 06, 2004 3:05 am
Location: Dubai,UAE

String Parsing

Post by Sunshine2323 »

Hi Sudharshan,
Can you paste a copy your derivation for the stage variables.
As I have achieved the desired result with the derivation specified.
Warm Regards,
Amruta Bandekar

<b>If A equals success, then the formula is: A = X + Y + Z, X is work. Y is play. Z is keep your mouth shut. </b>
--Albert Einstein
sudharsanan
Participant
Posts: 64
Joined: Fri Jul 16, 2004 7:53 am

Post by sudharsanan »

Hi All,

Thanks for the inputs...but the problem still resides in my code..
Let me put the code..

Stage Variables:

StageVar=FIELD(DSLink3.Field1,",",1)

StageVar1=EREPLACE(DSLink3.Field2[3,9999],"&",CHAR(10):StageVar:",")

In the output I am just using the StageVariables

Field1 = StageVar
Field2 = StageVar1

The Result i got is

Field1,"Field2"

1,"B
1,C=D
1,E=F
1,G=H"

2,"M
2,G=H"


I tried to look at the Data from the VIEW Data option where i am getting only
Field1 Field2
1 B
2 M

I couldn't find the Reason for that...Sorry for disturbing.. please help me..

Thanks
Sudharsanan
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Please replace the CHAR(10) with a "*", remove the delimiter and quotes from the output file and post your results.
Sunshine2323
Charter Member
Charter Member
Posts: 130
Joined: Mon Sep 06, 2004 3:05 am
Location: Dubai,UAE

String Parsing

Post by Sunshine2323 »

Sudharshan,
Stage Variables:

StageVar=FIELD(DSLink3.Field1,",",1)

StageVar1=EREPLACE(DSLink3.Field2[3,9999],"&",CHAR(10):StageVar:",")
You have 2 fields in the input

Try reading both the fields in a single column by specifying a delimiter like '|' that is not present in your input data then cahnge the derivation as follows

StageVar=FIELD(DSLink3.Field1,",",1)

StageVar1=EREPLACE(DSLink3.Field1,"&",CHAR(10):StageVar:",")

Then let us know your result
Last edited by Sunshine2323 on Wed May 04, 2005 1:27 am, edited 1 time in total.
Warm Regards,
Amruta Bandekar

<b>If A equals success, then the formula is: A = X + Y + Z, X is work. Y is play. Z is keep your mouth shut. </b>
--Albert Einstein
sudharsanan
Participant
Posts: 64
Joined: Fri Jul 16, 2004 7:53 am

Post by sudharsanan »

Hi Arndw,

Which value i should give for the delimiter and quotes ?..As i cannot leave that field blank...

Thanks
Sudharsanan
Post Reply