I am trying to clean up a file. The file is in the following format:
40212 VA, TN
40419 CA,KY - IN
40520 RI,P.A.
The pattern report comes out like this:
00082394 63.618% S [X] | NH
00024016 18.543% SS [X] | OH,MS
00009435 7.285% SSS [X] | CA,FL,OH
It also has other entries, but I would be happy with getting these
patterns cleaned up. I need the resulting file to look like:
40312 VA TN
40319 CA KY IN
40320 RI PA
Can you tell me the best way to accomplish this?
Split multiple State tokens from single field
Hi, so basically you want to eliminate the special characters from your data file. I am sure this can be achieved by using a Parse stage in QualityStage, where you can specify which character should be replaced by what. But this is slightly cumbersome, because you need to specify every special character that might occur and that you don't want. So the bottom line is that you can do this much more efficiently in DataStage.
Cheers
DSkkk
g.kiran
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Can't you just make sure that "-", "," and "." are in your STRIPLIST ?
Not sure why you'd want to force the third character of the first column to be "3", so I'm assuming that the problem there was with the keyboard (or, at least, between the keyboard and the chair).
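To make the intended clean-up concrete outside of QualityStage, here is a minimal Python sketch of the same idea. It is not QualityStage syntax: the function name `clean_states` is hypothetical, and it assumes that "," and "-" (and whitespace) separate state tokens while "." is simply dropped, so that "P.A." becomes "PA".

```python
import re

def clean_states(line):
    """Return (key, [state, ...]) for a line such as '40419 CA,KY - IN'."""
    # The first blank-delimited field is the key; the rest holds the states.
    key, _, rest = line.strip().partition(" ")
    # Assumption: "." is noise inside a token, so "P.A." becomes "PA".
    rest = rest.replace(".", "")
    # Assumption: commas, hyphens and whitespace all separate tokens.
    states = [tok for tok in re.split(r"[,\-\s]+", rest) if tok]
    return key, states

raw = ["40212 VA, TN", "40419 CA,KY - IN", "40520 RI,P.A."]
for line in raw:
    key, states = clean_states(line)
    print(key, " ".join(states))
```

Run against the three sample lines from the first post, this prints the key followed by the blank-separated state codes.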
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
follow up
ray.wurlod wrote:
Can't you just make sure that "-", "," and "." are in your STRIPLIST ?
Not sure why you'd want to force the third character of the first column to be "3", so I'm assuming that the problem there was with the keyboard (or, at least, between the keyboard and the chair).
It was a problem between the keyboard and the chair :D
Running a word investigate stage determines the states without a problem using the STRIPLIST, but I am not sure how to actually move the S (state) tokens it identifies into separate fields so I can use the pivot stage in DataStage to make the file how I would like. The final product would be as follows:
Out of QualityStage somehow:
40312 VA TN
40319 CA KY IN
40320 RI PA
Out of DataStage using a pivot stage (I think):
40312 VA
40312 TN
40319 CA
40319 KY
40319 IN
40320 RI
40320 PA
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Pivot stage is a good call. Just make sure that its input link has enough columns to cope with the largest possible list of states. (50 is probably over the top, but is fail safe!)
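As an illustration of what the pivot does to the data, here is a small Python sketch. It is not how the DataStage Pivot stage is implemented; `pivot_rows` and `max_states` are hypothetical names, with `max_states` standing in for the fixed number of state columns on the input link.

```python
def pivot_rows(rows, max_states=50):
    """Turn [key, state1, state2, ...] rows into one (key, state) pair per state."""
    for key, *states in rows:
        # Model a fixed-width input link: pad short rows with empty columns
        # and ignore anything beyond max_states.
        padded = (states + [""] * max_states)[:max_states]
        for state in padded:
            if state:  # skip the empty padding columns
                yield key, state

wide = [
    ["40312", "VA", "TN"],
    ["40319", "CA", "KY", "IN"],
    ["40320", "RI", "PA"],
]
for key, state in pivot_rows(wide):
    print(key, state)
```

Rows with fewer states than the link width produce fewer output rows, which is why over-sizing the number of input columns is harmless in this sketch.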
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.