Splitting String into different substrings of variablelength

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
pmadhavi
Charter Member
Charter Member
Posts: 92
Joined: Fri Jan 27, 2006 2:54 pm

Splitting String into different substrings of variablelength

Post by pmadhavi »

Hi
I have an Address column in the table which is getting split into 3 diffrent fields city, state, zip.

Now we need to split the value from Address column and load it to the 3 columns.
The value of Address looks like
21785 Filigree Court, Suite 103 Ashburn, Virginia 20147-6214
4431-A Brookfield Corp Drive Chantilly, Virginia 20151

it means the zip code is not always 5digit

Please suggest how to split them.
Thanks,
Madhavi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use the Field() function. To get the rightmost field, count the delimiters (space characters).

Code: Select all

Field(InLink.Address, " ", Count(InLink.Address, " ") + 1, 1)
An even better approach would be a QualityStage STAN (standardization) job, which can accurately parse free-form address data. Do you have QualityStage installed? It's the perfect tool for this job.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pmadhavi
Charter Member
Charter Member
Posts: 92
Joined: Fri Jan 27, 2006 2:54 pm

Post by pmadhavi »

Hi Ray,
Thanks for the help. I am able to fetch the zipcode from the address as it is the last substring in the address column.

But when I am mapping the complete address column to a flat file, it is not fetching the full string rather its throwing an error like

read_delimited()-invalid quotes, row 61 column address='Sumner county courthouse'

The actual value of the column is
"Sumner County Courthouse Gallatin, TN 37066". But only 'Sumner County Courthouse" is getting loaded to the flat file.
I tried with hash file too. No change.

pls suggest me why am i getting this error.
Is it because the no. of spaces between the substrings?
Thanks,
Madhavi
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

That is because Ray's solution for "Sumner County Courthouse Gallatin, TN 37066" is going to give you 37066". The sequential file will shout because only one " is found. It always looks for pairs. In your sequential file output, provide the quote character as 000. That will calm down the sequential file stage but you still need to handle that situation. Maybe trim the quotes.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

... and you will need to Trim() the trailing quote character from the result generated by the Field() function in this case.
:!:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pmadhavi
Charter Member
Charter Member
Posts: 92
Joined: Fri Jan 27, 2006 2:54 pm

Post by pmadhavi »

ray.wurlod wrote:... and you will need to Trim() the trailing quote character from the result generated by the Field() function in this case.
:!: ...
Hi
The problem is Address field contains many spaces in between.
I have a function called CHANGE() which I am using in a transformer to remove the spaces. But before getting the address into the transformer itself it is throwing the error as given below.

DataStage Job 1258 Phantom 3572
Program "JOB.1124724363.DT.1403136011.TRANS1": Line 49, Improper data type.
Attempting to Cleanup after ABORT raised in stage Test2..Transformer_3
DataStage Phantom Aborting with @ABORT.CODE = 3

Its behaving in a weird manner.
The steps I followed:
1. tried loading data from ODBC to flat file(successful)
2. tried loading data from ODBC to flat file via transformer(successful)
3. if I use a second transformer for lookup and then load data to flat file, it is throwing up the above error.

then I keep on removing one after the other and go to the step 1, still it throws the same error.

I dont understand the reason.
Pls help me
Thanks,
Madhavi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The number of spaces is irrelevant. The Field() function I provided counts them in order to yield the final delimited field. If there are trailing spaces, you might like to apply a TrimB() function before applying the Field() function.

Code: Select all

Field(TrimB(InLink.Address), " ", Count(TrimB(InLink.Address), " ") + 1, 1)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply