Extracting the substring from a Char Field

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Raamc
Premium Member
Premium Member
Posts: 87
Joined: Mon Aug 20, 2007 9:08 am

Extracting the substring from a Char Field

Post by Raamc »

Hi All,

I have a column called ADDR4 of type VARCHAR(40).It will hold the 4th Address line of an account. I am getting data as POSTCODE concatenated with STATE in this field.
I need to extract the State from ADDR4.

Here is the sample data
ADDR4 ABCD M22 1 TQ
ADDR4 AB CD BD 22 1 SW

In all the cases i need to extract ABCD.

Please suggest me how to do this in the transformer.

Thanks
Raam
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Do you want to extract the first 4 characters from the line, but ignoring spaces?
Raamc
Premium Member
Premium Member
Posts: 87
Joined: Mon Aug 20, 2007 9:08 am

Post by Raamc »

I want to extract the STATE name from this column.but i dont know the lenght of the state data. It may variey from record to record. I want to extract the data before POST CODE...(State will be followed by POST CODE)

We can identify POSTCODE by numbers in it.

So my logic is, i want to extract the data before numbers.
Sreedhar
Participant
Posts: 187
Joined: Mon Oct 30, 2006 12:16 am

Post by Sreedhar »

The INDEX function with the DataStage should help you in this case. Please go through the INDEX FUNCTION.


Theoretically it is some thing like this.

Use index function to get the occurrence of numeric data (it returns the location value where you have found the numeric data) having got that do a substring to get the STATE Name.

Hope this helps....


Regards,

SM<
Regards,
Shree
785-816-0728
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I'm not sure how much you are expecting to do here - if your data is coming in different order then you need to write (or buy) software to do this complex chore. QualityStage will do it for you.

The INDEX() function, as mentioned earlier, can let you parse elements from your string. For example, INDEX(In.DataColumn,' ',2) will give you "M22" and "CD" from your sample data.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

A fully general solution using this approach would require an unwieldy expression involving ten Index() functions, one for each numeric character, and a lot of Or operators.

You would probably be better off either using QualityStage (as mentioned) which has inbuilt name and address parsing capability, or creating your own parallel routine.

Another possibility, especially if your data volumes are not huge, is to make use of the pattern matching capabilities of the DataStage BASIC language (Matches operator and MatchField() function) using a BASIC Transformer stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Raamc
Premium Member
Premium Member
Posts: 87
Joined: Mon Aug 20, 2007 9:08 am

Post by Raamc »

I have been solved this problem by using the Excel logic and datastage job.

I used the datastage job to bring the data into Uni format and written it into XLS file. In excel i written a formula to compare the strings and sorted out the problem.


Thank you for all your valuable suggessions and support.

Cheers,
Raamc :D
Raamc
Premium Member
Premium Member
Posts: 87
Joined: Mon Aug 20, 2007 9:08 am

Post by Raamc »

I have been solved this problem by using the Excel logic and datastage job.

I used the datastage job to bring the data into Uni format and written it into XLS file. In excel i written a formula to compare the strings and sorted out the problem.


Thank you for all your valuable suggessions and support.

Cheers,
Raamc :D
Post Reply