How to remove the duplicate words in the column

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

uppalapati2003
Participant
Posts: 70
Joined: Thu Nov 09, 2006 2:14 am

Post by uppalapati2003 »

I have a string like "ram sita ravan ram lakshman".
I need to get the output "ram sita ravan lakshman".
Please help me with this.
Srini
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

There is no single function in DataStage that will do this for you. It isn't a complex thing to do in either a Server BASIC program or a C++ routine, though. I would assume that the space character is the token delimiter and then write a loop that compares each token with all the following ones and removes any duplicates. By the time you reach the last token, you will have removed all the duplicates.
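
The loop described above could look roughly like the following C++ sketch. It is only an illustration under some assumptions: the function name removeDuplicateWords is made up for this example (it is not a DataStage function), tokens are split on any whitespace rather than strictly on a single space, runs of spaces collapse to one space in the output, and the comparison is exact and case-sensitive. Wiring it into a parallel job as a routine is not shown.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper name, chosen for this example only.
// Keeps the first occurrence of each whitespace-separated token
// and drops every later exact (case-sensitive) repeat.
std::string removeDuplicateWords(const std::string& input) {
    // Split the input into tokens on whitespace.
    std::istringstream in(input);
    std::vector<std::string> tokens;
    std::string word;
    while (in >> word) {
        tokens.push_back(word);
    }

    // Compare each token with all following ones and erase the matches.
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        for (std::size_t j = i + 1; j < tokens.size(); ) {
            if (tokens[j] == tokens[i]) {
                tokens.erase(tokens.begin() + j);  // j now points at the next token
            } else {
                ++j;
            }
        }
    }

    // Join the surviving tokens back together with single spaces.
    std::string out;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        if (i > 0) {
            out += ' ';
        }
        out += tokens[i];
    }
    return out;
}
```

Calling removeDuplicateWords("ram sita ravan ram lakshman") returns "ram sita ravan lakshman". The nested loop is quadratic in the number of tokens, which should not matter for strings of this size.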
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

What output is needed where there are case differences, such as "Ram sita ravan ram lakshman"? What output is needed where there are embedded substrings, such as "ram sita ravan ram lakshram"?

I agree that this falls into the do-it-yourself category but, before you do, tighten your specification as much as possible.

I suggest that this is easier done in DataStage BASIC than in C++, using dynamic arrays and the Locate statement. Of course, that suggestion is moderated by one's expertise with the two programming languages.
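
To make the two specification questions concrete, here is a rough C++ sketch of the "list of words already seen" idea. It is not DataStage BASIC and does not use the Locate statement; std::find over a vector merely plays a similar lookup role. The names dedupeWords, toLowerAscii and the ignoreCase flag are invented for this example, tokens are split on any whitespace, and lower-casing covers ASCII letters only. Whole tokens are compared, so "lakshram" is never treated as a duplicate of "ram".

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Lower-cases ASCII letters only; no locale or Unicode handling (assumption).
std::string toLowerAscii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: keeps the first occurrence of each word.
// When ignoreCase is true, "Ram" and "ram" count as the same word and
// the spelling of the first occurrence is the one that is kept.
std::string dedupeWords(const std::string& input, bool ignoreCase) {
    std::istringstream in(input);
    std::vector<std::string> seenKeys;  // comparison keys of the words kept so far
    std::string out;
    std::string word;
    while (in >> word) {
        const std::string key = ignoreCase ? toLowerAscii(word) : word;
        // Look the key up in the list of words already seen.
        if (std::find(seenKeys.begin(), seenKeys.end(), key) != seenKeys.end()) {
            continue;  // duplicate, skip it
        }
        seenKeys.push_back(key);
        if (!out.empty()) {
            out += ' ';
        }
        out += word;
    }
    return out;
}
```

With ignoreCase set to false, "Ram sita ravan ram lakshman" comes back unchanged because "Ram" and "ram" differ; with ignoreCase set to true, the result is "Ram sita ravan lakshman". Which of the two behaviours is wanted is exactly the kind of thing the specification needs to settle.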
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.