Splitting a Record based on a Pattern

Kalaiselva
Participant
Posts: 13
Joined: Sun Jul 11, 2010 2:34 am

Post by Kalaiselva »

Hi,

We have a requirement to read stelink messages such as the one below.

:20:ABCD122
:21:BCD3455
:32B:USD10000,00
:52B:1323467
ABC BANK
ABC OPERATIONS
NO:78, NEW TOWN
ATLANTA, NEW MEXICO 30265
:53D:1323467
BCD BANKS
BCD OPERATIONS
NO:92 FIRST SQUARE
INIDA
:50A:9612485
KJDKNUIOPA


The input record has to be split on the :XXX: tags. We managed to split single-line tags using the Index and Field functions. We also split multi-line tags by treating the next colon (:) as the delimiter that ends the value.

But now we are receiving a colon (:) as part of the content itself, which breaks the code we wrote.

Can somebody suggest a way to split the above record into multiple columns based on the tag values?

Thanks,
Kalai Selvan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Correct the arguments in your Field() function. For example, everything after the second colon is given by:

Field(InLink.TheString, ":", 3, 999)
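
To illustrate what those arguments do, here is a rough Python emulation of Field(string, delimiter, occurrence, count). It is only a sketch of the semantics, not DataStage code: the helper name `field` and the sample strings are assumptions for illustration, and edge cases (for example an occurrence below 1) are not modelled.

```python
def field(text: str, delim: str, occurrence: int, count: int = 1) -> str:
    """Rough emulation of DataStage Field(): return `count` delimited
    substrings starting at the 1-based `occurrence`, rejoined with `delim`."""
    parts = text.split(delim)
    return delim.join(parts[occurrence - 1 : occurrence - 1 + count])


line = ":32B:USD10000,00"
tag = field(line, ":", 2)          # text between the first and second colon
value = field(line, ":", 3, 999)   # everything after the second colon

# A colon inside the content survives, because the trailing fields are rejoined.
address = field(":52B:NO:78, NEW TOWN", ":", 3, 999)
```

Because the line starts with a colon, the first field is empty, so the tag is field 2 and the content starts at field 3.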
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
prasannakumarkk
Participant
Posts: 117
Joined: Wed Feb 06, 2013 9:24 am
Location: Chennai,TN, India

Post by prasannakumarkk »

Should the colons that appear in the content act as delimiters or as data?
Thanks,
Prasanna
Kalaiselva
Participant
Posts: 13
Joined: Sun Jul 11, 2010 2:34 am

Post by Kalaiselva »

Hi Ray,

Thanks. The above code fetches the data after the second colon, but we need a way to stop before the next :XXX: tag starts. That's the problem for us. We cannot use the colon (:) as the delimiter because the data between the tags also contains colons.

Thanks,
Kalai Selvan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Is it the case that there is at most one tag per physical line? If so, test the line for beginning with a tag (or even with a colon); if not, simply accumulate the string. Form groups based on the tag and preserve only the last line in each group (the one holding the fully accumulated string), probably with a Remove Duplicates stage.
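
The logic can be sketched in Python as follows. This is an illustration only, not a DataStage job: in a parallel job the same test and accumulation would presumably live in Transformer stage variables. The tag pattern (two digits plus an optional uppercase letter between colons, at the start of a line) is an assumption drawn from the sample message, and joining continuation lines with a space is an arbitrary choice.

```python
import re

# Assumption: a tag is two digits plus an optional uppercase letter between
# colons, and it only ever appears at the start of a physical line.
TAG_RE = re.compile(r"^:(\d{2}[A-Z]?):(.*)$")


def split_by_tag(lines):
    """Return a list of (tag, value) pairs, appending untagged lines
    to the value of the most recent tag."""
    records = []
    for line in lines:
        match = TAG_RE.match(line)
        if match:
            # Line begins with a tag: start a new group.
            records.append([match.group(1), match.group(2)])
        elif records:
            # No leading tag: accumulate onto the current group.
            records[-1][1] += " " + line
    return [tuple(r) for r in records]


message = """:20:ABCD122
:52B:1323467
ABC BANK
NO:78, NEW TOWN
:50A:9612485
KJDKNUIOPA"""

fields = split_by_tag(message.splitlines())
```

Because the test is anchored to the start of the line, a colon inside the content (as in "NO:78, NEW TOWN") is treated as data rather than as the start of a new tag.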