Functions in DataStage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Ranjini
Participant
Posts: 1
Joined: Tue Oct 22, 2013 7:05 am

Post by Ranjini »

How can I find four consecutive repetitions of the same letter in a string? We do not know the letter in advance; it can be anything from A-Z.
Ranjini
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard.

Do you have knowledge of the "alphabet" at job run time? If so you can make it a job parameter and use an Index() function to search for four contiguous occurrences of that value.

Index(InLink.TheColumn, "#jpLetter##jpLetter##jpLetter##jpLetter#", 1)
will be zero if not found or some non-zero value (location in the column) if found.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Developer9
Premium Member
Posts: 187
Joined: Thu Apr 14, 2011 5:10 pm

Post by Developer9 »

Index(InLink.TheColumn,"AAAA",1)
The above expression gives the position as 2.

This record can be captured as a reject record if we check whether the result is zero or not (any non-zero value means the pattern was found).

Let me be clear about the requirement.
If the column from the file has the data NSAAAAI, then the record is rejected, since there are four consecutive A's in it.
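
For readers following along, here is a minimal Python sketch of how an Index()-style search behaves. It is an illustration only, not DataStage code: `index_first` is a hypothetical helper, and it assumes that Index() reports a 1-based position for the first occurrence and 0 when the substring is absent.

```python
def index_first(value: str, substring: str) -> int:
    """Hypothetical stand-in for Index(value, substring, 1).

    Assumption: positions are 1-based and 0 means "not found".
    str.find() returns a 0-based offset, or -1 when absent, so adding 1
    maps it onto that convention.
    """
    return value.find(substring) + 1


# The sample value from the post above: "AAAA" starts at the third character.
print(index_first("NSAAAAI", "AAAA"))
# A value with only three consecutive A's is reported as not found.
print(index_first("NSAAAI", "AAAA"))
```

Under these assumptions the reported position depends entirely on the input value, and the check only ever looks for the one letter that was hard-coded.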
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Developer9 wrote: Above Expression gives the position as 2.
That depends on what's in InLink.TheColumn, which you did not indicate before making this assertion.
Developer9 wrote: Let me be clear with the requirement.
If the column from the file has data NSAAAAI then the record is rejected since there is 4 consecutive A's in it.
I did not read anything in the original requirement about rejecting records, only about determining whether there are four consecutive occurrences of the same "alphabet" (which I took to mean "alphabetic character").
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Index isn't going to be helpful here, unless perhaps you're willing to execute it 26 times. Seems to me you'll need something more like a regular expression to detect the presence of four contiguous occurrences of the same letter in your data.
-craig

"You can never have too many knives" -- Logan Nine Fingers
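
To make the two options in the post above concrete, here is a small Python sketch (not DataStage code) that compares 26 separate substring checks with a single regular expression using a backreference. The helper names are hypothetical, and the sketch assumes that only uppercase A-Z counts as a letter, as in the original question.

```python
import re
import string

# ([A-Z]) captures one uppercase letter; \1{3} requires that same letter
# three more times, giving four in a row.
FOUR_IN_A_ROW = re.compile(r"([A-Z])\1{3}")


def has_run_regex(value: str) -> bool:
    """True if any uppercase letter appears four times consecutively."""
    return FOUR_IN_A_ROW.search(value) is not None


def has_run_brute_force(value: str) -> bool:
    """Same test done the '26 times' way: one substring check per letter."""
    return any(letter * 4 in value for letter in string.ascii_uppercase)


for sample in ["NSAAAAI", "NSAAAI", "XQZZZZ", "ABCD"]:
    # Both approaches should agree on every sample.
    print(sample, has_run_regex(sample), has_run_brute_force(sample))
```

Both functions give the same answer; the regular expression simply expresses "any letter, repeated" in one pattern instead of one search per letter.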
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

chulett wrote: Seems to me you'll need something more like a Regular Expression to detect the presence of four contiguous occurrences of the same letter in your data.
This is do-able in DataStage if you have the Data Rules stage, which implies version 8.7FP1 or later and an Information Analyzer licence. One of the possible tests for this stage is whether or not the data matches a regular expression (matches_regex).

Otherwise you could create a BuildOp or leverage the Java capabilities of DataStage to test the regular expression.

Yet another possibility would be to use grep in an External Filter stage.
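
Whichever of these mechanisms is used, the external logic amounts to testing each record against a pattern and routing it accordingly. The following Python sketch shows that idea under stated assumptions: `split_records` is a hypothetical helper, records arrive one per string, and only uppercase A-Z counts as a letter. It does not represent the interface of any DataStage stage.

```python
import re
from typing import Iterable, List, Tuple

# One captured uppercase letter followed by three more copies of itself.
RUN_OF_FOUR = re.compile(r"([A-Z])\1{3}")


def split_records(records: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (clean, flagged): flagged records contain a run of four."""
    clean: List[str] = []
    flagged: List[str] = []
    for record in records:
        if RUN_OF_FOUR.search(record):
            flagged.append(record)
        else:
            clean.append(record)
    return clean, flagged


clean, flagged = split_records(["NSAAAAI", "HELLO", "BZZZZ"])
print("clean:", clean)
print("flagged:", flagged)
```

Whether the flagged records are rejected or merely marked is a separate design decision, as the earlier replies point out.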
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

I recall that regular-expression functionality has been available in the Filter stage from some 8.x version onwards. Can't that be used?
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
Developer9
Premium Member
Posts: 187
Joined: Thu Apr 14, 2011 5:10 pm

Post by Developer9 »

I was able to check the position of the A's with my expression... maybe I need to do a little more research before responding :?
Thanks