Functions in datastage
How can I find four consecutive repetitions of the same letter in a string? We are not sure which letter it will be; it can be anything from A to Z.
Ranjini
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Welcome aboard.
Do you have knowledge of the "alphabet" at job run time? If so, you can make it a job parameter and use an Index() function to search for four contiguous occurrences of that value.
Code: Select all
Index(InLink.TheColumn, "#jpLetter##jpLetter##jpLetter##jpLetter#", 1)
The result will be zero if not found, or some non-zero value (location in the column) if found.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 187
- Joined: Thu Apr 14, 2011 5:10 pm
Code: Select all
Index(InLink.TheColumn,"AAAA",1)
This record can be captured as a reject record if we check whether the result is zero or not (all non-zero values are rejects).
Let me be clear with the requirement.
If the column from the file has the data NSAAAAI, then the record is rejected, since there are four consecutive A's in it.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Developer9 wrote: Above Expression gives the position as 2.
Depends what's in InLink.TheColumn - which you did not indicate before making this assertion.
Developer9 wrote: Let me be clear with the requirement. If the column from the file has data NSAAAAI then the record is rejected since there is 4 consecutive A's in it.
I did not read anything in the original requirement about rejecting records - only about determining whether there are four consecutive occurrences of the same "alphabet" (which I took to mean "alphabetic character").
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Index isn't going to be helpful here, unless perhaps you're willing to execute it 26 times. Seems to me you'll need something more like a Regular Expression to detect the presence of four contiguous occurrences of the same letter in your data.
-craig
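To make the comparison concrete, here is a rough sketch in Python (not DataStage code) of the two ideas: one search per letter, as 26 Index() calls would do, versus a single regular expression with a backreference. The function names are made up for illustration, and uppercase A-Z input is an assumption.

```python
import re
import string

def has_run_bruteforce(value, run_length=4):
    # Mimics calling Index() once per letter: look for "AAAA", "BBBB", ... "ZZZZ".
    return any(letter * run_length in value for letter in string.ascii_uppercase)

# ([A-Z]) captures one uppercase letter; \1{3} requires that same letter
# three more times, giving four identical letters in a row.
FOUR_IN_A_ROW = re.compile(r"([A-Z])\1{3}")

def has_run_regex(value):
    # True when the value contains four consecutive identical letters.
    return FOUR_IN_A_ROW.search(value) is not None
```

Both functions should agree on any input; the regular expression simply avoids spelling out one search per letter.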
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
chulett wrote: Seems to me you'll need something more like a Regular Expression to detect the presence of four contiguous occurrences of the same letter in your data.
This is do-able in DataStage if you have the Data Rules stage, which implies version 8.7FP1 or later and an Information Analyzer licence. One of the possible tests for this stage is whether or not the data matches a regular expression (matches_regex).
Otherwise you could create a BuildOp or leverage the Java capabilities of DataStage to test the regular expression.
Yet another possibility would be to use grep in an External Filter stage.
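Whichever of these routes is taken, the test itself would be much the same. As a minimal sketch only, written in Python rather than as a BuildOp, Java class or grep command, the reject logic could look like the following. The pattern, the function name split_records and the choice to accept both upper and lower case letters are assumptions for illustration, not anything DataStage provides.

```python
import re

# One letter (either case) followed by the same letter three more times.
# The match is case-sensitive, so "AAaa" does not count as a run of four.
REPEAT = re.compile(r"([A-Za-z])\1{3}")

def split_records(records):
    # Route each record to "rejected" if it contains a run of four
    # identical letters, otherwise to "kept".
    kept, rejected = [], []
    for record in records:
        if REPEAT.search(record):
            rejected.append(record)
        else:
            kept.append(record)
    return kept, rejected

kept, rejected = split_records(["NSAAAAI", "NSAAAI", "abcdddd", "AAaa"])
```

With the sample list above, NSAAAAI and abcdddd land in rejected, while NSAAAI and AAaa are kept.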
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.