filter with regular expression

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
PeterPol
Premium Member
Premium Member
Posts: 73
Joined: Wed Mar 08, 2006 8:59 am

filter with regular expression

Post by PeterPol »

Hello,

I tried to filter records from a table using the following regular expression:

idnummer like '[P-Q][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'

(to get the idnummers that begin with P or Q directly followed by 7 digits).

However, when I run the job I get the following error:
Filter_3: Literal regular expression formatted incorrectly. Got: '[P-Q][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'.

idnummer has type varchar(8)

The strange thing is that when I trim the last [0-9] the expression seems valid: the job runs without error (although this is not what I want...)

Any ideas what goed wrong?

Peter
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It looks fine - as a regular expression - to me. Perhaps the Filter stage has an undocumented limit on the length of the regular expression. You ought to ask your support provider. Meanwhile can someone else with access to parallel jobs try to reproduce Peter's issue?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
PeterPol
Premium Member
Premium Member
Posts: 73
Joined: Wed Mar 08, 2006 8:59 am

Post by PeterPol »

Hello,

Can it be that the filter stage does not support the use of brackets in expressions?

In the meantime I was experimenting with external filter stage (using awk) and I found it very useful for my purposes.
I want to build a job that checks if a field in a given dataset matches a certain (user given) pattern, like string starts with ZZ and is 10 long, or string contains only 'A' or 'B', or string is fully in lowercase, etc .

By using the external filter stage with Awk I was able to do this kind of filtering. However, what I actually want are the lines that do NOT match the given awk pattern.

suppose I have the following filter: "/^[Aa]b/"
How do I filter all fields that do not match this pattern?

Unfortunately, the external filter stage has no reject link...

Peter
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

So what you're looking for is an awk expert?

Can you filter it (in the External Filter stage) with grep, which may be more efficient, and in which it's easier to filter the logical inverse (that means using the -v option in the grep command).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply