IN , CASE Clause in FILTER stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

IN , CASE Clause in FILTER stage

Post by karthi_gana »

All,

I have a requirement.

We will receive country code as input to the input parameter.

Based on that i have to amend some more country code.

say for example, If i receive country code 'EU', i have to check like below

EU = GB, DE, EU

I have a dataset with country code column.

If the input parameter value is 'EU' , I have to read all those three country (EU, DE, GB ) data from the file.

Inputs are welcome!
Karthik
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:?

What are your rules for this "amendment"? How do you know what to amend the input parameter with... is it always just something you do with "EU" or are there others that need the same treatment?
-craig

"You can never have too many knives" -- Logan Nine Fingers
srinivas.g
Participant
Posts: 251
Joined: Mon Jun 09, 2008 5:52 am

Post by srinivas.g »

Use Constriant in transformer stage.

(Coutry_code='GB' or Coutnry_code=DE or coutnry_code= EU)
and <<Job_parmeter>>='EU'
Srinu Gadipudi
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

I believe a transformer with a combination of stage variables and a constraint may work.

Two stage variables, one to contain the list of codes to check (use if-then-else logic in the Initial Value derivation to set something like "|EU|GB|DE|") and one to contain the result of the check (use the Index() function to compare the incoming record's code against the list--set to @TRUE if the code is found).

The constraint simply checks the value of the second stage variable for @TRUE

Alternatively, you could run the job using a job sequence which builds the list of codes and passes that as a parameter to the job.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

I used Transofrmer and Constrait to do this. It works fine. But i just wanted to check is it possible to do in FILTER.
Karthik
jerome_rajan
Premium Member
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

Post by jerome_rajan »

But why would you want to use a filter when you can achieve it with a parallel transformer? The transformer is more efficient than the filter
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn

Life is really simple, but we insist on making it complicated.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

No, I do not think it is; the filter stage will be more efficient than putting a in transform stage with constraints.
jerome_rajan
Premium Member
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

Post by jerome_rajan »

I used to think the same till I came across the below in an IBM redbook (Data flow and Design)
The following guidelines should be followed when constructing parallel jobs in IBM InfoSphere DataStage Enterprise Edition:
Never use Server Edition components (BASIC Transformer, Server Shared Containers) within a parallel job. BASIC Routines are appropriate only for job control sequences.
Always use parallel Data Sets for intermediate storage between jobs unless that specific data also needs to be shared with other applications.
Use the Copy stage as a placeholder for iterative design, and to facilitate default type conversions.
Use the parallel Transformer stage (not the BASIC Transformer) instead of the Filter or Switch stages.
Use BuildOp stages only when logic cannot be implemented in the parallel Transformer.
No real justification found. I can however cite a reference to one of my own jobs. WHen i replaced the Filter with a Transformer, there was a notable improvement in performance. The job was processing close to 10 million records. It is possible that the Filter might perform better for lesser volumes but I cannot confirm that.
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn

Life is really simple, but we insist on making it complicated.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Jerome,

it wouldn't make sense for the filter to be quicker with smaller volumes; I'll take the Redbook entry at face value and admit that the transform stage will be faster than the filter stage. Since transforms are compiled into c++ routines they will be relatively quick at runtime, but the datatypes are converted and I assumed a purpose-built stage such as the filter stage would be significantly faster. One never stops learning, thanks!
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Keep in mind the fact that the Redbook mentioned was written in 2008 for the 8.1 release. Not that that invalidates the quotes, just... things can change. Actual hands-on experience helps when in doubt. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

The Filter stage doesn't have the IN or CASE logic available (see the Filter stage docs here. You could implement CCODE = 'GB' or CCODE = 'DE' or CCODE = 'EU' in the Where clause, and possibly pass this in as a job parameter, but I believe the transformer will still be more efficient.
- james wiles


All generalizations are false, including this one - Mark Twain.
rameshrr3
Premium Member
Premium Member
Posts: 609
Joined: Mon May 10, 2004 3:32 am
Location: BRENTWOOD, TN

Post by rameshrr3 »

That comment in the redbook is perfectly valid . They suggest using a filter ( Or Switch) stages only if the condition expression to be evaluated is parametrized ( and evaluated at runtime ) .
Otherwise for all fixed conditions - go witha transformer. I doubt if they will mention it multiple times without having a justification.

As the OP asked for the CASE feature - the switch stage /operator implements the equivalent of CASE- but you can very well do something similar witha filter - just that your condition expression will have multiple OR clauses .

Interestingly Standard C/C++ style Switch Case construct is NOT supported in the PX Transformer Language (the subset of C/C++ to create PX routines) you can lookup to the docu for this . The PX transfomer routine will need a long list of If Else If's to do a switch case equivalent

Only Transformer compile times are longer compared to native osh stages. Transformer runtime performance is good. With earlier versions of transformer ( 8.0/8.01) only a certain number of derivations that you could put in a transformer stage before it would hang at compile time.
Post Reply