Page 1 of 1

IN , CASE Clause in FILTER stage

Posted: Wed Sep 12, 2012 4:07 am
by karthi_gana
All,

I have a requirement.

We will receive country code as input to the input parameter.

Based on that i have to amend some more country code.

say for example, If i receive country code 'EU', i have to check like below

EU = GB, DE, EU

I have a dataset with country code column.

If the input parameter value is 'EU' , I have to read all those three country (EU, DE, GB ) data from the file.

Inputs are welcome!

Posted: Wed Sep 12, 2012 7:29 am
by chulett
:?

What are your rules for this "amendment"? How do you know what to amend the input parameter with... is it always just something you do with "EU" or are there others that need the same treatment?

Posted: Wed Sep 12, 2012 9:07 am
by srinivas.g
Use Constriant in transformer stage.

(Coutry_code='GB' or Coutnry_code=DE or coutnry_code= EU)
and <<Job_parmeter>>='EU'

Posted: Wed Sep 12, 2012 10:22 am
by jwiles
I believe a transformer with a combination of stage variables and a constraint may work.

Two stage variables, one to contain the list of codes to check (use if-then-else logic in the Initial Value derivation to set something like "|EU|GB|DE|") and one to contain the result of the check (use the Index() function to compare the incoming record's code against the list--set to @TRUE if the code is found).

The constraint simply checks the value of the second stage variable for @TRUE

Alternatively, you could run the job using a job sequence which builds the list of codes and passes that as a parameter to the job.

Regards,

Posted: Wed Sep 12, 2012 9:15 pm
by karthi_gana
I used Transofrmer and Constrait to do this. It works fine. But i just wanted to check is it possible to do in FILTER.

Posted: Thu Sep 13, 2012 3:21 am
by jerome_rajan
But why would you want to use a filter when you can achieve it with a parallel transformer? The transformer is more efficient than the filter

Posted: Thu Sep 13, 2012 4:40 am
by ArndW
No, I do not think it is; the filter stage will be more efficient than putting a in transform stage with constraints.

Posted: Thu Sep 13, 2012 4:50 am
by jerome_rajan
I used to think the same till I came across the below in an IBM redbook (Data flow and Design)
The following guidelines should be followed when constructing parallel jobs in IBM InfoSphere DataStage Enterprise Edition:
Never use Server Edition components (BASIC Transformer, Server Shared Containers) within a parallel job. BASIC Routines are appropriate only for job control sequences.
Always use parallel Data Sets for intermediate storage between jobs unless that specific data also needs to be shared with other applications.
Use the Copy stage as a placeholder for iterative design, and to facilitate default type conversions.
Use the parallel Transformer stage (not the BASIC Transformer) instead of the Filter or Switch stages.
Use BuildOp stages only when logic cannot be implemented in the parallel Transformer.
No real justification found. I can however cite a reference to one of my own jobs. WHen i replaced the Filter with a Transformer, there was a notable improvement in performance. The job was processing close to 10 million records. It is possible that the Filter might perform better for lesser volumes but I cannot confirm that.

Posted: Thu Sep 13, 2012 5:25 am
by ArndW
Jerome,

it wouldn't make sense for the filter to be quicker with smaller volumes; I'll take the Redbook entry at face value and admit that the transform stage will be faster than the filter stage. Since transforms are compiled into c++ routines they will be relatively quick at runtime, but the datatypes are converted and I assumed a purpose-built stage such as the filter stage would be significantly faster. One never stops learning, thanks!

Posted: Thu Sep 13, 2012 7:21 am
by chulett
Keep in mind the fact that the Redbook mentioned was written in 2008 for the 8.1 release. Not that that invalidates the quotes, just... things can change. Actual hands-on experience helps when in doubt. :wink:

Posted: Thu Sep 13, 2012 7:46 am
by jwiles
The Filter stage doesn't have the IN or CASE logic available (see the Filter stage docs here. You could implement CCODE = 'GB' or CCODE = 'DE' or CCODE = 'EU' in the Where clause, and possibly pass this in as a job parameter, but I believe the transformer will still be more efficient.

Posted: Thu Sep 13, 2012 9:58 am
by rameshrr3
That comment in the redbook is perfectly valid . They suggest using a filter ( Or Switch) stages only if the condition expression to be evaluated is parametrized ( and evaluated at runtime ) .
Otherwise for all fixed conditions - go witha transformer. I doubt if they will mention it multiple times without having a justification.

As the OP asked for the CASE feature - the switch stage /operator implements the equivalent of CASE- but you can very well do something similar witha filter - just that your condition expression will have multiple OR clauses .

Interestingly Standard C/C++ style Switch Case construct is NOT supported in the PX Transformer Language (the subset of C/C++ to create PX routines) you can lookup to the docu for this . The PX transfomer routine will need a long list of If Else If's to do a switch case equivalent

Only Transformer compile times are longer compared to native osh stages. Transformer runtime performance is good. With earlier versions of transformer ( 8.0/8.01) only a certain number of derivations that you could put in a transformer stage before it would hang at compile time.