Page 1 of 1

How to specify lookup found condition in EE ?

Posted: Fri Jun 04, 2010 11:10 am
by kaps
Hi

We just migrated to EE from server. We would like to specify Lookup found condition in the lookup stage but there is no option for that. I can see lookup failure option but I don't see a lookup success option in there.

For restartability of the jobs we usually do lookup to the target table and if the record is there we will reject it. but I am not sure how to do that in EE.
In server we say Not(NotFound(link)).

Job design is:

(DB2API)
|
|
SeqFile Stage-->Transformer-->Lookup Stage--->Target(DB2API)
Please advise.

Thanks

Posted: Fri Jun 04, 2010 11:15 am
by chulett
By default, failed lookups do not return anything, unlike Server. You can, however, tell it to 'forward reject rows' or something similar via a property setting and it will include both on the output link. Then you check like you might have in Server, by seeing if the output is null or not null, with the former indicating failure and the latter success.

That's what I recall, anyway, got no way to check anymore. :(

Posted: Fri Jun 04, 2010 2:33 pm
by kaps
I tried the method suggested by Craig. I had "Continue" option selected in "Lookup Failure" and in the next transformer I have checked for the column coming out of Lookup for NULL value(alslo checked for 0) which did not work. Both records does not pass through the transformer. One suppose to go as it's not in the table.

Basically I am trying to check for the existence of record in the table before inserting it. How do we do this ?

If I "reject" on the lookup failure then it works correctly. but not sure if that's the best practice.

Please advise.

Posted: Fri Jun 04, 2010 3:58 pm
by chulett
If all you want are the failures, it's perfectly fine to capture the rejects like it sounds you did. For "Continue" however, did you check to see what the actual values were that "did not work" in your constraint? Push them out to a Peek stage so you know why it is not working... then make it work correctly.

Dynamic lookup...

Posted: Mon Jun 07, 2010 5:01 pm
by kaps
I mentioned 0 and it worked. Now I have my constraints as follows:

If Column_From_Lookup = 0 then Insert Else Update and then I am passing only Inserts to Target.

I am facing a different problem though...when we have both Insert and update coming in the same load even though I mentioned Array size as 1 in the target the update does not go to the update link and it goes to Insert link and fails because of the database constraints.

Does it mean when we have Transformer stage between Lookup stage and Target stage the dynamic lookup won't work ? How do we handle this in EE ?

Thanks for the help...

Posted: Mon Jun 07, 2010 6:10 pm
by ray.wurlod
That behaviour is also driven by whether you have a default value defined for the particular field.

Posted: Tue Jun 08, 2010 3:58 am
by mk_ds09
I mentioned 0 and it worked. Now I have my constraints as follows:

If Column_From_Lookup = 0 then Insert Else Update and then I am passing only Inserts to Target.

I am facing a different problem though...when we have both Insert and update coming in the same load even though I mentioned Array size as 1 in the target the update does not go to the update link and it goes to Insert link and fails because of the database constraints.

Does it mean when we have Transformer stage between Lookup stage and Target stage the dynamic lookup won't work ? How do we handle this in EE ?

Thanks for the help...



For Update and Insert :-

There is option as 'Upsert' in the EE , which can be used to update and insert depending on the values.

In between the target state and lookup,you can think of putting the Copy stage.

One more thing..What exactly you want to convey by 'dymanic lookup' ??

Posted: Tue Jun 08, 2010 8:56 am
by kaps
We can't do upsert as we do capture intra day changes and moreover using upsert will slow thigns down.

What I meant by dynamic lookup is that when I insert a row in the table I shoule be able to see that in the lookup immediately as we capture intra day changes and we may get insert and update in the same load.

Posted: Tue Jun 08, 2010 10:41 am
by jcthornton
Assuming that what you have described means you get both inserts and updates (to those same inserts) in the same batch, you will probably do better to change your design.

Upsert is the normal way to handle it, with the choice of Insert/Update first dependent upon which action will occur most often.

If you cannot do that, use a Sort and a Transformer to identify your in-batch Inserts and Updates (you then only need to do the DB lookup for the Inserts - which may improve performance). Your in batch Inserts will simply be the first record for that key in the batch - everything else is an Update.

Posted: Wed Jun 09, 2010 9:32 am
by kaps
But what If I have multiple updates coming in the same batch ? Now, some of those updates I may not want to take it to Target as they may be dummy. For this I need to look for the last record updated in the Database which is possible only with dynamic lookup.

I think to get the functionality of dynamic lookup's(sparse) in EE, we need to use EE stage. Other stages it's not possible.

Please correct me If I am wrong.

Posted: Wed Jun 09, 2010 3:10 pm
by jcthornton
You can use the EE stage to do a dynamic lookup via the sparse option. However, your update cannot be guaranteed to complete before the next record calls the lookup. Actually, not only is it not guaranteed - it will probably not be updated on the database before the next lookup.

To detect your dummy updates, use a transformer with a sorted/partitioned input on your keys. Setup stage variables to hold the 'previous' values you want to compare against. Perform your update logic within the transformer.

Then, depending on the requirements for performing updates, you can use constraints or downstream filters to remove records that are 'dummy' and should not be put into the database.

The key point is that you cannot use the database in a transactional manner during the run of a single job.

Anytime that processing step depends on the result of a previous record in the same run, it screams 'Transformer' to me.