How to specify lookup found condition in EE ?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
kaps
Participant
Posts: 452
Joined: Tue May 10, 2005 12:36 pm

How to specify lookup found condition in EE ?

Post by kaps »

Hi

We just migrated to EE from server. We would like to specify Lookup found condition in the lookup stage but there is no option for that. I can see lookup failure option but I don't see a lookup success option in there.

For restartability of the jobs we usually do lookup to the target table and if the record is there we will reject it. but I am not sure how to do that in EE.
In server we say Not(NotFound(link)).

Job design is:

(DB2API)
|
|
SeqFile Stage-->Transformer-->Lookup Stage--->Target(DB2API)
Please advise.

Thanks
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

By default, failed lookups do not return anything, unlike Server. You can, however, tell it to 'forward reject rows' or something similar via a property setting and it will include both on the output link. Then you check like you might have in Server, by seeing if the output is null or not null, with the former indicating failure and the latter success.

That's what I recall, anyway, got no way to check anymore. :(
-craig

"You can never have too many knives" -- Logan Nine Fingers
kaps
Participant
Posts: 452
Joined: Tue May 10, 2005 12:36 pm

Post by kaps »

I tried the method suggested by Craig. I had "Continue" option selected in "Lookup Failure" and in the next transformer I have checked for the column coming out of Lookup for NULL value(alslo checked for 0) which did not work. Both records does not pass through the transformer. One suppose to go as it's not in the table.

Basically I am trying to check for the existence of record in the table before inserting it. How do we do this ?

If I "reject" on the lookup failure then it works correctly. but not sure if that's the best practice.

Please advise.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If all you want are the failures, it's perfectly fine to capture the rejects like it sounds you did. For "Continue" however, did you check to see what the actual values were that "did not work" in your constraint? Push them out to a Peek stage so you know why it is not working... then make it work correctly.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kaps
Participant
Posts: 452
Joined: Tue May 10, 2005 12:36 pm

Dynamic lookup...

Post by kaps »

I mentioned 0 and it worked. Now I have my constraints as follows:

If Column_From_Lookup = 0 then Insert Else Update and then I am passing only Inserts to Target.

I am facing a different problem though...when we have both Insert and update coming in the same load even though I mentioned Array size as 1 in the target the update does not go to the update link and it goes to Insert link and fails because of the database constraints.

Does it mean when we have Transformer stage between Lookup stage and Target stage the dynamic lookup won't work ? How do we handle this in EE ?

Thanks for the help...
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That behaviour is also driven by whether you have a default value defined for the particular field.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mk_ds09
Participant
Posts: 72
Joined: Sun Jan 25, 2009 4:50 pm
Location: Pune

Post by mk_ds09 »

I mentioned 0 and it worked. Now I have my constraints as follows:

If Column_From_Lookup = 0 then Insert Else Update and then I am passing only Inserts to Target.

I am facing a different problem though...when we have both Insert and update coming in the same load even though I mentioned Array size as 1 in the target the update does not go to the update link and it goes to Insert link and fails because of the database constraints.

Does it mean when we have Transformer stage between Lookup stage and Target stage the dynamic lookup won't work ? How do we handle this in EE ?

Thanks for the help...



For Update and Insert :-

There is option as 'Upsert' in the EE , which can be used to update and insert depending on the values.

In between the target state and lookup,you can think of putting the Copy stage.

One more thing..What exactly you want to convey by 'dymanic lookup' ??
-----------------------------------
Regards
MK

What would you attempt to do if you knew you could not fail?

-----------------------------------
kaps
Participant
Posts: 452
Joined: Tue May 10, 2005 12:36 pm

Post by kaps »

We can't do upsert as we do capture intra day changes and moreover using upsert will slow thigns down.

What I meant by dynamic lookup is that when I insert a row in the table I shoule be able to see that in the lookup immediately as we capture intra day changes and we may get insert and update in the same load.
jcthornton
Premium Member
Premium Member
Posts: 79
Joined: Thu Mar 22, 2007 4:58 pm
Location: USA

Post by jcthornton »

Assuming that what you have described means you get both inserts and updates (to those same inserts) in the same batch, you will probably do better to change your design.

Upsert is the normal way to handle it, with the choice of Insert/Update first dependent upon which action will occur most often.

If you cannot do that, use a Sort and a Transformer to identify your in-batch Inserts and Updates (you then only need to do the DB lookup for the Inserts - which may improve performance). Your in batch Inserts will simply be the first record for that key in the batch - everything else is an Update.
Jack Thornton
----------------
Spectacular achievement is always preceded by spectacular preparation - Robert H. Schuller
kaps
Participant
Posts: 452
Joined: Tue May 10, 2005 12:36 pm

Post by kaps »

But what If I have multiple updates coming in the same batch ? Now, some of those updates I may not want to take it to Target as they may be dummy. For this I need to look for the last record updated in the Database which is possible only with dynamic lookup.

I think to get the functionality of dynamic lookup's(sparse) in EE, we need to use EE stage. Other stages it's not possible.

Please correct me If I am wrong.
jcthornton
Premium Member
Premium Member
Posts: 79
Joined: Thu Mar 22, 2007 4:58 pm
Location: USA

Post by jcthornton »

You can use the EE stage to do a dynamic lookup via the sparse option. However, your update cannot be guaranteed to complete before the next record calls the lookup. Actually, not only is it not guaranteed - it will probably not be updated on the database before the next lookup.

To detect your dummy updates, use a transformer with a sorted/partitioned input on your keys. Setup stage variables to hold the 'previous' values you want to compare against. Perform your update logic within the transformer.

Then, depending on the requirements for performing updates, you can use constraints or downstream filters to remove records that are 'dummy' and should not be put into the database.

The key point is that you cannot use the database in a transactional manner during the run of a single job.

Anytime that processing step depends on the result of a previous record in the same run, it screams 'Transformer' to me.
Jack Thornton
----------------
Spectacular achievement is always preceded by spectacular preparation - Robert H. Schuller
Post Reply