Multiple tables Lookup Issue

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
jsyed
Participant
Posts: 8
Joined: Thu May 29, 2008 9:12 am

Multiple tables Lookup Issue

Post by jsyed »

Hi there,

I've a job which is linking 1 source table with 30 reference tables with lookup stage. But the job is not reading from the source and 4 reference tables. I checked in the job monitor and it's showing source+4 unread reference tables in ready status and rest in running status. My reference tables are very small and have maximum of 5000 rows and the source table has 13 million records.

Any clue why this is behaving like this. I checked the memory with df cmd and it's showing plenty of free memory in the unix box. I don't know whether this is a design issue or memory allocation issue.

I don't find any other way of joining multiple tables in parallel job. I'm new to data stage will highly appreciate if somebody let me know if my design is wrong and the other ways of joining multiple tables.


Thanks in advance

-JS
bkumar103
Participant
Posts: 214
Joined: Wed Jul 25, 2007 2:29 am
Location: Chennai

Post by bkumar103 »

How are you doing the lookup. all the 20 ref table are used in one lookup stage, if yes then try to break the single lookup stage to multiple.
Birendra
jsyed
Participant
Posts: 8
Joined: Thu May 29, 2008 9:12 am

Post by jsyed »

Hi Birendra,

I tried splitting the tables with two lookups but still facing the same problem. I checked the job monitor after adding two lookups and found that the reference tables from both lookups are loading at the same time. I think this is the issue as if it's load the reference tables from first lookup and then goes to the second one then I believe it should work. But as it's loading reference tables of both lookups together so it's working exactly the same like having one lookup.

It's something related with memory allocation ? Generally how much memory is required to run job with multiple lookups.


Thanks
Javed


bkumar103 wrote:How are you doing the lookup. all the 20 ref table are used in one lookup stage, if yes then try to break the single lookup stage to multiple.
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

jsyed wrote:Hi Birendra,

I tried splitting the tables with two lookups but still facing the same problem. I checked the job monitor after adding two lookups and found that the reference tables from both lookups are loading at the same time. I think this is the issue as if it's load the reference tables from first lookup and then goes to the second one then I believe it should work. But as it's loading reference tables of both lookups together so it's working exactly the same like having one lookup.

It's something related with memory allocation ? Generally how much memory is required to run job with multiple lookups.


Thanks
Javed


bkumar103 wrote:How are you doing the lookup. all the 20 ref table are used in one lookup stage, if yes then try to break the single lookup stage to multiple.

yes lookup is dependent on physical memory but i think it should not take that much space
first of all try disabling the operator combination when using two lookup stages.

If server doesn't have much memory the job should abort.

If it doesn't abort then you may have other problems.

Are you getting the correct result?
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
jsyed
Participant
Posts: 8
Joined: Thu May 29, 2008 9:12 am

Post by jsyed »

Hi Kunal,

What do you mean by your comment disabling the operator combination when using two lookup stages" Could you please elaborate your comment.

I'm not getting any records as job is waiting for the 4 tables to be run. It's showing "delayed metadata" in the job log for the tables which are not running and when I'm checking job monitor it's showing the status "ready" for the same tables.

Is there any other way of joining multiple tables ?

Thanks
Javed

priyadareshikunal wrote:
jsyed wrote:Hi Birendra,

I tried splitting the tables with two lookups but still facing the same problem. I checked the job monitor after adding two lookups and found that the reference tables from both lookups are loading at the same time. I think this is the issue as if it's load the reference tables from first lookup and then goes to the second one then I believe it should work. But as it's loading reference tables of both lookups together so it's working exactly the same like having one lookup.

It's something related with memory allocation ? Generally how much memory is required to run job with multiple lookups.


Thanks
Javed


bkumar103 wrote:How are you doing the lookup. all the 20 ref table are used in one lookup stage, if yes then try to break the single lookup stage to multiple.

yes lookup is dependent on physical memory but i think it should not take that much space
first of all try disabling the operator combination when using two lookup stages.

If server doesn't have much memory the job should abort.

If it doesn't abort then you may have other problems.

Are you getting the correct result?
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Disabling Combination means set APT_DISABLE_COMBINATION to True
Else it combines the operator to optomize the performance.

What happens when you perform lookup for only those 4 tables?

is the result same?
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
jsyed
Participant
Posts: 8
Joined: Thu May 29, 2008 9:12 am

Multiple Lookup Issue

Post by jsyed »

Hi Kunal,

Thanks for replying. If I'm running the job with only 4 lookup so it's working fine and these reference tables are not same on every run. The issue here is on every run the job takes the reference tables randomly.

I strongly believe that this is the memory problem. Let me try some other alternative. I'll keep you posted.

Thanks
JAved



priyadarshikunal wrote:Disabling Combination means set APT_DISABLE_COMBINATION to True
Else it combines the operator to optomize the performance.

What happens when you perform lookup for only those 4 tables?

is the result same?
Brian Zhang
Participant
Posts: 39
Joined: Tue Jun 03, 2008 12:13 am
Location: Melbourne

Post by Brian Zhang »

What kind of DB stage you are using to perform lookup operation with Lookup stage?
If Oracle/DB2 Enterprise stage, try Sparse Lookup instead of Normal Lookup. Sparse lookup would reach reference table in DB directory other than operating in memory.

Hope that would be helpful.

Thanks
Brian
jsyed
Participant
Posts: 8
Joined: Thu May 29, 2008 9:12 am

Post by jsyed »

Hi Brian,

Thanks for posting. As I'm using Sybase IQ as the DB so I think sparse lookup is not the option :(

Thanks
Javed

Brian Zhang wrote:What kind of DB stage you are using to perform lookup operation with Lookup stage?
If Oracle/DB2 Enterprise stage, try Sparse Lookup instead of Normal Lookup. Sparse lookup would reach reference table in DB directory other than operating in memory.

Hope that would be helpful.

Thanks
Brian
Post Reply