view read

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
bryan
Participant
Posts: 91
Joined: Sat Feb 21, 2004 1:17 pm

view read

Post by bryan »

We are sourcing from views which has like 15 million rows and we do 5 validations against hash lookup. Writing into a sequential file.
Read throughput:300rows/sec


On an average, each million takes one hour.

How can I do to leverage Datastae/AIX power?

5CPU
2GB memory
AIX 5.0
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Convert your job to write to a parameterized output file and run 5 instances of it. Have the select statement portion the data out into 1/5th chunks, perhaps using a WHERE MOD(somekeycolumn,5) = 0,1,2,3,4 where 0 to 4 is fed as a value to each instance.

Hopefully, your source database can handle you hitting that view 5 times simultaneously. If it can, you should finish 5X faster.

This technique is the only technique that can scale in multiples of performance, which is what you want. I'm fairly confident that no amount of hash file tweaking and tuning will give you a five-fold increase. But, tune them anyway so that each instance is optimally doing reference lookups.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Post by T42 »

Or buy DataStage EE and let the code do the work for you most of the time. :)
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Smart aleck young whippersnapper.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Set inter-process buffer.

Use IPC-Stage.

Partition your hash files.

Use multi-instance jobs.

You can do lots in Server component itself.
Post Reply