Performance tunning

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

The average number of inserts and updates is very small. If you're using DIRECT path loading, you may find that the loading takes a lot longer because this mechanism is geared towards high volumes of inserts. Just inserting 1 row with DIRECT loading can take longer because of the methods used to update indexes. There's a point where CONVENTIONAL loading is more efficient.

I suggest you experiment with DML based (UPSERTS) loading to see if this reduces your runtime. Indexes only reduce runtime on updates if the index supports the primary key based update. All other indexes on attribute columns degrade insert/update performance.

You're mixing two issues: space and performance. If you want to talk about space consumption, that's a different situation. If you want to save space, use VARCHAR instead of CHAR and only store columns you care about. Otherwise, more indexes means more space used to store those indexes, so only index what you need.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Post by vsi »

kcbland wrote:The average number of inserts and updates is very small. If you're using DIRECT path loading, you may find that the loading takes a lot longer because this mechanism is geared towards high volumes of inserts. Just inserting 1 row with DIRECT loading can take longer because of the methods used to update indexes. There's a point where CONVENTIONAL loading is more efficient.

I suggest you experiment with DML based (UPSERTS) loading to see if this reduces your runtime. Indexes only reduce runtime on updates if the index supports the primary key based update. All other indexes on attribute columns degrade insert/update performance.

You're mixing two issues: space and performance. If you want to talk about space consumption, that's a different situation. If you want to save space, use VARCHAR instead of CHAR and only store columns you care about. Otherwise, more indexes means more space used to store those indexes, so only index what you need.
Hi,

At the first i have to reduce the space.After that performance,so for reducing the space let me know what i have to do.

Thank you very much.
gnreddy
Participant
Posts: 8
Joined: Wed Jan 30, 2008 11:50 am
Location: Bangalore
Contact:

hai members

Post by gnreddy »

Hi, first of all use varchar instead of char as it reduces the memory space.
while loading bulkdata try to use orabulk stage for oracle table or BCP Loader for SQL Server or Sybase.Try to use indexes while loading the data
as performance increases.
Try to use SQL Statement while Extracting the data.
Hi,

At the first i have to reduce the space.After that performance,so for reducing the space let me know what i have to do.

Thank you very much.[/quote] :idea:
gangula.narayan reddy
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Re: hai members

Post by vsi »

Hi,
I am using varchar instead of char and using SQL statement while extracting the data
I am using indexes also still it's taking more disk space and running the jobs very slow.

So anything otherthan this let me know,Any how thanks.


gnreddy wrote:Hi, first of all use varchar instead of char as it reduces the memory space.
while loading bulkdata try to use orabulk stage for oracle table or BCP Loader for SQL Server or Sybase.Try to use indexes while loading the data
as performance increases.
Try to use SQL Statement while Extracting the data.
Hi,

At the first i have to reduce the space.After that performance,so for reducing the space let me know what i have to do.

Thank you very much.
:idea:[/quote]
gateleys
Premium Member
Premium Member
Posts: 992
Joined: Mon Aug 08, 2005 5:08 pm
Location: USA

Re: hai members

Post by gateleys »

What about the warnings limit? Is any of your job generating huge logs? Are you performing a lot of sorts? Have you tried partitioning your table?
gateleys
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Re: hai members

Post by vsi »

Hi,

I am not getting any warning limits,not performing any sorts ,i tried for partition also.

Moreover as per my assumption is index table is growing very fast,so if we can drop the indexes before doing the inserts and updates and after rebuild the indexes ,it will work out not taking the more space.

Is it correct or not let me know

Thanks.
gateleys wrote:What about the warnings limit? Is any of your job generating huge logs? Are you performing a lot of sorts? Have you tried partitioning your table?
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

I don't understand what space savings you think you'll achieve by dropping and rebuilding indexes. If you have indexes then they take up space. Different types of indexes take up different amounts of space.

If your most important issue is table space usage, then I suggest you look towards data modeling and seek to optimize the columns that you're storing. It's a waste of time to talk about saving space once you're using the optimal datatypes and have eliminated extraneous columns.

You've stated your number of updates is relatively small, less than 10,000 updates out of 70000000 rows, so less than .05%. Chained row updates could be wasting space, but there's no way that's your situation. I recommend you have an Oracle DBA look at extent sizes and such and configure your table properly for your growth patterns.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
gnreddy
Participant
Posts: 8
Joined: Wed Jan 30, 2008 11:50 am
Location: Bangalore
Contact:

Re: Performance tunning

Post by gnreddy »

Hi All,

Incase of loading data into target oracle table its better to use ORABULK
Stage as performance increases.Also to reduce space try to use VARCHAR
instead of CHAR.
Also remove indexes while loading the data into the table.After Loading
rebuilt all the necessary indexes.
Try to use SQL Statement at the time of loading and extracting the data.
And if u use ORDER BY Clause in Oracle don't use SORT Stage in DATASTAGE Jobs.
gangula.narayan reddy
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Like Ken, I really don't understand all this discussion on 'saving space'. Those kinds of concerns are the domain of the Modeller and the DBA. As the person loading the table, your job is to get the data shoved in and it falls to others to worry about the efficiency of the table / index design, tablespaces, extents, etc. Sure you can raise concerns if you have them, but others will need to solve them... unless you fill all of these roles in your organization? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

tunning: (pp)
filling a tun

tun: (n)
1. a large cask for holding liquids, esp. wine, ale or beer
2. a measure of liquid capacity, usually equivalent to 252 wine gallons

No wonder there's a need to want to save space! :lol:
Last edited by ray.wurlod on Thu Jan 31, 2008 8:39 am, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Re: Performance tunning

Post by vsi »

Hi Narayana

Thank you very much.


gnreddy wrote:Hi All,

Incase of loading data into target oracle table its better to use ORABULK
Stage as performance increases.Also to reduce space try to use VARCHAR
instead of CHAR.
Also remove indexes while loading the data into the table.After Loading
rebuilt all the necessary indexes.
Try to use SQL Statement at the time of loading and extracting the data.
And if u use ORDER BY Clause in Oracle don't use SORT Stage in DATASTAGE Jobs.
vsi
Premium Member
Premium Member
Posts: 507
Joined: Wed Mar 15, 2006 1:44 pm

Re: Performance tunning

Post by vsi »

Hi Narayana

Thank you very much.


gnreddy wrote:Hi All,

Incase of loading data into target oracle table its better to use ORABULK
Stage as performance increases.Also to reduce space try to use VARCHAR
instead of CHAR.
Also remove indexes while loading the data into the table.After Loading
rebuilt all the necessary indexes.
Try to use SQL Statement at the time of loading and extracting the data.
And if u use ORDER BY Clause in Oracle don't use SORT Stage in DATASTAGE Jobs.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Take all that advice with a grain or two of salt.

Bulk loading will be faster but can 'take more space' as the data is appended to any existing data, internal free space is not taken into account.

Who among us can reduce the space the target table consumes by 'using VARCHAR instead of CHAR'? You're stuck with whatever the current columns are defined as unless you can make a hell of a case to have them changed, as this will affect more than just you.

Removing indexes? What about PK and FK constraints? Sure, the load will go faster for pure inserts but you'll kill any updates. Then factor in how long it will take to rebuild them all and you may not find yourself in a 'win-win' situation. Never mind the impact of these dropped indexes would have on others, can you guarantee your process runs in isolation?

Etc.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

As for wasting space in a table because of bulk load versus DML load, DBA's and admins periodically compact and/or manage table growth. I mentioned chained row updates as one space waster that is a managed issue thru periodic table optimization. It is not an ETL issue.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Exactly, not an ETL issue. Was trying to clarify that all of that 'burden' falls squarely on the shoulders of your DBAs. Thanks Ken.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply