Datastage performance

Archive of postings to DataStageUsers@Oliver.com. This forum intended only as a reference and cannot be posted to.

Moderators: chulett, rschirm

Locked
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Datastage performance

Post by admin »

Hi all,

I have a datawarehouse on Oracle 8.1.6 (on Digital Unix) which im loading using Ardent Datastage 4.1 on NT Server 4.0 I do this using the ORAOCI8 Plugin of DS. So my target is always the ORAOCI8 stage. Now heres the problem. If my source too is Oracle ( the same db or any other db, again using OCI) i get a throughput of abt 650 records/sec. This includes some amount of transformation logic in between. If however, if my source happens to be a sequential file, an ODBC source, an FTP Stage or even a hash file, my throughput falls to abt 4 records/sec, the transformation logic remaining same.

Any tips as on how to speed this up? Some server side settings or a different logic for loading?

Thanks in advance,
Regds,
Vivek Pandey
Consultant - Business Intelligence
Sonata Software Ltd.
Bangalore - India
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

Dont blindly believe the claimed rows per second in monitor.

They dont take into account the startup time for the stage. If for example, if your Oracle input involves a query (for example a sort or group
by) that must do significant processing before returning the first row, DataStage does not consider the stage to have started until the first row arrives.

If you have set the array size very high on the Oracle input side, this could also affect the calculated rows per second.

How does the total job run time compare with the different sources? You would probably be better to compare overall job run time rather than the stage run time displayed in the monitor dialogue.

Is it a very large data set or a small data set? If the data set is not very large, then these factors might come into play. If it is very large, then I find your observations rather strange.

As for speeding it up, you will need to provide more information about the logic.

David Barham
Information Technology Consultant
InformAtect Pty Ltd
Brisbane, Australia

e-mail: david@barham.hm

-----Original Message-----
From: Vivek Pandey. [mailto:vive@sonata-software.com]
Sent: Friday, 6 July 2001 12:53 PM
To: datastage-users@oliver.com
Subject: Datastage performance

Hi all,

I have a datawarehouse on Oracle 8.1.6 (on Digital Unix) which im loading using Ardent Datastage 4.1 on NT Server 4.0 I do this using the ORAOCI8 Plugin of DS. So my target is always the ORAOCI8 stage. Now heres the problem. If my source too is Oracle ( the same db or any other db, again using OCI) i get a throughput of abt 650 records/sec. This includes some amount of transformation logic in between. If however, if my source happens to be a sequential file, an ODBC source, an FTP Stage or even a hash file, my throughput falls to abt 4 records/sec, the transformation logic remaining same.

Any tips as on how to speed this up? Some server side settings or a different logic for loading?

Thanks in advance,
Regds,
Vivek Pandey
Consultant - Business Intelligence
Sonata Software Ltd.
Bangalore - India
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

Thanks David,

It was quite of trivial issue of our starting testing jobs at the same time as the db schema was being laid out. Once the tables were partitioned and indexed, load is beautifully fast.

Regds,
Vivek Pandey
Consultant - Business Intelligence
Sonata Software Ltd.
Bangalore - India


> -----Original Message-----
> From: David Barham [SMTP:david@barham.hm]
> Sent: Friday, July 06, 2001 9:20 AM
> To: datastage-users@oliver.com
> Subject: RE: Datastage performance
>
> Dont blindly believe the claimed rows per second in monitor.
>
> They dont take into account the startup time for the stage. If for
> example, if your Oracle input involves a query (for example a sort or
> group
> by) that must do significant processing before returning the first
> row, DataStage does not consider the stage to have started until the
> first row arrives.
>
> If you have set the array size very high on the Oracle input side,
> this could also affect the calculated rows per second.
>
> How does the total job run time compare with the different sources?
> You would probably be better to compare overall job run time rather
> than the stage run time displayed in the monitor dialogue.
>
> Is it a very large data set or a small data set? If the data set is
> not very large, then these factors might come into play. If it is
> very large, then I find your observations rather strange.
>
> As for speeding it up, you will need to provide more information about
> the logic.
>
> David Barham
> Information Technology Consultant
> InformAtect Pty Ltd
> Brisbane, Australia
>
> e-mail: david@barham.hm
>
> -----Original Message-----
> From: Vivek Pandey. [mailto:vive@sonata-software.com]
> Sent: Friday, 6 July 2001 12:53 PM
> To: datastage-users@oliver.com
> Subject: Datastage performance
>
> Hi all,
>
> I have a datawarehouse on Oracle 8.1.6 (on Digital Unix) which im
> loading using Ardent Datastage 4.1 on NT Server 4.0 I do this using
> the ORAOCI8 Plugin of DS. So my target is always the ORAOCI8
> stage. Now heres the problem. If my source too is Oracle ( the same db or
> any other db, again using OCI) i get a throughput of abt 650 records/sec.
> This includes some amount of transformation logic in between. If however,
> if
> my source happens to be a sequential file, an ODBC source, an FTP Stage or
> even a hash file, my throughput falls to abt 4 records/sec, the
> transformation logic remaining same.
>
> Any tips as on how to speed this up? Some server side settings or a
> different logic for loading?
>
> Thanks in advance,
> Regds,
> Vivek Pandey
> Consultant - Business Intelligence
> Sonata Software Ltd.
> Bangalore - India
>
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

There is a whole section called "Designing and Performance" in the "Advanced DataStage" class offered in the Asia-Pacific region. Note that this is NOT the same class as DS305 offered in the USA.

-----Original Message-----
From: Bibhu C [mailto:bibhuds@yahoo.com]
Sent: Saturday, 11 August 2001 02:55
To: datastage-users@oliver.com
Subject: datastage performance


Hi All,

Even when I am in the design phase of my ETL jobs, I
was curious to know about the ways to improve
performance of DataStage jobs. I understand that use
of hash files is one of ways, but the manuals are
really tightlipped about them and any other paths that
one would want to tread to improve load performance.

I am looking to the knowledgeable group to tell me
about the various performance enhancing strategies in DataStage, also tell me if I have missed something in the manuals.

Thanks
Bibhu


__________________________________________________
Do You Yahoo!?
Make international calls for as low as $.04/minute with Yahoo! Messenger http://phonecard.yahoo.com/
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

Hi Mark,

Thanks for your inputs. They have really been very
helpful !

Regards,
Bibhu

--- mark.huffman@baesystems.com wrote:
> Bibhu,
> I asked the same question of DS tech support a few
> months ago, and
> they sent me these two articles in response. I have
> attached them as zipped
> Word 2000 documents.
> HTH,
> Mark
>
>
> -----Original Message-----
> From: Bibhu C [mailto:bibhuds@yahoo.com]
> Sent: Friday, August 10, 2001 11:55 AM
> To: datastage-users@oliver.com
> Subject: datastage performance
>
>
> Hi All,
>
> Even when I am in the design phase of my ETL jobs, I
> was curious to know about the ways to improve
> performance of DataStage jobs. I understand that use
> of hash files is one of ways, but the manuals are
> really tightlipped about them and any other paths
> that
> one would want to tread to improve load performance.
>
> I am looking to the knowledgeable group to tell me
> about the various performance enhancing strategies
> in
> DataStage, also tell me if I have missed something
> in
> the manuals.
>
> Thanks
> Bibhu
>
>
> __________________________________________________
> Do You Yahoo!?
> Make international calls for as low as $.04/minute
> with Yahoo! Messenger
> http://phonecard.yahoo.com/
>
>


__________________________________________________
Do You Yahoo!?
Send instant messages & get email alerts with Yahoo! Messenger. http://im.yahoo.com/
Locked