Hi All,
We recently bought DS 8.1, which has been installed on a Sun T4000 box with 8 dual-core CPUs and 64 GB of memory.
For each installation of DS, the admin created a one-node config file whose fastname is the same as the server name.
{
node "node1"
{
fastname "Cont1"
pools ""
resource disk "/opt/ibm/IS/Server/Datasets" {pools ""}
resource scratchdisk "/opt/ibm/IS/Server/Scratch" {pools ""}
}
}
We ran a basic job that copies data from one Oracle EE stage to another Oracle EE stage with a simple SELECT. The performance was 2600 rows/sec.
Oracle EE --> Copy --> Oracle EE
The admin claims this is among the best performance he has seen for a similar job, which we find hard to believe.
We asked him to create another config file with multiple nodes. He said it won't improve performance, because the only thing that changes in the config file is the resource disk/scratch disk.
The two-node file he created:
{
node "node1"
{
fastname "Cont1"
pools ""
resource disk "/opt/IBM/IS/Server/Datasets" {pools ""}
resource scratchdisk "/opt/IBM/IS/Server/Scratch" {pools ""}
}
node "node2"
{
fastname "Cont1"
pools ""
resource disk "/opt/IBM/IS/Server/Datasets" {pools ""}
resource scratchdisk "/opt/IBM/IS/Server/Scratch" {pools ""}
}
}
Is this correct? Everything looks the same for both nodes.
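For comparison, a file with this layout can be generated for any number of logical nodes. The following is only a minimal Python sketch, assuming every node shares one fastname and the same disk and scratch paths as in the two-node file above; the function name `render_config` is hypothetical and not part of DataStage.

```python
def render_config(fastname, disk, scratch, nodes=2):
    """Render a config file with `nodes` logical nodes on a single host."""
    lines = ["{"]
    for i in range(1, nodes + 1):
        lines += [
            f'node "node{i}"',
            "{",
            f'fastname "{fastname}"',
            'pools ""',
            # Doubled braces emit literal { } inside an f-string.
            f'resource disk "{disk}" {{pools ""}}',
            f'resource scratchdisk "{scratch}" {{pools ""}}',
            "}",
        ]
    lines.append("}")
    return "\n".join(lines)


# Reproduces the structure of the two-node file shown above.
config_text = render_config(
    "Cont1",
    "/opt/IBM/IS/Server/Datasets",
    "/opt/IBM/IS/Server/Scratch",
    nodes=2,
)
print(config_text)
```

Identical entries apart from the node name are what such a generator produces, since the nodes here are logical partitions on the same server rather than separate machines.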
Is 2600 rows/sec the limit of DS performance?
We are moving from Informatica to DataStage, and the grounds for buying DS were performance improvement, but Infa seems to do better.
Is it worth converting the job? Do we need to involve IBM here?
We would appreciate any answers.
Thanks
Sumeet
Job Performance Issue
Thanks, nagarjuna, for your reply.
The query we are running is very simple:
select col1, col2, col3, col4, col5, col6 from tablea where rownum < 5000000
I assume that with a one-node file the type of partitioning won't matter.
Using a two-node (logical) file definitely improved performance. But how can I tell how many CPUs and how much memory the job is using?
I set $APT_DUMP_SCORE, which showed 6 processes running on 2 nodes. Is there any other way to get more detailed information about the hardware used by the parallel engine?
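One rough way to see this from the operating system is to total the CPU and memory figures that `ps` reports for the engine's processes while the job runs. The sketch below is a hedged illustration, not a DataStage facility: it assumes the parallel engine processes show up with the command name `osh`, that the text comes from `ps -eo pid,pcpu,rss,args`, and the sample rows are invented for the example.

```python
def summarize_ps(ps_output, name="osh"):
    """Count processes named `name` and sum their %CPU and RSS (in KB)."""
    count, cpu, rss_kb = 0, 0.0, 0
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)  # pid, pcpu, rss, full command line
        if len(fields) < 4:
            continue
        # Compare only the executable's base name, ignoring its arguments.
        command = fields[3].split()[0].rsplit("/", 1)[-1]
        if command != name:
            continue
        count += 1
        cpu += float(fields[1])
        rss_kb += int(fields[2])
    return count, cpu, rss_kb


# Hypothetical sample of `ps -eo pid,pcpu,rss,args` output.
sample = """\
  PID %CPU   RSS COMMAND
 1001 12.5 20480 osh -f script
 1002 10.0 10240 osh -f script
 1003  0.3  2048 sshd
"""

processes, total_cpu, total_rss_kb = summarize_ps(sample)
print(processes, total_cpu, total_rss_kb)
```

To use it on a live system, the output of the `ps` command could be captured with `subprocess.run(..., capture_output=True, text=True).stdout` and passed in. On Solaris, `prstat` gives a similar per-process view interactively.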
Thanks
Sumeet
The limiting factor is probably, and curiously perhaps, the SELECT operation, which is performed sequentially. Under appropriate circumstances I have seen over 100,000 rows/second but, then, I believe this metric to be meaningless for most purposes.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.