Config File creation



deepak.hsbc
Participant
Posts: 39
Joined: Sun Apr 15, 2007 11:30 pm


Post by deepak.hsbc »

Hello all,
I have to write a configuration file for my job, which processes a large amount of data. The job contains 3 joins and 6 sort stages, which take most of the time.
The input is a sequential file. Each row has 176 columns that combine into a 1245-byte string, and there are about 40 million rows on average. :(

I pulled the following resource information using TOPAS:
Online Memory: 16384.0 MB
Online Logical CPUs: 8
Online Virtual CPUs: 4
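
As a rough sizing check from the figures above (a sketch only, assuming every row is the full 1245 bytes and ignoring any per-record overhead):

```latex
\[
4 \times 10^{7}\ \text{rows} \times 1245\ \text{bytes/row}
  = 4.98 \times 10^{10}\ \text{bytes}
  \approx 49.8\ \text{GB}
  \approx 46.4\ \text{GiB}
\]
```

That is roughly three times the 16384 MB of online memory, so the sorts and joins are unlikely to fit entirely in memory, and the scratch disk settings in the configuration file are likely to matter for this job.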

Any suggestions on how I should start writing a configuration file from the above information?
"Books are as useful to a stupid person as a mirror is useful to a Blind person."
srimitta
Premium Member
Posts: 187
Joined: Sun Apr 04, 2004 7:50 pm

Post by srimitta »

/Ascential/DataStage/Configurations/default.apt is the location of the configuration file.

To add more processing power, you need to decide on the number of nodes and supply the following information for each node:

fastname <Host Name>
resource disk <Datasets path>
resource scratchdisk <Scratch file path>
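
For illustration, here is how those three entries sit inside a node definition. This is a minimal single-node sketch, not a tuned configuration: the node name, the host name "myhost" and both paths are hypothetical placeholders, and the comments assume the usual C-style comment syntax.

```
{
  node "node1"
  {
    fastname "myhost"                                  /* hypothetical host name */
    pools ""                                           /* default node pool */
    resource disk "/data/datasets" {pools ""}          /* hypothetical datasets path */
    resource scratchdisk "/data/scratch" {pools ""}    /* hypothetical scratch path */
  }
}
```

Each additional node is another node block inside the outer braces, with its own name and, ideally, its own disk and scratch directories.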
Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives.
By William A.Foster

Post by deepak.hsbc »

Thanks for the reply, but how many nodes should I use?

Post by srimitta »

Ask your DataStage admin; he or she should be able to provide that information.

Post by deepak.hsbc »

I am the new admin and I am going to decide these things. :)
That's what I want to learn: how to start doing this!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Have you taken the IBM class DX437 (Administering DataStage) or some equivalent? This provides you with the skills you need.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by deepak.hsbc »

Yes, I have taken the DX437 class, but I am still not fully confident doing this. Making a normal config file seems easy, and I did that, but the job in question definitely needs a highly tuned config file.

Below is the config file I made and am using, but the job takes 3 hours to run with it. Do you think this is normal?
==============================================
{
  node "node01"
  {
    fastname "EtlServer01"
    pools ""
    resource disk "/IBM/MyProject/node01/resource" {pools ""}
    resource scratchdisk "/IBM/MyProject/node01/scratch" {pools ""}
    resource scratchdisk "/IBM/MyProject/node01/buffer" {pools "buffer"}
  }
  node "node02"
  {
    fastname "EtlServer01"
    pools ""
    resource disk "/IBM/MyProject/node02/resource" {pools ""}
    resource scratchdisk "/IBM/MyProject/node02/scratch" {pools ""}
    resource scratchdisk "/IBM/MyProject/node02/buffer" {pools "buffer"}
  }
  node "node03"
  {
    fastname "EtlServer01"
    pools ""
    resource disk "/IBM/MyProject/node03/resource" {pools ""}
    resource scratchdisk "/IBM/MyProject/node03/scratch" {pools ""}
    resource scratchdisk "/IBM/MyProject/node03/buffer" {pools "buffer"}
  }
  node "node04"
  {
    fastname "EtlServer01"
    pools ""
    resource disk "/IBM/MyProject/node04/resource" {pools ""}
    resource scratchdisk "/IBM/MyProject/node04/scratch" {pools ""}
    resource scratchdisk "/IBM/MyProject/node04/buffer" {pools "buffer"}
  }
  node "node05"
  {
    fastname "EtlServer01"
    pools ""
    resource disk "/IBM/MyProject/node05/resource" {pools ""}
    resource scratchdisk "/IBM/MyProject/node05/scratch" {pools ""}
    resource scratchdisk "/IBM/MyProject/node05/buffer" {pools "buffer"}
  }
  node "DB2"
  {
    fastname "Db2Server01"
    pools "db2"
    resource disk "/IBM/MyProject/node01/scratch" {pools ""}
    resource scratchdisk "/IBM/MyProject/node01/buffer" {pools ""}
  }
}

===============================================

Post by ray.wurlod »

Why only one node in the DB2 node pool? Surely this will be a bottleneck. Create extra nodes in this node pool on the Db2Server01 machine. Unless you do this, all DB2 operations will be sequential.
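
As a rough sketch of what that could look like, the fragment below adds two more nodes to the "db2" node pool on the same machine. The node names and directory paths are hypothetical and would need to exist on Db2Server01; how many extra nodes are appropriate is not stated in this thread, so two is only an example. The comment assumes the usual C-style comment syntax.

```
  /* Hypothetical extra nodes in the "db2" pool; place inside the outer braces */
  node "DB2_02"
  {
    fastname "Db2Server01"
    pools "db2"
    resource disk "/IBM/MyProject/db2node02/resource" {pools ""}
    resource scratchdisk "/IBM/MyProject/db2node02/scratch" {pools ""}
  }
  node "DB2_03"
  {
    fastname "Db2Server01"
    pools "db2"
    resource disk "/IBM/MyProject/db2node03/resource" {pools ""}
    resource scratchdisk "/IBM/MyProject/db2node03/scratch" {pools ""}
  }
```

With several nodes sharing the pool name "db2", stages constrained to that pool have more than one node to run on.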

Post by deepak.hsbc »

Thanks. This is what I was looking for; now I can start working from this point.