Dataset Read is slow

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

What OS are you using?
Try creating a new subdirectory under your regular resource disk path, define it in your APT configuration file as the resource disk location, and then copy the dataset into that path with the orchadmin cp command, roughly along the lines of the sketch below.
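A minimal sketch of what that might look like, assuming a single-node configuration; the node name, fastname, and paths below are placeholders, so substitute the ones from your own environment:

    {
        node "node1"
        {
            fastname "your_host"
            pools ""
            resource disk "/data/resource/ds_test" {pools ""}
            resource scratchdisk "/data/scratch" {pools ""}
        }
    }

    # Point APT_CONFIG_FILE at the file above, then copy the dataset into the new location:
    export APT_CONFIG_FILE=/path/to/new_config.apt
    orchadmin cp /old/path/mydataset.ds /data/resource/ds_test/mydataset.ds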
thompsonp
Premium Member
Posts: 205
Joined: Tue Mar 01, 2005 8:41 am

Post by thompsonp »

When you followed Ray's suggestion, how long did the job take to read the 1GB dataset?

A few things to consider in your investigation in no particular order:
Do any other jobs that read or write large datasets suffer from poor performance?
Is there any partitioning or sorting going on?
Are you able to monitor the box and then run this job (preferably with nothing else running at the same time)? Doing so should allow you or the system admin to determine whether the job is I/O bound; see the commands sketched at the end of this post.

1GB in around 8 minutes is very slow at about 2MB a second.
Can you exclude DataStage and test the performance of copying a large file around the system (again, see the sketch at the end of this post)?
Are all 4 nodes writing to the same disk?
Are these disk(s) local or is DataStage going across a network - is there a network problem (dropped packets for example - seen that before with a faulty switch)?
Are the disks and controller looking healthy?
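For the monitoring and the raw copy test, here is a rough sketch of the sort of OS-level checks meant above; the paths are placeholders and the exact command options vary between Linux and AIX/Solaris:

    # In a separate session while the job runs, watch disk utilisation:
    iostat -x 5
    vmstat 5

    # Rough throughput test outside DataStage: write and copy ~1GB on the resource disk
    time dd if=/dev/zero of=/data/resource/io_test.tmp bs=1M count=1024
    time cp /data/resource/io_test.tmp /data/resource/io_test2.tmp
    rm -f /data/resource/io_test*.tmp

If the dd or cp figures also come out at only a few MB a second, the problem sits in the storage or network layer rather than in the job design.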
rohitagarwal15
Participant
Posts: 102
Joined: Thu Sep 17, 2009 1:23 am

Post by rohitagarwal15 »

Is this problem specific to this particular dataset, or does it happen with all other datasets too?
If it happens for all datasets, then you can probably check with your storage or Unix team about the mount points on which the dataset descriptor files are being created. A quick way to see which filesystems are involved is sketched below.
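A rough sketch of how to check that; the dataset path below is a placeholder, and the df options differ slightly between platforms:

    # Which filesystem holds the descriptor file?
    df -h /path/to/mydataset.ds

    # Which filesystems hold the resource disk paths named in the APT config file?
    df -h /data/resource

    # Basic summary of the dataset itself:
    orchadmin describe /path/to/mydataset.ds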
Rohit