Sequential file performance issue

synsog
Premium Member
Posts: 232
Joined: Sun Aug 01, 2010 11:01 pm
Location: Pune

Post by synsog »

We have a sequential file, FTP'd from the mainframe, that contains 67 million records.
The target is MS SQL Server 2008.

In DataStage v8.7, the job took 1 hr 50 minutes.

We recently upgraded to v11.3.1.1, and the job now takes 29 hrs.

We checked the UVCONFIG file etc.; everything is similar, and the v11.3 box has more RAM and enough memory.
We are running with a 4-node APT configuration file.

Any suggestions or help?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

How long does it take to read into a Copy stage as target? That will advise you of the read speed. Chances are that there's something else amiss with how the connection to SQL Server is being managed.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
synsog
Premium Member
Posts: 232
Joined: Sun Aug 01, 2010 11:01 pm
Location: Pune

Post by synsog »

Thanks Ray.

I tested with a Sequential File source and a Copy stage target. Even then it reads only 80 records per second, which suggests it will take 30+ hrs to process 67 million records.

What else can I check?
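
A quick sanity check on the figures quoted so far, as a minimal sketch that takes the thread's numbers at face value (nothing here is measured). It converts each reported run time into an implied average rate and converts the reported 80 rows/second back into a run time. By this arithmetic the v8.7 run averages roughly 10,152 rows/s, the 29-hour run roughly 642 rows/s, and a steady 80 rows/s would take about 233 hours, so the monitor rate and the 30+ hr estimate cannot both be exact.

```python
# Figures quoted in the thread; none of them are measured here.
TOTAL_ROWS = 67_000_000


def rows_per_second(total_rows: int, hours: float) -> float:
    """Average throughput implied by a total run time."""
    return total_rows / (hours * 3600)


def hours_at_rate(total_rows: int, rate: float) -> float:
    """Run time in hours implied by a steady rows-per-second rate."""
    return total_rows / rate / 3600


if __name__ == "__main__":
    # v8.7 run: 1 hr 50 min
    print(f"v8.7 implied rate:  {rows_per_second(TOTAL_ROWS, 1 + 50 / 60):,.0f} rows/s")
    # v11.3.1.1 run: 29 hrs
    print(f"v11.3 implied rate: {rows_per_second(TOTAL_ROWS, 29):,.0f} rows/s")
    # Copy stage test: 80 rows/s as reported
    print(f"80 rows/s implies:  {hours_at_rate(TOTAL_ROWS, 80):,.0f} hours")
```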
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Looks like a potential FTP server issue, not a DataStage issue. Much depends on your FTP configuration and any extra security layers involved.

Start from scratch. Monitor the transfer session at the server. I don't know what else to suggest without further details on that.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
synsog
Premium Member
Posts: 232
Joined: Sun Aug 01, 2010 11:01 pm
Location: Pune

Post by synsog »

Frank,

We are not having issues with FTP; the file is already on the Engine server. When we try to load it into the MS SQL Server table, it takes more than a day for 67 million records.
Today we retried on v8.7, and it processed within 2 hrs.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

I attended an IBM webinar on performance tuning. Here's the PDF. It may contain some helpful hints.

http://www-01.ibm.com/support/docview.w ... wg27046170
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

It's the table load, not the file transfer. I misunderstood that. Thanks, and good luck.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

synsog wrote: Thanks Ray.

I tested with a Sequential File source and a Copy stage target. Even then it reads only 80 records per second, which suggests it will take 30+ hrs to process 67 million records.

What else can I check?

That is very unusual. Even with one reader I can get tens of thousands of rows per second out of a Sequential File stage. Is there anything unusual about the file?

What about reading the file as a single VarChar field, and effecting the parsing in a downstream Transformer stage?
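
The same read-versus-parse split can be tried outside DataStage as a rough cross-check. The sketch below is an illustration only: it assumes the file is newline-delimited text, and the pipe-delimited layout in `split_on_pipe` is hypothetical and should be replaced with the real record format. It times one pass that only reads lines and one pass that also splits each line, so the two rates can be compared.

```python
import sys
import time


def time_pass(lines, parse=None):
    """Iterate over `lines`, optionally parsing each one. Returns (rows, seconds)."""
    rows = 0
    start = time.perf_counter()
    for line in lines:
        if parse is not None:
            parse(line)
        rows += 1
    return rows, time.perf_counter() - start


def split_on_pipe(line):
    # Hypothetical layout: pipe-delimited fields. Substitute the real format.
    return line.rstrip("\r\n").split("|")


if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]  # file to test, supplied on the command line
    for label, parse in (("read only", None), ("read + split", split_on_pipe)):
        # latin-1 maps every byte to a character, so decoding never fails.
        with open(path, "r", encoding="latin-1") as fh:
            rows, secs = time_pass(fh, parse)
        print(f"{label}: {rows:,} rows, {rows / max(secs, 1e-9):,.0f} rows/s")
```

The second pass may benefit from the operating system's file cache, so a large gap between the two passes matters more than the absolute numbers.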
rkashyap
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

ray.wurlod wrote: Chances are that there's something else amiss with how the connection to SQL Server is being managed.

As noted earlier, I suspect that the issue is in loading to SQL Server. DataStage throttles input to match the consumption rate of the output. If loading to SQL Server is slow, the upstream read operator slows down to match it. This slowdown may be what appears as a read rate of 80 rows/second.

See the section on buffering on page 27 of the Parallel Job Advanced Developer's Guide.
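
The throttling effect described above is ordinary back-pressure and can be illustrated with a bounded queue. This is a generic sketch, not DataStage's actual buffering implementation; `run_pipeline`, the buffer size, and the delays are all made up for illustration. The producer can only run as fast as the consumer drains the buffer, so a slow "target" shows up as a slow "read".

```python
import queue
import threading
import time


def run_pipeline(n_rows, consumer_delay, buffer_size=8):
    """Push n_rows through a bounded buffer; return the producer's elapsed seconds."""
    buf = queue.Queue(maxsize=buffer_size)  # bounded: put() blocks when full

    def consumer():
        while True:
            item = buf.get()
            if item is None:  # sentinel: upstream is finished
                break
            time.sleep(consumer_delay)  # stand-in for a slow target write

    worker = threading.Thread(target=consumer)
    worker.start()
    start = time.perf_counter()
    for i in range(n_rows):
        buf.put(i)  # blocks whenever the consumer has fallen behind
    producer_elapsed = time.perf_counter() - start
    buf.put(None)
    worker.join()
    return producer_elapsed


if __name__ == "__main__":
    for delay in (0.0, 0.005):
        secs = run_pipeline(200, delay)
        print(f"consumer delay {delay}s: producer ran at {200 / max(secs, 1e-9):,.0f} rows/s")
```

Note that this only explains a slow read when the downstream stage really is slow, which is why a test with a Copy stage as the target is a useful way to rule it out.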

Questions:
- Are you using the same SQL Server drivers in 8.7 and 11.3?
- Any change in topology or server location between 8.7 and 11.3?
- Can you share your job design? What stages are used?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Except, @rkashyap, the OP (@synsog) claims to see the same result when writing to a Copy stage.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Can your server or storage administrator check into the file I/O performance? Maybe the disk subsystem is screwed up.
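
Before involving the administrators, a rough first look at raw sequential read speed is possible from the Engine server itself. The following is a minimal sketch, not a proper storage benchmark: `read_throughput` is a name invented here, the file path comes from the command line, and the 1 GiB cap is an arbitrary choice to keep the test short.

```python
import sys
import time


def read_throughput(path, chunk_size=1024 * 1024, max_bytes=None):
    """Sequentially read `path` in binary chunks; return (bytes_read, seconds).

    Reading stops at end of file, or once at least max_bytes have been read.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as fh:
        while max_bytes is None or total < max_bytes:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    # Read at most the first 1 GiB of the file named on the command line.
    nbytes, secs = read_throughput(sys.argv[1], max_bytes=1024 ** 3)
    mb = nbytes / 1e6
    print(f"{mb:.1f} MB in {secs:.2f} s = {mb / max(secs, 1e-9):.1f} MB/s")
```

A repeat run over the same bytes is likely to be served from the operating system's file cache and will look much faster, so the first run is the more telling one.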
Choose a job you love, and you will never have to work a day in your life. - Confucius