Job slow in reading sequential file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

sjordery
Premium Member
Posts: 202
Joined: Thu Jun 08, 2006 5:58 am

Job slow in reading sequential file

Post by sjordery »

Hi All,

I have a job which reads data from a sequential file and loads it into a SQL Server database.
The job used to run fast with fewer than 1 lakh (100,000) records.

After the source file size increased (to around 3+ lakh records), the job is running very slowly.
When I checked the Director log for this job, I saw that the source Sequential File stage runs for more than an hour to read the records.

Could anybody please suggest what I should do to make the Sequential File stage read faster?
Should I enable the "Read from multiple nodes" option in the Sequential File stage?

The job runs on 2 nodes.

Thanks in Advance,
Sjordery.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The first thing you need to do is to be sure you've named the correct culprit. Create a job that consists only of a Sequential File stage and a Copy stage, and measure your read speed with that.

You could experiment with multiple readers per node in this job too.
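
For what it's worth, you can also sanity-check the raw read throughput of the flat file entirely outside DataStage. A minimal Python sketch, assuming nothing about your environment except the file itself (the path is made up; point it at your actual source file):

import time

# Hypothetical path -- substitute the real source file used by the job.
PATH = "/data/source/input.txt"

start = time.time()
rows = 0
with open(PATH, "rb") as f:
    for _ in f:            # count newline-terminated records
        rows += 1
elapsed = time.time() - start
print(f"{rows:,} rows read in {elapsed:.1f} s ({rows / elapsed:,.0f} rows/s)")

A flat file of a few hundred thousand rows should read in seconds, not an hour, which is another hint about where the bottleneck really is.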

I strongly suspect that the problem is not in the Sequential File stage, and that earlier you were populating an empty, or nearly empty, table, which tends to be faster than a larger table. If you're doing Upserts, then we can extend the discussion more than somewhat.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sjordery
Premium Member
Posts: 202
Joined: Thu Jun 08, 2006 5:58 am

Post by sjordery »

Thanks Ray.
Actually the job already has a Copy stage between the source Sequential File stage and the target ODBC stage.
When I check the Director log it shows:

Sequential_File_0,0: Progress: 10 percent.
Sequential_File_0,0: Progress: 20 percent.

.......
.......

Sequential_File_0,0: Progress: 100 percent

It takes more than an hour to reach 100 percent while reading a source file with 3+ lakh records.
In the target ODBC stage, Write Method = Write and Write Mode = Append are specified.

OK, I will try the "Read from multiple nodes" option and report back.

Thanks,
Sjordery.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Get rid of the database stage to do your test. You will be amazed at what you see.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Exactly... I wager 400 Quatloos that it is your inserts that are the 'slow' part of this equation. You can't drive any faster than the guy in front of you. :wink:
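
To put rough numbers on that: row-by-row inserts over ODBC pay one round trip per record, and that cost dwarfs the time spent reading the flat file. Here's a pyodbc sketch for illustration only - the DSN, table and column names are invented, and this is not how the ODBC stage works internally, just a demonstration of why per-row writes dominate:

import time
import pyodbc  # assumes an ODBC DSN for the SQL Server target is configured

conn = pyodbc.connect("DSN=sqlserver_dsn;UID=user;PWD=secret")  # hypothetical DSN
cur = conn.cursor()

sql = "INSERT INTO target_table (id, val) VALUES (?, ?)"        # hypothetical table
rows = [(i, f"value_{i}") for i in range(300_000)]              # ~3 lakh dummy rows

# Row by row: one round trip per record -- this is the slow path.
start = time.time()
for row in rows[:10_000]:                # a slice is enough to see the cost
    cur.execute(sql, row)
conn.commit()
print(f"row-by-row: {time.time() - start:.1f} s for 10,000 rows")

# Batched: the driver binds arrays of parameters, so far fewer round trips.
cur.fast_executemany = True              # pyodbc option; driver support varies
start = time.time()
cur.executemany(sql, rows)
conn.commit()
print(f"executemany: {time.time() - start:.1f} s for {len(rows):,} rows")

The point is simply that the write side, not the Sequential File stage, is where the hour goes.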
-craig

"You can never have too many knives" -- Logan Nine Fingers