External source stage very slow

chetan.c
Participant
Posts: 112
Joined: Tue Jan 17, 2012 2:09 am
Location: Bangalore

Post by chetan.c »

Hi,

I have an External Source stage with the command below.

tar -xOvf /home/LLS2TEALFA01_CSS997.tar
This tarball contains about 19,000 files and around 100,000 records.
The job takes around 30 minutes to finish.
The stage performs very slowly. I'm just testing the load with a job that has only two stages: External Source ---> Data Set.

Can somebody explain why this is so slow, how the External Source stage actually interprets the command, and whether it waits until all rows are read before passing them to the next stage?

Thanks,
Chetan.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

How long does the tar command take by itself?

How long does it take to load a Data Set with a comparable number of generated rows?
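
One way to answer the first question is to run the extraction outside DataStage and measure it. The following is only a minimal sketch, not the poster's actual test: it builds a tiny stand-in archive (the `sample.tar`, `f1.txt` and `f2.txt` names are hypothetical) so it can be run anywhere, then times `tar -xOf` and counts the records written to stdout. The `-v` flag from the original command is left out here on the assumption that the file-name listing is not wanted in the row data.

```shell
#!/bin/sh
# Build a small stand-in archive; these file names are hypothetical.
# For a real measurement, point the tar -xOf line at the actual archive.
workdir=$(mktemp -d)
printf 'a,1\nb,2\n' > "$workdir/f1.txt"
printf 'c,3\n' > "$workdir/f2.txt"
tar -cf "$workdir/sample.tar" -C "$workdir" f1.txt f2.txt

# Time the extraction by itself.
# -x extract, -O write file contents to stdout, -f archive to read.
start=$(date +%s)
rows=$(tar -xOf "$workdir/sample.tar" | wc -l | tr -d ' ')
end=$(date +%s)
elapsed=$((end - start))

echo "rows=$rows elapsed_seconds=$elapsed"

# Clean up the temporary files.
rm -rf "$workdir"
```

If the standalone time is close to the job's run time, the bottleneck is likely the tar extraction itself rather than the stage or the Data Set write.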
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chetan.c
Participant
Posts: 112
Joined: Tue Jan 17, 2012 2:09 am
Location: Bangalore

Post by chetan.c »

The tar command takes 35 minutes at the command prompt.
But when run in the job, the job does not complete at all.

Thanks,
Chetan
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

It's not running your tar command in parallel on multiple nodes at once, is it?
Choose a job you love, and you will never have to work a day in your life. - Confucius
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Also, what exactly are you wishing to pump into your dataset?