Performance Issue In Job

Post questions here related to DataStage Enterprise/PX Edition, in areas such as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Nageshsunkoji
Participant
Posts: 222
Joined: Tue Aug 30, 2005 2:07 am
Location: pune

Post by Nageshsunkoji »

Hi All,

I have a requirement to add a column called Analog to the job flow, and I have two options:
1) The source dataset comes from another job that already has the Analog column, so I can add the column to the input dataset and drag it through the flow up to the target stage.
2) I can bring the column into the flow by performing an inner join on the keys at the end, just before the target dataset.
I have to implement this logic in 10 jobs.

So my question is: which option performs better when the data runs to millions of rows? Carrying the column's metadata through the flow from source to target, or adding a Join stage and performing an inner join on the keys before the target stage? Please clarify.

Regards
Nagesh
NageshSunkoji

If you know anything SHARE it.............
If you Don't know anything LEARN it...............
richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi Nagesh,

Use Option 1. Changing the metadata is always better (even though it is going to take some time) than introducing a new stage.

HTH
--Rich
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

It depends on the column you add and on the number of stages and jobs the column needs to pass through from source to target.
It is always better to reduce the number of columns in the flow in order to improve performance, but that doesn't mean you should join at a later stage. A join may be worthwhile if the Analog column is very long and the number of jobs and stages it has to pass through is very high.
But I would prefer to avoid unnecessary joins (since you have millions of records) and carry the column all the way to the target.
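
As a rough way to weigh the two options, the trade-off above can be put into numbers. The sketch below is only a back-of-envelope model in Python, not anything DataStage provides: the function names, the row count, the key width, the column widths and the stage count are all made-up assumptions. It also ignores the sorting and repartitioning that a join usually needs, so it flatters the join.

```python
# Back-of-envelope model only; all names and figures are hypothetical.

def carried_bytes(rows: int, column_bytes: int, stages: int) -> int:
    """Extra bytes moved when the column is carried through every stage."""
    return rows * column_bytes * stages


def late_join_bytes(rows: int, key_bytes: int, column_bytes: int) -> int:
    """Extra bytes read from the second input of a join placed before the target.

    Ignores the sorting and repartitioning a join normally requires.
    """
    return rows * (key_bytes + column_bytes)


if __name__ == "__main__":
    rows = 5_000_000   # assumed volume ("millions of records")
    key_bytes = 8      # assumed width of the join key
    stages = 6         # assumed number of stages between source and target

    for column_bytes in (10, 2000):  # a short and a very wide Analog column
        carry = carried_bytes(rows, column_bytes, stages)
        join = late_join_bytes(rows, key_bytes, column_bytes)
        print(f"width={column_bytes}: carry={carry:,} bytes, join input={join:,} bytes")
```

With these assumed figures the byte counts alone favour the join in both cases, but for the short column the gap is only a couple of hundred megabytes over the whole run, which is small next to sorting millions of rows. For the very wide column the gap grows to tens of gigabytes, which is the situation where a join might be worth measuring.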

-Kumar
Nageshsunkoji
Participant
Posts: 222
Joined: Tue Aug 30, 2005 2:07 am
Location: pune

Post by Nageshsunkoji »

Hi Richdhan & Kumar,

Thank you for your inputs. I am going with the option of dragging the column from source to target. It is time-consuming work, but it should give better performance.

Regards,
Nagesh
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Dragging and dropping the column is a one-time effort made at the start, whereas performance needs to be considered over the whole lifetime of the project (every run).

-Kumar