Selective reading of metadata versus using a Copy/Modify stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

samplify
Participant
Posts: 5
Joined: Mon Aug 22, 2005 12:42 am

Post by samplify »

Hi Folks,

Here is my dilemma:

I have 200 columns in the input source and need only 100 for downstream processing. Should I read all 200 with a Data Set stage and drop the redundant ones using a Copy or Modify stage as the next step, or should I restrict the metadata definition in the input dataset so that only the 100 required columns are read (by altering the column definition/layout)?

I am looking at this from a code maintainability and, most importantly, performance point of view.

Appreciate your help.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Depends on your source. If it's a sequential file (in which you must read past every byte to get to the next) then you have to read them all. In this case you can use the column property Drop On Input. Otherwise, select only those columns that you actually need from the source table.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
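
To illustrate the distinction drawn above outside of DataStage, here is a minimal Python sketch. It is not DataStage code and makes no claim about how the engine is implemented. The column names, sample rows, and the table name `src` are hypothetical. The first part shows that with a delimited sequential source every field on a line has to be scanned before the unwanted ones can be discarded; the second shows a table source, where the query itself names only the columns that are needed.

```python
import csv
import io
import sqlite3

# Hypothetical layout, for illustration only.
ALL_COLUMNS = ["id", "name", "city", "balance"]
NEEDED = ["id", "balance"]


def read_needed(stream, all_columns, needed):
    """Parse every field of a delimited stream, then keep only the needed ones."""
    keep = [all_columns.index(name) for name in needed]
    for row in csv.reader(stream):
        # The whole line has already been scanned here; dropping happens afterwards.
        yield {all_columns[i]: row[i] for i in keep}


# Sequential source: all bytes are read, unwanted fields are dropped on input.
sample = io.StringIO("1,Ann,Sydney,10.50\n2,Bob,Perth,7.25\n")
file_rows = list(read_needed(sample, ALL_COLUMNS, NEEDED))

# Table source: only the needed columns are requested in the first place.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER, name TEXT, city TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO src VALUES (?, ?, ?, ?)",
    [(1, "Ann", "Sydney", 10.50), (2, "Bob", "Perth", 7.25)],
)
table_rows = conn.execute("SELECT id, balance FROM src ORDER BY id").fetchall()
conn.close()
```

In both cases the downstream code sees only the two required columns; the difference is where the unwanted data is discarded.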
manjunathnet
Participant
Posts: 5
Joined: Fri Dec 22, 2006 2:11 am

Post by manjunathnet »

Both ways can work. First, u can read all 200 columns in the sequential file and then drop the 100 unwanted columns with a Copy stage.
Second, u can directly read only the required 100 columns.
becoze all of this is read based on the metadata: when u r loading the columns from metadata for the required 100 columns, it reads data according to the column defination, not based on the order of the columns.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

"you", not "u".
"are", not "r".
"because", not "becoze"
"definition", not "defination"