Hi Folks,
Here is my dilemma:
I have 200 columns in the input source but need only 100 for downstream processing. Should I read all 200 with a Data Set and drop the redundant columns using a Copy/Modify stage as the next step, or should I restrict the metadata definition on the input to read only the 100 selected columns (by altering the column definition/layout)?
I am looking at this from a code-maintainability and, most importantly, a performance point of view.
Appreciate your help.
Selective reading Metadata versus using a copy stage/Modify
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Depends on your source. If it's a sequential file (in which you must read past every byte to get to the next) then you have to read them all. In this case you can use the column property Drop On Input. Otherwise, select only those columns that you actually need from the source table.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
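The trade-off described above can be sketched outside DataStage with a small, hypothetical Python example (the file contents and column names are invented for illustration): with a flat sequential file every record must be parsed in full regardless, so dropping columns at read time versus dropping them in a later stage produces the same result, and the saving from dropping early is narrower records flowing downstream.

```python
import csv
import io

# Hypothetical flat-file source with 4 columns; only 2 are needed downstream.
raw = "id,name,notes,ts\n1,a,x,2020\n2,b,y,2021\n"
wanted = ["id", "name"]

# Approach 1: read every column, then drop the unwanted ones in a later
# step (analogous to a Copy/Modify stage after the read).
full_rows = list(csv.DictReader(io.StringIO(raw)))
dropped_later = [{k: r[k] for k in wanted} for r in full_rows]

# Approach 2: drop at read time (analogous to the Drop On Input property).
# Each record is still fully parsed -- a sequential file offers no way to
# skip past bytes -- but only the wanted columns are kept and passed on.
dropped_on_input = [
    {k: r[k] for k in wanted} for r in csv.DictReader(io.StringIO(raw))
]

# Both approaches yield identical downstream data.
assert dropped_later == dropped_on_input
```

Either way the file is read in full; what dropping early buys is smaller records on the downstream links, which matters when 100 of 200 columns are redundant.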
-
- Participant
- Posts: 5
- Joined: Fri Dec 22, 2006 2:11 am
Both approaches will work. First, you can read all 200 columns from the sequential file and then drop the 100 unwanted columns with a Copy stage.
Second, you can directly read only the 100 required columns.
Either way the read is driven by the metadata: when you load the definitions for the 100 required columns, the data is read according to the column definitions, not the order of the columns.
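The point about matching on column definition rather than column order can be illustrated with a short hypothetical sketch (the sources and column names are invented): when records carry named fields, a reader that selects by name is insensitive to the order in which the source lays the columns out.

```python
import csv
import io

# Two hypothetical sources with the same columns in different orders.
src_a = "id,name,notes\n1,a,x\n"
src_b = "notes,id,name\nx,1,a\n"
wanted = ["id", "name"]

def read_selected(text, cols):
    """Select columns by name (definition), not by position."""
    return [{c: row[c] for c in cols}
            for row in csv.DictReader(io.StringIO(text))]

# Because selection is name-based, source column order is irrelevant.
assert read_selected(src_a, wanted) == read_selected(src_b, wanted)
```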