Hi
How can we drop null columns in a Transformer?
I have 250 columns in the job, and I have to drop all the columns that are null.
How can we achieve this?
Thanks!
dropping null columns in transformer
sravanthi
You cannot change the metadata dynamically, but you can achieve your requirement with a workaround.
Read all 250 columns as a single varchar column, delimiters included. Check for '' (an empty value) between two delimiters. If you find one, exclude it.
Say your delimiter is ','; then use something like if inputfield[i,2]=',,' then '' else <value>. You need to loop over every character in the field.
You can use a custom subroutine, or 250 stage variables.
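To make the idea concrete, here is a minimal sketch in Python rather than in Transformer or routine code. It assumes the whole row arrives as one delimited string and that a null shows up as an empty field; the function name `drop_empty_fields` is hypothetical and only for illustration.

```python
def drop_empty_fields(record: str, delimiter: str = ",") -> str:
    """Return the record with every empty field removed.

    An empty field is what appears between two adjacent delimiters
    (or at the very start or end of the record).
    """
    fields = record.split(delimiter)
    # Keep only the fields that actually carry a value.
    kept = [field for field in fields if field != ""]
    return delimiter.join(kept)


if __name__ == "__main__":
    print(drop_empty_fields("a,,b,,,c"))
```

Note that removing fields this way shifts the remaining values to the left, so the output no longer lines up with the original 250 column positions.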
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Sounds like a good candidate for a custom-built stage: something that could loop through the list of columns looking for nulls, with an output link and a reject link.
It is code intensive, but you can do it in a Transformer stage. Most functions will reject a row if that row has nulls in it, so you do not necessarily need to check for nulls; you just need to run a function on any field and have a reject link.
For example, you can run a trim on every string field, check for zeros, negative numbers or high values on numeric fields, validate date fields, and so on. Any time a null is encountered the row will be rejected, because parallel functions do not like nulls.
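The output link / reject link split can be pictured with a small Python sketch. This is not DataStage code: it checks for nulls explicitly instead of relying on a function failing, rows are assumed to be dictionaries with `None` standing in for null, and the name `split_rows` is hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def split_rows(rows: Iterable[Row]) -> Tuple[List[Row], List[Row]]:
    """Separate rows into (output, rejects).

    A row goes to rejects as soon as any of its fields is None,
    which mimics a row falling down the reject link.
    """
    output: List[Row] = []
    rejects: List[Row] = []
    for row in rows:
        if any(value is None for value in row.values()):
            rejects.append(row)
        else:
            output.append(row)
    return output, rejects


if __name__ == "__main__":
    sample = [{"id": 1, "name": "x"}, {"id": None, "name": "y"}]
    good, bad = split_rows(sample)
    print(len(good), len(bad))
```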
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn