Dropping columns depending on a condition

Posted: Thu Oct 03, 2013 7:28 am
by gowrishankar_h
Hi,

I have a list of columns as below.

A1
A2
A3
A4
A5

I want to check a condition: if A1 is not null, send that column to the next stage, otherwise drop it. (I need to apply this check to every incoming column, and there will be more than 100 of them.)

If columns A1 and A2 contain null values, then only the following columns should be sent to the next stage. A1 and A2 should be dropped.

A3
A4
A5


Is there any stage or any other way to achieve this?

Thanks,
Gowri

Posted: Thu Oct 03, 2013 7:56 am
by priyadarshikunal
I fail to understand the requirement, specifically what you can achieve by doing this. Unless I understand what the next stage is and how it will use the data, I won't be able to provide the correct answer.

Posted: Thu Oct 03, 2013 8:36 am
by gowrishankar_h
This is my design:

DS ---> Transformer ---> MQ

The MQ table definition will be as below.

B1 -- name of a column coming from the Transformer whose value is not null (say A3)
B2 -- length of A3
B3 -- value of A3

In order to do so I need to filter out the columns that have null values.
I also need to get the column name and length at runtime. Is there any function in the Transformer that returns the incoming column's name and its length?

Posted: Thu Oct 03, 2013 9:41 am
by ArndW
In parallel jobs you can use RCP (Runtime Column Propagation) to copy columns to downstream stages without explicitly naming them. You can also selectively DROP columns. But you cannot selectively drop columns on a row-by-row basis using this approach.

From your example it might be easiest to use a Transformer stage with one output link per column and then merge all the links again in a Funnel stage. The drawback is that you need one link per column, which could be a lot of links in a wide input data stream.

Posted: Thu Oct 03, 2013 11:35 am
by arunkumarmm
Can you give more info about your source? Are all your source columns independent? From your example I guess they are. If they are all independent, why don't you convert all your columns into records and filter out the unwanted records?
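
To illustrate the idea of turning columns into records and filtering out the nulls, here is a minimal sketch in Python. It is not DataStage code: the function name `pivot_non_null` and the sample row are made up for illustration, and it assumes every non-null value is a string. Each output record carries the column name, the value length, and the value, which matches the B1/B2/B3 layout described earlier in the thread.

```python
from typing import Dict, List, Optional, Tuple


def pivot_non_null(row: Dict[str, Optional[str]]) -> List[Tuple[str, int, str]]:
    """Turn one wide row into (column name, value length, value) records.

    Columns whose value is None (null) are skipped, so they never reach
    the next step. Hypothetical helper, for illustration only.
    """
    records = []
    for name, value in row.items():
        if value is None:
            continue  # drop null columns for this row
        records.append((name, len(value), value))
    return records


# Sample row: A1 and A2 are null, so only A3..A5 produce records.
sample_row = {"A1": None, "A2": None, "A3": "abc", "A4": "de", "A5": "f"}
for record in pivot_non_null(sample_row):
    print(record)
```

Because the filtering happens per record after the pivot, it works row by row, which is the part that dropping columns cannot do.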

Posted: Fri Oct 04, 2013 9:27 am
by abhinavagarwal
Hi Gowrishankar,
For your following requirement:

"I also need to get the column name and length at runtime. Is there any function in the Transformer that returns the incoming column's name and its length?"

If you are not using RCP, this can be achieved by passing the column name as a string to the available string functions.

You would pass the column name together with the link name: function("sourcelinkname.column")

You can find the position of the dot ("."), separate the column name from the link name, and then find the length. The extracted name can also be used as is for your column name requirement.
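
The string handling described above can be sketched as follows. This is Python used purely for illustration, not Transformer syntax. The function name `split_link_column` is hypothetical, and it assumes the length in question is the length of the column name; the length of the column value would need the value itself.

```python
from typing import Tuple


def split_link_column(qualified: str) -> Tuple[str, str, int]:
    """Split "linkname.column" at the first dot.

    Returns (link name, column name, length of the column name).
    Hypothetical helper, for illustration only.
    """
    dot = qualified.find(".")  # position of the dot separator
    if dot == -1:
        raise ValueError("expected a name of the form 'link.column'")
    link = qualified[:dot]
    column = qualified[dot + 1:]
    return link, column, len(column)


link, column, name_length = split_link_column("sourcelinkname.A3")
print(link, column, name_length)
```

Note that the qualified name has to be written out as a string literal once per column, so with 100+ columns this still means one expression per column.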

Hope this helps!