Dropping columns depending on a condition

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

gowrishankar_h
Participant
Posts: 42
Joined: Wed Dec 26, 2012 1:13 pm

Post by gowrishankar_h »

Hi,

I have a list of columns as below:

A1
A2
A3
A4
A5

I want to check a condition on each column: if A1 is not null, send that column to the next stage; otherwise drop it. (I need to apply this check to every incoming column, and there will be more than 100 of them.)

For example, if columns A1 and A2 contain null values, then only the following columns should be sent to the next stage, and A1 and A2 should be dropped:

A3
A4
A5

Is there any stage or other way to achieve this?

Thanks,
Gowri
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

I fail to understand the requirement: what would you achieve by doing this? Unless I understand what the next stage is and how it will use the data, I won't be able to provide the correct answer.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
gowrishankar_h
Participant
Posts: 42
Joined: Wed Dec 26, 2012 1:13 pm

Post by gowrishankar_h »

This is my design:

DS ---> Transformer ---> MQ

The MQ table definition will be as below:

B1 -- name of a column coming from the Transformer whose value is not null (say, A3)
B2 -- length of A3
B3 -- value of A3

To do this I need to filter out the columns that have null values.
I also need to get the column name and length at runtime. Is there any function in the Transformer that returns the incoming column's name and its length?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

In parallel jobs you can use RCP (Runtime Column Propagation) to copy columns to downstream stages without explicitly naming them. You can also selectively DROP columns. But you cannot selectively drop columns on a row-by-row basis using this approach.

From your example it might be easiest to use a Transformer stage with one output link per column and then merge all the links again in a Funnel stage. The drawback is that you need one link per column, which could be a lot of links in a wide input data stream.
arunkumarmm
Participant
Posts: 246
Joined: Mon Jun 30, 2008 3:22 am
Location: New York

Post by arunkumarmm »

Can you give more info about your source? Are all your source columns independent? From your example I guess they are. If they are all independent, why don't you convert all your columns to records and filter out the unwanted records?
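
The column-to-record idea can be sketched outside DataStage. The following Python snippet is only an illustration of the logic, not DataStage code: it turns each non-null column of a row into one (name, length, value) record, matching the B1/B2/B3 layout described earlier in the thread. The function name `pivot_non_null` and the sample row are hypothetical, and "length" is assumed here to mean the length of the value as a string.

```python
from typing import Dict, Iterator, Optional, Tuple

# A row maps column names to values; None stands in for a null value.
Row = Dict[str, Optional[str]]


def pivot_non_null(row: Row) -> Iterator[Tuple[str, int, str]]:
    """Yield one (column name, value length, value) record per non-null column."""
    for name, value in row.items():
        if value is not None:  # columns holding null are dropped
            yield name, len(value), value


# Hypothetical sample row: A1 and A2 are null, so only A3..A5 survive.
row = {"A1": None, "A2": None, "A3": "abc", "A4": "de", "A5": "f"}
records = list(pivot_non_null(row))
```

Because each output record carries its own column name, the downstream layout stays fixed at three fields no matter how many of the 100+ input columns are null in a given row.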
Arun
abhinavagarwal
Participant
Posts: 26
Joined: Thu Jun 19, 2008 12:39 am
Location: Atlanta

Post by abhinavagarwal »

Hi Gowrishankar,
Regarding the following requirement:

"I need to take the column name and length also in runtime.any function in transformer to take the inpu coming column name and its length."

If you are not using RCP, this can be achieved by passing the column name as a string to the available string functions.

You would pass the column name together with the link name: function("sourcelinkname.column")

You can then find the position of the dot ("."), separate the column name from the link name, and find its length. The separated name can also be used as-is for your column name requirement.
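
As a rough illustration of the string handling described above, here is a minimal Python sketch (not Transformer syntax). The helper name `split_qualified_name` and the link name `lnk_src` are hypothetical; the sketch assumes the qualified name contains a dot between link and column.

```python
from typing import Tuple


def split_qualified_name(qualified: str) -> Tuple[str, str]:
    """Split 'linkname.column' at the first dot into (link name, column name)."""
    dot = qualified.find(".")  # position of the dot, -1 if absent
    if dot == -1:
        raise ValueError("expected a name of the form 'link.column'")
    return qualified[:dot], qualified[dot + 1:]


# Hypothetical link and column names.
link, column = split_qualified_name("lnk_src.A3")
name_length = len(column)  # length of the column name, not of its value
```

Note that this yields the length of the column name; the length of the column's value would have to be taken from the value itself.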

Hope this will help!
- Thanks and Regards,
Abhinav Agarwal