when to use RCP

Aggie99
Participant
Posts: 54
Joined: Thu Sep 04, 2008 6:54 pm

Post by Aggie99 »

I am trying to get a good understanding of Run Time Column Propagation (RCP), in terms of when and how to use it properly.

Under what circumstances would it be a good idea to enable RCP on the project level?

What are the downsides of enabling RCP at the project level?

When should RCP not be turned on at the project level?

Thanks.
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

We use RCP so that only the columns that have transforms are shown. That makes the job easier to view. The other columns have no changes and are just passing through. Much cleaner.
Mamu Kim
rohithmuthyala
Participant
Posts: 57
Joined: Wed Oct 21, 2009 4:46 am
Location: India

Post by rohithmuthyala »

Are there any disadvantages if we choose RCP in DataStage?
Rohith
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

RCP can be confusing at first, since you can do manipulations on columns that don't seem to exist. Along the same lines, you can get confusing warnings in joins when a column is marked as "duplicate" because it is coming from RCP.
In terms of execution and compile times there is no difference.
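
A rough Python analogy (not DataStage code) of why the "duplicate" warning can look odd: both inputs carry a column that neither design-time column list mentions, so the clash only shows up at run time. The names `join_rows`, `declared_left`, `declared_right` and the sample columns are hypothetical, and letting the left value win is an assumption made for this sketch only.

```python
def join_rows(left, right, key):
    """Join two rows on `key`; return the merged row and any clashing column names."""
    # Columns present on both sides, other than the join key itself.
    clashes = sorted((left.keys() & right.keys()) - {key})
    # Assumption for this sketch: the left value wins when names clash.
    merged = {**right, **left}
    return merged, clashes


# Columns the developer actually sees on each link at design time.
declared_left = {"id", "name"}
declared_right = {"id", "region"}

# Rows as they arrive at run time: "load_ts" was propagated, never declared.
left_row = {"id": 1, "name": "Ann", "load_ts": "2020-01-01"}
right_row = {"id": 1, "region": "EU", "load_ts": "2020-02-01"}

merged, clashes = join_rows(left_row, right_row, "id")

# Clashing columns that appear in neither declared list.
hidden = [c for c in clashes if c not in (declared_left | declared_right)]
print(hidden)
```

The point of the sketch is only that the clashing name comes from data the job design never shows, which is what makes the warning hard to trace.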
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I agree, it can be confusing.

Most people do not need it because they carry all the columns across manually. So why would they use RCP?
Mamu Kim
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

To answer the question as originally asked, IMHO I would never simply turn it on at the Project level unless you had an expert developer team all of whom were very comfortable with it.
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

The best reason I've seen for RCP is when you have a set of functions (stage types) that you want to use to process a vast set of tables, each of which has a different structure, especially in cases where you don't need "really specific" transformations on certain column names. Entire generic jobs are written that have no columns at all. These scenarios may be rare, but it is a very powerful option to have a job that works against nearly "any" table.
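
To make the "no columns at all" idea concrete, here is a minimal Python sketch of the same pattern outside DataStage: the column list is taken from the data at run time, and the logic never names a column. The function `generic_trim` and the sample tables are hypothetical, and the sketch assumes well-formed delimited input with a header row.

```python
import csv
import io


def generic_trim(source_text):
    """Trim whitespace in every field of a delimited table, whatever its columns are."""
    reader = csv.DictReader(io.StringIO(source_text))
    out = io.StringIO()
    # The output columns are whatever the input header declared at run time.
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Same generic rule for every column; missing values become empty strings.
        writer.writerow({name: (value or "").strip() for name, value in row.items()})
    return out.getvalue()


# Two tables with different structures go through the same code unchanged.
customers = "id,name\n1, Ann \n"
orders = "order_id,sku,qty\n7,AB-1,3\n"
print(generic_trim(customers))
print(generic_trim(orders))
```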

Ernie
Ernie Ostic

chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Exactly... turn it on at the job level when you really need it or can leverage it properly, case by case. All IMHO, of course.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

Hi All,
When I use a schema file to read a data file, I use RCP, since I do not want to specify the column names in the source and target.

Regards
Sreeni
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I agree with Craig. Turn it on at the job level. I think it helps with shared containers, since you only specify the columns needed.
Mamu Kim
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I've also used it in shops with very large structures (for example, a record with hundreds and hundreds of columns) where only a few columns needed to be worked on, but all columns needed to be passed along to the end result.

I only enable it on a per-job basis - it can cause a lot of problems otherwise.
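
The wide-record case above can be sketched in Python as a rough analogy (again, not DataStage code): only the columns that need work are declared, and everything else rides along untouched. The helper `apply_declared`, the 300 filler columns and the `amount` column are all hypothetical.

```python
def apply_declared(row, transforms):
    """Apply transforms to the declared columns; pass every other column through."""
    out = dict(row)  # undeclared columns are copied as-is
    for column, fn in transforms.items():
        out[column] = fn(out[column])
    return out


# A record with hundreds of columns, of which only one needs to be worked on.
wide_row = {f"col_{i}": i for i in range(300)}
wide_row["amount"] = "12.50"

result = apply_declared(wide_row, {"amount": float})
print(len(result), result["amount"])
```

Only `amount` appears in the transform list, yet all 301 columns reach the output, which is the convenience and also the source of the surprises mentioned earlier in the thread.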
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020