Rule of Thumb on Runtime Column Propagation
Hi
I am having a hard time grasping the RCP concept. Reading the manual and going through the forum only made things worse. In a nutshell, can anyone point out, from the Best Practice Manual, when and where RCP should be turned ON or OFF? Sorry for posting this question, but as I mentioned, I could not reach a conclusion from the manual or the forum questions.
Thanks in advance
Vijay
I think RCP is best left turned OFF by default and only enabled when you explicitly need it. It is an incredibly useful and powerful feature, with all the benefits and potential drawbacks that come with such functionality.
If you explicitly work with each column in your schema, then having RCP enabled is no advantage at all.
Vijay, I am also a relative novice with RCP and have the same problem. From what I understand, RCP needs to be ON when you want a column carried across a stage without the stage doing anything with it. You might need this when you want to use the column at a downstream stage in the same job.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
That is not correct. If you include that column in your design, directly mapped to output, then RCP has no effect.
Since RCP can invalidate lineage analysis, its use should be discouraged.
RCP is, in my view, only for lazy developers.
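To picture the lineage concern, here is a rough Python sketch. It is only a toy model, not how any DataStage lineage tool is implemented, and the stage mappings and column names (cust_id, amount, amount_usd, region) are made up for illustration. The idea is that lineage is traced from design-time mappings, so a column that exists only because it was propagated at run time has nothing to be traced through.

```python
from typing import Dict, List, Optional, Set

# Design-time metadata (hypothetical): for each stage, every output column
# and the input columns it is derived from.
design_mappings: List[Dict[str, List[str]]] = [
    {"cust_id": ["cust_id"], "amount": ["amount"]},       # stage 1
    {"cust_id": ["cust_id"], "amount_usd": ["amount"]},   # stage 2
]


def trace_to_source(column: str,
                    mappings: List[Dict[str, List[str]]]) -> Optional[Set[str]]:
    """Walk the design-time mappings backwards from the last stage.

    Returns the source columns the given column derives from, or None when
    some stage has no design-time record of it (the situation a column is in
    when it only flows through at run time).
    """
    current = {column}
    for stage in reversed(mappings):
        previous: Set[str] = set()
        for name in current:
            if name not in stage:
                return None
            previous.update(stage[name])
        current = previous
    return current


mapped = trace_to_source("amount_usd", design_mappings)
propagated = trace_to_source("region", design_mappings)
print(mapped)       # {'amount'}
print(propagated)   # None
```

A column that is explicitly mapped can be followed back to its source; one that was never written into the design cannot.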
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I have to disagree with you on this one, Ray. There are some great things that PX allows you to do with RCP, particularly when developing "generic" jobs with logic & inputs/outputs which can be used with different file formats.
But for the majority of job development work RCP causes issues and should be disabled.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
NO, because I refuse to use RCP, because I believe in strict management of metadata. If "they" could figure out a way that lineage analysis could identify the source of data generated by RCP, I might be persuaded to reconsider.
You are wrong to assert that RCP is needed to map a column; this is just as easily designed in, particularly with helpers such as Auto-Match Columns and Propagate Columns available in the Designer tool.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Charter Member
- Posts: 199
- Joined: Tue Jan 18, 2005 2:50 am
- Location: India
My understanding of the subject is this. Assume I have 50 columns and 10 stages in my DS job, with stage 1 as the input file stage, stage 10 as the output file stage, and stages 2-9 as intermediate stages. Out of the 50 columns, I use only 10 for some logic; the remaining 40 are moved straight through. In such a situation I may choose not to map all 40 columns from stage 1 through stage 10, as that would cause the same data to flow from one memory location to another at every stage.
Instead, I will enable RCP, define the column names in stage 1 and stage 10, and make no mention of those columns in the intermediate stages.
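The idea described here can be pictured with a small Python sketch. This is only a toy model of a single stage, not DataStage code; the run_stage function, the add_total logic and the column names are all hypothetical. The stage declares only the columns it works on, and the rcp flag decides whether the columns it never mentions are carried through or dropped.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def run_stage(row: Row, declared: List[str],
              logic: Callable[[Row], Row], rcp: bool) -> Row:
    """Toy model of one stage that only knows its declared columns.

    With rcp=True, columns the stage never mentions are copied to the
    output unchanged; with rcp=False they are dropped.
    """
    visible = {name: row[name] for name in declared}
    out = logic(visible)
    if rcp:
        for name, value in row.items():
            out.setdefault(name, value)   # never overwrite what the logic set
    return out


def add_total(cols: Row) -> Row:
    # Hypothetical business logic that touches only two columns.
    return {**cols, "total": cols["qty"] * cols["price"]}


source_row = {"qty": 3, "price": 2.5, "customer": "ACME", "region": "EMEA"}

with_rcp = run_stage(source_row, ["qty", "price"], add_total, rcp=True)
without_rcp = run_stage(source_row, ["qty", "price"], add_total, rcp=False)

print(sorted(with_rcp))      # ['customer', 'price', 'qty', 'region', 'total']
print(sorted(without_rcp))   # ['price', 'qty', 'total']
```

In this model, customer and region reach the output only because the flag is on; the stage itself never names them.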
Shantanu Choudhary
-
- Premium Member
- Posts: 397
- Joined: Wed Apr 12, 2006 2:28 pm
- Location: Tennesse
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Charter Member
- Posts: 199
- Joined: Tue Jan 18, 2005 2:50 am
- Location: India
Yep, that's the concept of RCP: propagation.
A word of caution: if you are using the stages and logic below, I would suggest not using RCP, as you may not get the desired values in your output.
1. Join stage (depending on the type of join you are using)
2. Remove Duplicates
3. Aggregator, or aggregation logic in a Transformer stage
4. Merge
5. Pivot
This is just a list of the most common stages you are likely to be using; RCP may not work with these stages in the job.
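One way to see why stages of this kind are awkward for propagated columns is the plain logic of grouping, sketched below in Python. This says nothing about what DataStage actually emits in such a job; the sum_by_key function, the rows and the column names are made up. The point is only that once several rows collapse into one, a column that was merely carried along no longer has a single obvious value.

```python
from collections import defaultdict

# Hypothetical input: 'region' is not part of the aggregation logic.
rows = [
    {"cust_id": 1, "amount": 10, "region": "EMEA"},
    {"cust_id": 1, "amount": 5, "region": "APAC"},
    {"cust_id": 2, "amount": 7, "region": "EMEA"},
]


def sum_by_key(rows, key, measure):
    """Sum `measure` per `key` and collect the distinct values seen in every
    other column, to show which of them are ambiguous after grouping."""
    totals = defaultdict(int)
    extras = defaultdict(lambda: defaultdict(set))
    for row in rows:
        totals[row[key]] += row[measure]
        for name, value in row.items():
            if name not in (key, measure):
                extras[row[key]][name].add(value)
    leftovers = {k: {n: sorted(v) for n, v in cols.items()}
                 for k, cols in extras.items()}
    return dict(totals), leftovers


totals, leftovers = sum_by_key(rows, "cust_id", "amount")
print(totals)      # {1: 15, 2: 7}
print(leftovers)   # {1: {'region': ['APAC', 'EMEA']}, 2: {'region': ['EMEA']}}
```

For cust_id 1 there are two candidate values for region, so whatever ends up in the output for that column is a choice someone has to design, not something propagation can decide.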
Shantanu Choudhary
-
- Participant
- Posts: 7
- Joined: Thu Sep 30, 2004 5:22 am
Re: Rule of Thumb on Runtime Column Propagation
I do agree with Arnd on this one! I am a frequent user of RCP, and it works quite smoothly with the concept of generic jobs, especially when we want to build something with varying metadata. For example, say we have a job that combines or compares data between two inputs, but which metadata will be used is decided at run time. In this case RCP works really well and picks up the details from the descriptor file.
But yes, I do agree that RCP can be unpredictable, and I recommend disabling it by default and enabling it only when required.
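A generic compare job of this kind might be pictured roughly as follows. This is a Python sketch, not a DataStage job; the JSON descriptor layout, the load_rows and compare functions and the sample data are all assumptions made for illustration (a real job would use its own descriptor format). The logic never names a data column; everything comes from the descriptor supplied at run time.

```python
import csv
import io
import json

# Hypothetical descriptor, supplied at run time rather than designed in.
DESCRIPTOR = """
{"key": "id",
 "columns": [{"name": "id", "type": "int"},
             {"name": "city", "type": "string"}]}
"""


def load_rows(text, schema):
    """Parse headerless CSV text using names and types from the descriptor."""
    casts = {"int": int, "float": float, "string": str}
    columns = schema["columns"]
    rows = []
    for record in csv.reader(io.StringIO(text)):
        rows.append({col["name"]: casts[col["type"]](value)
                     for col, value in zip(columns, record)})
    return rows


def compare(left, right, key):
    """Return the keys of left rows that are missing from, or differ in, right."""
    right_by_key = {row[key]: row for row in right}
    return [row[key] for row in left if right_by_key.get(row[key]) != row]


schema = json.loads(DESCRIPTOR)
left = load_rows("1,Pune\n2,Sydney\n", schema)
right = load_rows("1,Pune\n2,Chennai\n", schema)
mismatched = compare(left, right, schema["key"])
print(mismatched)   # [2]
```

Pointing the same code at a different descriptor and different files needs no change to the logic, which is the attraction of the generic-job approach and also why nothing about the columns is visible at design time.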
-
- Participant
- Posts: 222
- Joined: Tue Aug 30, 2005 2:07 am
- Location: pune
- Contact:
Hi All,
RCP really comes into its own when you are using shared containers. I think shared containers help a lot to reduce coding effort. I used RCP extensively while developing shared containers, where it is very helpful. In other cases, I kept RCP switched off.
Regards
Nagesh
NageshSunkoji
If you know anything SHARE it.............
If you Don't know anything LEARN it...............