RCP



kittu.raja
Premium Member
Posts: 175
Joined: Tue Oct 14, 2008 1:48 pm

RCP

Post by kittu.raja »

Hi,

I want to learn about RCP, so I am testing it.

I have 1000 columns in my source and I want to pass all 1000 through to my target, applying transformations to only a few of them. How does RCP affect my job, and how can I improve it?

What can I do? My design is:

FF--------->copy------------>Shared Container---------->copy-------->FF

Inside the shared container I am doing a simple transformation: filtering the records using a constraint (the filter is on one column).

Thanks in advance
Rajesh Kumar
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

RCP allows you to create jobs and work only with the fields that you wish to work with. You only need to specify a column in the stage before the one where you wish to use it, in order to "surface" it. The column will then be available for processing as normal in any subsequent stage that you push it through.

For example, if your input is a dataset, then you need not specify any columns, merely the dataset name. In the next Copy stage, surface the fields that are required as inputs to your shared container. Any fields that you wish to perform tasks on must be surfaced before they reach the stage that uses them.

Be wary though: lookups and column creations can add extra columns to your output that you cannot see in your job design, and these will have to be handled appropriately.
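
To make the idea concrete, here is a rough analogy in Python. It is not DataStage code, only a sketch of the behaviour: each row is a dict, a stage names only the column it works on, and everything else rides along. The names `filter_stage` and `lookup_stage` and the sample columns are made up for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]


def filter_stage(rows: Iterable[Row], column: str,
                 keep: Callable[[object], bool]) -> Iterator[Row]:
    """Apply a constraint to one surfaced column.

    Only `column` is named here; all other columns in the row are
    propagated untouched, which is the effect RCP gives you.
    """
    for row in rows:
        if keep(row[column]):
            yield row


def lookup_stage(rows: Iterable[Row], key: str,
                 reference: Dict[object, Row]) -> Iterator[Row]:
    """Merge reference columns into each row.

    The added columns never appear in the caller's column list, which
    mirrors the 'invisible extra columns' caveat above.
    """
    for row in rows:
        yield {**row, **reference.get(row[key], {})}


# Hypothetical wide source: id, status, plus col3..col1000 (1000 columns).
source = [
    {"id": 1, "status": "A", **{f"col{i}": i for i in range(3, 1001)}},
    {"id": 2, "status": "X", **{f"col{i}": i for i in range(3, 1001)}},
]

# The filter names one column, yet all 1000 arrive at the target.
filtered = list(filter_stage(source, "status", lambda v: v == "A"))

# A lookup silently widens the row to 1001 columns.
enriched = list(lookup_stage(filtered, "id", {1: {"region": "EMEA"}}))

print(len(filtered), len(filtered[0]), len(enriched[0]))
```

The last line shows the point of the warning: the `region` column is present in the output even though no stage in the "design" ever listed it.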

Hope this gives you a good starting point
battaliou
Participant
Posts: 155
Joined: Mon Feb 24, 2003 7:28 am
Location: London

Post by battaliou »

RCP will not work with flat files; you need a relational source.
3NF: Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key. So help me Codd.
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

battaliou wrote: RCP will not work with flat files, need a relational source.
For the initial read of the file, yes. However, you can turn on RCP after you have specified the format.

Or read in each record as a single column, then apply a schema to it and output with RCP.
battaliou
Participant
Posts: 155
Joined: Mon Feb 24, 2003 7:28 am
Location: London

Post by battaliou »

Yes, but what's the point of RCP if you have to define your metadata?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You can define it at runtime via schemas.
-craig

"You can never have too many knives" -- Logan Nine Fingers
SettValleyConsulting
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Post by SettValleyConsulting »

chulett wrote: You can define it at runtime via schemas.
Yes, and this is a very powerful technique. If you have a series of input files that require similar handling (e.g. basic validation, then a load to a table or dataset), you can write a generic job with RCP and specify the schema file name, the table (or dataset) and the Modify stage specs as parameters. Then you have a single job, and if the input file metadata changes you just amend the schema files, with no need for a code change or redeployment.
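
As a rough illustration of why this is useful, here is the same idea sketched in Python rather than DataStage. The one-line-per-column `name:type` schema text below is a made-up format for the example, not the DataStage schema file syntax, and `parse_schema` / `read_with_schema` are hypothetical helper names. The point is only that one generic reader serves different files when the schema is passed in as a parameter.

```python
import csv
import io

# Assumed type names for this sketch only.
CASTS = {"int": int, "float": float, "string": str}


def parse_schema(schema_text):
    """Turn 'name:type' lines into a list of (column name, cast function)."""
    columns = []
    for line in schema_text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, type_name = (part.strip() for part in line.split(":", 1))
        columns.append((name, CASTS[type_name]))
    return columns


def read_with_schema(data_text, schema_text, delimiter=","):
    """Generic 'job': the column layout is supplied at run time."""
    columns = parse_schema(schema_text)
    rows = []
    for record in csv.reader(io.StringIO(data_text), delimiter=delimiter):
        if len(record) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(record)}")
        rows.append({name: cast(value)
                     for (name, cast), value in zip(columns, record)})
    return rows


# Two different inputs handled by the same code; only the parameters change.
customers = read_with_schema("1,Ann,10.5\n2,Bob,0\n",
                             "id:int\nname:string\nbalance:float")
orders = read_with_schema("100,1\n101,2\n",
                          "order_id:int\ncustomer_id:int")

print(customers[0])
print(orders[1])
```

If the layout of a file changes, only the schema text changes; the reader itself is not touched, which is the maintenance benefit described above.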

The other main use I've found for RCP is to make shared containers as reusable as possible by propagating columns through the container transparently, without specifying them, which also works well.

Otherwise we generally switch it off, because it can hurt job maintainability and readability when there are 'invisible' columns being propagated through your design.
Phil Clarke