Change Capture Stage Problem

pkothana
Participant
Posts: 50
Joined: Tue Oct 14, 2003 6:12 am

Post by pkothana »

Hi,
I created a job with the following flow:

FileSet Stage (output columns A, B, C, D, E) -> Transformer1 (output columns A, B, C) -> Lookup (output columns A, B, C) -> Transformer2 (output columns A1, B1, D, E) -> Change Capture (output columns A1, B1, D, E) -> Sequential File (columns A1, B1, D, E)

It gave me the following fatal error on the Change Capture stage:

CcpDiffMergeMasterFiles: Error when checking operator: On input data set 0: Could not find input field "A"

Same error for fields B, C, D, E.

I don't know why it is looking for columns that have nothing to do with the Change Capture stage and that exist only much earlier upstream.

Guessing that the cause might be Runtime Column Propagation (RCP), I unchecked RCP everywhere I could.

It then gave me the same error for column C, which is an input to the Transformer2 stage but is not in its output, and where I cannot control RCP.

After that I introduced a Sort stage between Transformer2 and the Change Capture stage just to stop the runtime column propagation. It worked fine. But Sort is a costly operation, so this is not an efficient solution.

Can somebody help me understand what the actual problem is and what the solution is?

Prompt help would be very much appreciated.

Thanks & Regards
Pinkesh
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Re: Change Capture Stage Problem

Post by Teej »

pkothana wrote: CcpDiffMergeMasterFiles: Error when checking operator: On input data set 0: Could not find input field "A"

Same error for fields B, C, D, E.

Turn off Column Propagation. Click on the Job's parameter, choose the last tab (I forgot what it is called, and I don't have DataStage at home), and clear the Column Propagation checkbox.

This is a major complaint I have about Parallel Extender: Column Propagation is something that sounds nice, but it introduces far more problems than it tries to solve. "Hey, don't worry about moving columns! We'll do it for you!" Uh... name me a project that doesn't care what columns you're outputting to the output link.

<sarcasm=on>
We have clients with strict text formatting requirements, but they'll appreciate the column-propagated fields because we are too lazy to drag and drop those fields from input to output.
<sarcasm=off>

Yes, I know that there are some stages (including custom stages) that need this column propagation. I consider this a poor excuse for applying the concept universally. Why can't there be a third field, so that you have "Input", "Output" and "STAGE DEFINITION"? Anything that is changed, added, dropped, automatically defined by the stage, et cetera, would be handled within this stage definition. Output columns are the DEFINITE output, not "plus some hidden fields we won't tell you until run-time."

While having a lack of dynamic column definition is bad, I am willing to sacrifice that as long as column propagation is removed.
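
As a rough illustration of the "hidden fields" point above, here is a toy Python model. It is not DataStage code and does not claim to reproduce what the engine does internally. The function name `output_schema` and its rule (undeclared input columns ride along when propagation is on) are assumptions made purely for illustration, using the column names from the original post.

```python
def output_schema(input_cols, declared_out, rcp_enabled):
    """Toy model (assumption): columns present on a stage's output link at run time."""
    if not rcp_enabled:
        # Propagation off: only the columns declared on the output link exist.
        return list(declared_out)
    # Propagation on: input columns that were not declared ride along as extras.
    extras = [c for c in input_cols if c not in declared_out]
    return list(declared_out) + extras


# Columns from the original post: the FileSet feeds A..E into Transformer1,
# whose designed output is only A, B, C.
fileset_cols = ["A", "B", "C", "D", "E"]
declared = ["A", "B", "C"]

with_rcp = output_schema(fileset_cols, declared, rcp_enabled=True)
without_rcp = output_schema(fileset_cols, declared, rcp_enabled=False)

print(with_rcp)     # ['A', 'B', 'C', 'D', 'E']
print(without_rcp)  # ['A', 'B', 'C']
```

In this simplified picture, a downstream stage can see columns that nobody declared on the link, which is the behaviour being objected to here. Whether it is the exact cause of the CcpDiffMergeMasterFiles error is not something the sketch can confirm.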

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
pkothana
Participant
Posts: 50
Joined: Tue Oct 14, 2003 6:12 am

Post by pkothana »

Hi Teej,

Thanks a lot.

It's working now.

Regards
Pinkesh