
Duplicate and missing records

Posted: Wed Apr 28, 2004 4:35 am
by jcoley
I have a very strange error that has been driving me crazy for a couple of weeks now. I have stripped back my job to the bare bones and it still exhibits the same behaviour.

The scenario is this:

A sequential file feeding into a transformer, followed by another transformer, feeding into a sequential file. The first transformer strips trailing blanks off the input, the second transformer does some simple transformations including a routine.

There are no constraints anywhere, so I expect the same number of records in the output file as in the input file. And in fact I do get the same number of records, except that one record a third of the way through the file has been duplicated and the last record is missing.

The only way I can explain it is a bug in DataStage. Does anyone else have experience of this problem or a way of getting round it?
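To pin down exactly which record is duplicated and which one is missing, the input and output files can be compared by key outside DataStage. The following is only a minimal Python sketch, not part of the job: it assumes each record carries a unique key in its first comma-delimited field and that this key passes through both transformers unchanged. The function names, the delimiter, and the sample data are hypothetical.

```python
from collections import Counter


def record_keys(lines, delimiter=",", key_index=0):
    """Return the key field of each non-empty record (assumed layout)."""
    return [
        line.rstrip("\n").split(delimiter)[key_index]
        for line in lines
        if line.strip()
    ]


def compare_keys(input_keys, output_keys):
    """Report keys that appear too often in the output and keys that are short."""
    in_counts = Counter(input_keys)
    out_counts = Counter(output_keys)
    # Keys written more times than they were read.
    duplicated = sorted(k for k, n in out_counts.items() if n > in_counts[k])
    # Keys read more times than they were written.
    missing = sorted(k for k, n in in_counts.items() if out_counts[k] < n)
    return duplicated, missing


# Hypothetical sample data mimicking the symptom: same row count,
# one record duplicated, last record missing.
input_lines = ["1,a\n", "2,b\n", "3,c\n", "4,d\n"]
output_lines = ["1,A\n", "2,B\n", "2,B\n", "3,C\n"]

duplicated, missing = compare_keys(
    record_keys(input_lines), record_keys(output_lines)
)
print("duplicated:", duplicated)
print("missing:", missing)
```

With real files, the two lists of lines would come from opening the job's source and target sequential files. Comparing counts per key rather than total row counts matters here, because the totals match even though the content does not.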

Posted: Wed Apr 28, 2004 5:22 am
by chulett
Assuming there is nothing funky in the Sequential file, a couple of questions:

What does the routine do? Anything 'special' or does it use COMMON storage?

Do you have Row Buffering turned on? It can be turned on either in this particular job or at the Project level, in which case this job uses 'Project Level Defaults'... or something to that effect. In any case, it's a Job Property on the 'Performance' tab, I believe.

Posted: Wed Apr 28, 2004 7:31 am
by jcoley
There's nothing special in the routine.

I have found that I don't get the problem if I switch off row buffering. (It was set as the project default.)

Thanks,
Jeremy.

Posted: Wed Apr 28, 2004 7:39 am
by chulett
Row Buffering is a little... dangerous. I've seen it cause false errors to be reported, as well as other odd problems like the one you just saw.

Glad you got it straightened out.

Posted: Wed Apr 28, 2004 8:04 am
by iamrajy
I am also encountering a similar problem. My job generates a sequential file that has duplicate values, so to avoid that I added an Aggregator stage. The Aggregator stage removes those duplicates but generates some additional duplicates of its own.

Please advise.

Posted: Wed Mar 23, 2005 3:35 am
by Joshi
I have tested several options. We are using an SMP system. I solved the problem by activating Inter Process row buffering and putting an InterProcess stage between the two transformers. It even works without the InterProcess stage. The key is whether you are using an SMP or a uniprocessor system.