Duplicate and missing records

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

jcoley
Participant
Posts: 11
Joined: Wed May 14, 2003 3:03 am
Location: United Kingdom

Duplicate and missing records

Post by jcoley »

I have a very strange error that has been driving me crazy for a couple of weeks now. I have stripped back my job to the bare bones and it still exhibits the same behaviour.

The scenario is this:

A sequential file feeding into a transformer, followed by another transformer, feeding into a sequential file. The first transformer strips trailing blanks off the input, the second transformer does some simple transformations including a routine.

There are no constraints anywhere, so I expect the same number of records in the output file as in the input file. And in fact I do get the same number of records, except that one record about a third of the way through the file has been duplicated and the last record is missing.

The only way I can explain it is a bug in DataStage. Does anyone else have experience of this problem or a way of getting round it?
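
One way to pin down exactly which record was duplicated and which one went missing is to compare the two files outside DataStage. The following is a minimal Python sketch, not part of the original job: it assumes each record carries a key in its first comma-separated field that survives both transformers unchanged, and the function name `diff_keys` and the sample rows are made up for illustration.

```python
from collections import Counter


def diff_keys(input_lines, output_lines,
              key=lambda line: line.split(",")[0].strip()):
    """Compare how often each key occurs in the input and in the output.

    Returns (extra, missing):
      extra   - keys that occur more often in the output than in the input
      missing - keys that occur less often in the output than in the input
    The default key (first comma-separated field) is an assumption;
    pass a different callable to match the real file layout.
    """
    in_counts = Counter(key(line) for line in input_lines)
    out_counts = Counter(key(line) for line in output_lines)
    # Counter subtraction keeps only positive differences.
    extra = dict(out_counts - in_counts)
    missing = dict(in_counts - out_counts)
    return extra, missing


if __name__ == "__main__":
    # Hypothetical data mimicking the symptom: same row count,
    # one row repeated, last row gone.
    source = ["1,a   ", "2,b   ", "3,c   ", "4,d   "]
    target = ["1,A", "2,B", "2,B", "3,C"]
    extra, missing = diff_keys(source, target)
    print("duplicated in output:", extra)
    print("missing from output:", missing)
```

Comparing on a key rather than on whole lines matters here because the transformers legitimately change the rest of each record (trailing blanks are stripped, columns are transformed), so a plain line-by-line diff would flag every row.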
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Assuming there is nothing funky in the Sequential file, a couple of questions:

What does the routine do? Anything 'special' or does it use COMMON storage?

Do you have Row Buffering turned on? It can be enabled either in this particular job or at the Project level, in which case the job would be set to use 'Project Level Defaults'... or something to that effect. In any case, it's a Job Property on the 'Performance' tab, I believe.
-craig

"You can never have too many knives" -- Logan Nine Fingers
jcoley
Participant
Posts: 11
Joined: Wed May 14, 2003 3:03 am
Location: United Kingdom

Post by jcoley »

There's nothing special in the routine.

I have found that I don't get the problem if I switch off row buffering. (It was set as the project default.)

Thanks,
Jeremy.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Row Buffering is a little... dangerous. I've seen it cause false errors to be reported, as well as other odd problems like the one you just saw.

Glad you got it straightened out.
-craig

"You can never have too many knives" -- Logan Nine Fingers
iamrajy
Participant
Posts: 20
Joined: Mon Apr 26, 2004 10:38 am

Post by iamrajy »

I am also encountering a similar problem. My job generates a sequential file that contains duplicate values, so to avoid that I added an Aggregator stage. The Aggregator stage removes those duplicates but generates some additional duplicates of its own.

Please advise.
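
To check whether the extra duplicates really come from the Aggregator stage rather than from the data, it can help to de-duplicate the same input outside DataStage and compare the result with the job's output. A minimal Python sketch under stated assumptions: the helper name `dedupe` is hypothetical, rows are treated as plain strings, and trailing blanks are ignored when deciding whether two rows are the same.

```python
def dedupe(rows, key=lambda row: row.rstrip()):
    """Return rows with duplicates removed, keeping the first occurrence.

    The default key ignores trailing whitespace (an assumption); pass a
    different callable to de-duplicate on specific columns instead.
    """
    seen = set()
    unique = []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            unique.append(row)
    return unique


if __name__ == "__main__":
    # Hypothetical rows; "a" and "a  " differ only by trailing blanks.
    rows = ["a", "b", "a  ", "c", "b"]
    print(dedupe(rows))
```

If this reference result has fewer rows than the Aggregator output, the grouping keys in the stage probably differ in ways that are not visible at a glance, such as trailing blanks or mixed case.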
Joshi
Premium Member
Posts: 17
Joined: Mon Aug 18, 2003 11:59 pm
Location: Germany

Post by Joshi »

I have tested several options. We are using an SMP system. I solved the problem by activating Inter Process row buffering and putting an InterProcess stage between the two transformers. It even works without the InterProcess stage. The key is whether you are using an SMP or a uniprocessor system.