Erratic stripping

mihai
Participant
Posts: 30
Joined: Thu Mar 13, 2003 5:24 am
Location: Hertfordshire

Post by mihai »

Hello all

We are experiencing a very obscure random-ish error and we can't see where the problem is. I hope someone can hazard some guesses.

We have a DataStage (4.2.1r8) job that reads some data from a sequential file, runs a routine against it, then dumps it to a hashed file to deduplicate based on the columns identified as PKs by the source system.
The next thing the job does is to read this data into two fields -- PROCESS_DATA and ORIGINAL_DATA. PROCESS_DATA is then processed by a fairly nasty DataStage routine, while ORIGINAL_DATA is just transported.
At the end of the job, the two columns are fed into a sequential file.

In other words, the job can be summarised as

SEQ_IN --> TR1 --> HASH --> TR2 --> SEQ_OUT

where TR2 contains the nasty routine. The routine has two roles: CLEANSE and LOOKUP. The CLEANSE part goes through the fields/columns and checks for valid data types, lengths, date formats, etc. The LOOKUP part performs a lookup for a defined set of columns against a dataset held in a COMMON block array (2000-ish elements, 50 characters per element on average).
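
For readers unfamiliar with the hashed-file step, the deduplication described above can be sketched roughly as follows. This is Python rather than DataStage BASIC, and the function name, the row layout, the key position and the sample rows are all made up for illustration; the only point is that writing rows into a keyed store leaves one row per key, with a later row overwriting an earlier one.

```python
def dedupe_by_key(rows, key_positions):
    """Keep one row per primary key; a later row overwrites an earlier one."""
    store = {}  # stands in for the hashed file, keyed on the PK columns
    for row in rows:
        key = tuple(row[i] for i in key_positions)
        store[key] = row  # same key: the earlier row is replaced
    return list(store.values())


# Hypothetical sample rows: (PK, name, date)
rows = [
    ("1001", "ERIC", "2003-03-13"),
    ("1002", "RENEE", "2003-03-14"),
    ("1001", "ERICA", "2003-03-15"),  # duplicate PK, replaces the first row
]
deduped = dedupe_by_key(rows, key_positions=[0])
```

Two rows survive here, one per key, and the row kept for key 1001 is the last one written.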

The problem manifests itself through the stripping of the E and R (capital e and r) characters from the PROCESS_DATA.

The erratic routine has been identified as the culprit because
1) The hash file contains well behaved data
2) Not all the data is mauled (so it's not external to DataStage)
3) Only PROCESS_DATA has the characters stripped out (and only the nasty routine manipulates that)
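
Because ORIGINAL_DATA travels alongside PROCESS_DATA, the output file can in principle be scanned to count affected rows per run. The following is a minimal sketch of such a check in Python, not something taken from the job itself: the function name and sample values are hypothetical, and the rule used (ORIGINAL_DATA contains an E or R, PROCESS_DATA contains neither) is only a heuristic, since the routine may legitimately change the data in other ways.

```python
def find_stripped_rows(pairs):
    """Return indexes of rows where ORIGINAL_DATA has E or R but PROCESS_DATA has neither."""
    suspects = []
    for index, (process_data, original_data) in enumerate(pairs):
        had_target = "E" in original_data or "R" in original_data
        has_target = "E" in process_data or "R" in process_data
        if had_target and not has_target:
            suspects.append(index)
    return suspects


# Hypothetical (PROCESS_DATA, ORIGINAL_DATA) pairs
sample = [
    ("HERTFORDSHIRE", "HERTFORDSHIRE"),  # intact
    ("HTFODSHI", "HERTFORDSHIRE"),       # every E and R is missing
    ("LONDON", "LONDON"),                # no E or R to lose
]
suspect_rows = find_stripped_rows(sample)
```

Only the second pair is flagged. Counting suspects per run would show whether a failing run mauls every row or only some of them.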

We were fairly unsuccessful in reproducing the problem.
We ran the job overnight, with a dataset of around 60,000 rows, and this erratic stripping behaviour was exhibited in 6 out of 88 runs. We then ran it 400 times with a 10 row dataset and some debugging switched on inside the nasty routine -- it didn't fail once. The machine this is running on is a Windows NT server with a decent amount of memory, but nothing spectacular.

I would like to point out that the nasty routine doesn't do any trim() stuff anywhere, and no code changes were made between runs (other than enabling the debug statements).

We were wondering if anyone else has experienced this kind of behaviour, and could anyone hazard a guess at what could be going on? As the problem isn't really reproducible, I can understand how Ascential support would have trouble tackling it, which is why I am taking up your time instead.


Thanks in advance,
Mihai

_________________________
desk direct:+441908448571
ds_developer
Premium Member
Posts: 224
Joined: Tue Sep 24, 2002 7:32 am
Location: Denver, CO USA

Post by ds_developer »

The first thing I would check is that all of the parameters sent to the PROCESS_DATA routine are assigned to variables within the routine. The variables should be worked on, not the parameters.

John
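
The habit John describes can be illustrated with a rough analogy. This is Python, not DataStage BASIC, and the function names and data are invented; it assumes the concern is that an argument is shared with the caller, so changing it inside the routine changes the caller's value as well, whereas working on a local copy does not.

```python
def cleanse_in_place(fields):
    """Works directly on the argument: the caller's list is changed too."""
    for i, value in enumerate(fields):
        fields[i] = value.upper()
    return fields


def cleanse_on_copy(fields):
    """Copies the argument first: the caller's list is left untouched."""
    work = list(fields)  # local working copy
    for i, value in enumerate(work):
        work[i] = value.upper()
    return work


caller_row = ["abc", "def"]
result_copy = cleanse_on_copy(caller_row)
unchanged_after_copy = list(caller_row)  # snapshot: still lower case

result_in_place = cleanse_in_place(caller_row)
changed_after_in_place = list(caller_row)  # snapshot: now upper case
```

Both functions return the same result; only the in-place version alters the data the caller passed in.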
mihai
Participant
Posts: 30
Joined: Thu Mar 13, 2003 5:24 am
Location: Hertfordshire

Post by mihai »

The parameters (Arg1, Arg2, Arg3) are immediately assigned to variables, with no further references being made to the Arg... 'variables'.

Unfortunately, that is not the problem.


Kind regards,
Mihai
