Folder Stage, Ereplace and Field Marks (@FM)

ShaneMuir · Post by **ShaneMuir** » Thu Sep 14, 2006 8:34 pm

Hi All

Back again and totally confused as usual.

We have a job which reads from a folder stage into a transformer and out to a sequential file.

Folder ---> Transform ---> Seq File

In the transform stage there are several stage variables whose aim is to append a prefix to each row.

The variables are as follows

Code:

a)  AddPrefixToFirstRow = Prefix : "," : Field(input.data, @FM, 1)
b)  LINESEP = LINESEP
c)  AddPrefixToFM = LINESEP : Prefix : ","
d)  ReplacedFirstRow = EREPLACE(input.data, Field(input.data, @FM, 1), AddPrefixToFirstRow)
e)  Outputdata = EREPLACE(ReplacedFirstRow, @FM, AddPrefixToFM)

So what is supposed to be happening is:
1) One row per input file comes from the Folder stage into the Transformer, with each row of raw data delimited by a field mark (@FM)
2) The variables will:

a) create a new first row with the prefix
c) create a prefix that includes a row separator
d) replace the first row with the new one created above
e) replace every @FM with AddPrefixToFM

This generally worked fine until today. We have discovered that if the first row of the input file is duplicated anywhere else in the file, the duplicate row is output as Prefix:Prefix:input instead of just Prefix:input

I am pretty sure it has something to do with the interaction between the Ereplace calls in the variables ReplacedFirstRow and Outputdata.

During debugging I noticed that ReplacedFirstRow showed only the new first row, i.e. all the other data appeared to be missing?

Am I missing something? Am I misunderstanding how stage variables process data?

Any ideas on how to overcome this issue would be greatly appreciated.

Thanks in advance

ray.wurlod · Post by **ray.wurlod** » Thu Sep 14, 2006 8:44 pm

Tell/show us how Prefix is defined/derived.

Investigate the Cats() function and the Splice() function for easier mechanisms for adding a prefix to each element of a dynamic array.

Code:

Cats(Reuse(Prefix:","),input.data)

Code:

Splice(Reuse(Prefix), ",", input.data)
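For anyone who wants to see what those two expressions are expected to produce, here is a minimal Python model of them. It is only a sketch, not DataStage code: it assumes @FM is character 254, it only handles a flat field-mark-delimited array (no value marks), and the names `cats_reuse`, `splice_reuse` and the sample prefix are made up for illustration.

```python
FM = "\xfe"  # assumption: @FM is character 254, as in UniVerse-style dynamic arrays


def cats_reuse(prefix: str, dynamic_array: str) -> str:
    """Model of Cats(Reuse(prefix), dynamic_array): prefix is reused for every field."""
    return FM.join(prefix + field for field in dynamic_array.split(FM))


def splice_reuse(prefix: str, separator: str, dynamic_array: str) -> str:
    """Model of Splice(Reuse(prefix), separator, dynamic_array)."""
    return FM.join(prefix + separator + field for field in dynamic_array.split(FM))


# Hypothetical sample: three rows, the last one a duplicate of the first.
data = FM.join(["row1", "row2", "row1"])

via_cats = cats_reuse("PFX" + ",", data)
via_splice = splice_reuse("PFX", ",", data)

# The result is still @FM-delimited; converting @FM to a line separator is a separate step.
print(via_cats.replace(FM, "\n"))
```

Because each element is prefixed by position rather than by searching for its text, a duplicate of the first row is treated like any other row.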

ShaneMuir · Post by **ShaneMuir** » Thu Sep 14, 2006 8:48 pm

ray.wurlod wrote: Tell/show us how Prefix is defined/derived.

Hi Ray

Prefix = input.filename : SourceDate
where SourceDate is derived from part of the filename.

Is it strange that it only occurs where the row is the same as the first row, but for all other rows the logic works fine?

ShaneMuir · Post by **ShaneMuir** » Thu Sep 14, 2006 9:57 pm

Ok I have figured it out!

:D

It was the variable ReplacedFirstRow that was the problem. The substring argument of the Ereplace was set to the value of the first row, so when it found other instances of that value it replaced them as well. All I need to do is limit the replacement to the first instance! Simple.
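To illustrate the diagnosis, here is a rough Python model of the stage variables. It is a sketch only: it assumes @FM is character 254 and that LINESEP is a newline, and it uses Python's `str.replace`, which replaces every occurrence just as the three-argument Ereplace does. The sample rows and the prefix are made up.

```python
FM = "\xfe"      # assumption: @FM is character 254
LINESEP = "\n"   # assumption: the job's line separator is a newline


def build_output(data: str, prefix: str) -> str:
    """Mirror stage variables a), c), d) and e) with a replace-all Ereplace."""
    first_row = data.split(FM)[0]                        # Field(input.data, @FM, 1)
    add_prefix_to_first_row = prefix + "," + first_row   # a)
    add_prefix_to_fm = LINESEP + prefix + ","            # c)
    # d) replaces EVERY occurrence of the first row's text, not just the first one
    replaced_first_row = data.replace(first_row, add_prefix_to_first_row)
    # e) then prefixes every row after the first, including the already-prefixed duplicate
    return replaced_first_row.replace(FM, add_prefix_to_fm)


# Hypothetical input: the third row duplicates the first.
output = build_output(FM.join(["A1", "B2", "A1"]), "PFX")
print(output)
```

The duplicate row picks up the prefix once in step d) and again in step e). In this model the same would happen to any row that merely contains the first row's text, since the match is on a substring and not on a whole row.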

ray.wurlod · Post by **ray.wurlod** » Fri Sep 15, 2006 12:04 am

Use the five-argument variant of Ereplace() in which you can specify the starting occurrence and number of occurrences. But I still think Cats() or Splice() is a neater solution.
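As a rough illustration of the five-argument form, here is a Python model of it. This is a sketch under assumptions, not the real function: the helper `ereplace` is my own, its fourth argument is taken to be the number of occurrences to replace and its fifth the occurrence to start from, @FM is assumed to be character 254, and the empty-substring case is deliberately not modelled.

```python
FM = "\xfe"      # assumption: @FM is character 254
LINESEP = "\n"   # assumption: the job's line separator is a newline


def ereplace(expression, substring, replacement, occurrence=0, begin=1):
    """Model of Ereplace: replace `occurrence` matches starting at match number `begin`.

    occurrence <= 0 means replace all matches from `begin` onwards.
    """
    if substring == "":
        raise ValueError("empty substring is not modelled in this sketch")
    parts = expression.split(substring)
    out = [parts[0]]
    # Match number n sits between parts[n - 1] and parts[n].
    for n, part in enumerate(parts[1:], start=1):
        in_range = n >= begin and (occurrence <= 0 or n < begin + occurrence)
        out.append((replacement if in_range else substring) + part)
    return "".join(out)


def build_output(data: str, prefix: str) -> str:
    first_row = data.split(FM)[0]
    # Only the first occurrence is replaced, so later duplicates are left alone.
    replaced_first_row = ereplace(data, first_row, prefix + "," + first_row, 1, 1)
    return ereplace(replaced_first_row, FM, LINESEP + prefix + ",")


# Hypothetical input: the third row duplicates the first.
output = build_output(FM.join(["A1", "B2", "A1"]), "PFX")
print(output)
```

With both trailing arguments set to 1, only the leading copy of the first row is rewritten, and the later @FM replacement then prefixes each remaining row exactly once.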

ShaneMuir · Post by **ShaneMuir** » Fri Sep 15, 2006 12:27 am

ray.wurlod wrote: Use the five-argument variant of Ereplace() in which you can specify the starting occurrence and number of occurrences. But I still think Cats() or Splice() is a neater solution.

I have already implemented and tested the five-argument version. Thanks for the suggestion of Cats() or Splice(). I will give them a go to try to tidy up the job a little before it gets to prod.

Thanks again!