How to process sequential file with varying record formats?

rameshrr3 · Post by **rameshrr3** » Mon Aug 19, 2013 10:22 am

Is the pattern '*' also found in the data? If not, you can define this character as the record delimiter.

truenorth · Post by **truenorth** » Mon Aug 19, 2013 11:27 am

Unfortunately, yes, the data contains *.

ray.wurlod · Post by **ray.wurlod** » Mon Aug 19, 2013 4:51 pm

Read each line as a single VarChar column, and effect your parsing in a Transformer stage. Use stage variables to track where you're up to in each logical record.
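
For readers outside DataStage, here is a rough sketch of the same idea in Python rather than in a Transformer stage. It only illustrates the "one string per line, carry state between lines" pattern. The `RECORD_START` marker and the function name are hypothetical, since the thread never shows the actual record layouts; the `position` counter stands in for a stage variable.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical marker for the first line of a logical record.
# The real file layout is not shown in this thread.
RECORD_START = "HDR"


def group_logical_records(
    lines: Iterable[str],
) -> Iterator[List[Tuple[int, str]]]:
    """Yield each logical record as a list of (position, line) pairs."""
    current: List[Tuple[int, str]] = []
    position = 0  # like a stage variable: line number within the record
    for raw in lines:
        line = raw.rstrip("\r\n")  # treat the whole line as one string
        if line.startswith(RECORD_START):
            # A new logical record begins: emit the previous one, reset state.
            if current:
                yield current
            current = []
            position = 0
        position += 1
        current.append((position, line))
    if current:
        yield current  # emit the last record at end of input
```

Each line can then be parsed according to its position within the record, so a '*' inside the data never has to be treated as a delimiter.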

arunkumarmm · Post by **arunkumarmm** » Tue Aug 20, 2013 5:08 am

truenorth wrote:Unfortunately, yes, the data contains *.

How will you, even manually, differentiate the '*' in the delimiter and the data?

We had a similar file once, but we never got '*' in the data.

Maybe you can check with your source system and confirm if the delimiter can be changed to something which will not be in the data.

truenorth · Post by **truenorth** » Tue Aug 20, 2013 6:41 am

ray.wurlod wrote:Read each line as a single VarChar column, and effect your parsing in a Transformer stage. Use stage variables to track where you're up to in each logical record. ...

Makes sense, Ray. I'll go that route.

arunkumarmm wrote:
truenorth wrote:Unfortunately, yes, the data contains *.
How will you, even manually, differentiate the '*' in the delimiter and the data?

As a matter of fact, that was exactly what I was saying. We couldn't differentiate the *s.

arunkumarmm wrote:Maybe you can check with your source system and confirm if the delimiter can be changed to something which will not be in the data.

Great idea. I'll pursue that, too.

Many, many thanks, everyone. I'll keep you posted.

truenorth · Post by **truenorth** » Fri Aug 23, 2013 6:35 am

We've determined that the tilde character (~) is not present in the data. We will replace all * in the first column of each record with a tilde. I have assigned a developer to this task. Consider this resolved.
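
A minimal Python sketch of that replacement, under one possible reading of the post: it assumes "first column" means the first character position of each line, and that the replacement character is the tilde. Neither detail is spelled out in the thread, and the function name is hypothetical.

```python
def retag_delimiter(line: str, old: str = "*", new: str = "~") -> str:
    """Swap `old` for `new` only when it is the first character of the line.

    Assumption: the delimiter '*' sits in character position 1. Any '*'
    later in the line is treated as data and left untouched.
    """
    if line.startswith(old):
        return new + line[len(old):]
    return line
```

With this, `retag_delimiter("*ABC*123")` gives `"~ABC*123"`, and lines that do not start with '*' pass through unchanged. The tilde can then serve as an unambiguous delimiter downstream.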