Processing Flat files

Posted: Tue May 12, 2009 6:40 am
by santhoshrao.kuttuva
Hi all

I have a flat file that needs to be processed. The file is delimited by semicolons, and the file name must be validated before extraction.

My file name is, for example, MASS.F012.034.200904051800. Each field should be validated against some default format.

Once the file name satisfies the naming pattern, the file should be extracted. In the file, the first record is a header record, which is identified by '0'.

From the second line onward I have the records, which are identified by '1'.

Please suggest how to validate the file names and how to use the file stage in DataStage.

Thanks in advance...

Posted: Tue May 12, 2009 7:02 am
by nagarjuna
Read it as a varchar and use constraints in a Transformer to separate out the header and trailer. You mentioned:

"My file name is, for example, MASS.F012.034.200904051800. Each field should be validated against some default format."

Do we need to validate the filename?
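
For readers who want to see the record split outside DataStage, here is a minimal sketch in Python (not DataStage BASIC). It assumes the record type is the first semicolon-delimited field, which the thread does not actually confirm, and the function name `split_records` and the sample rows are made up for illustration.

```python
def split_records(lines, delimiter=";"):
    """Separate the header record (type '0') from the records of type '1'.

    Assumption: the record type is the first delimited field on each line.
    """
    header = None
    details = []
    rejects = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split(delimiter)
        record_type = fields[0]
        if record_type == "0" and header is None:
            header = fields          # first '0' record is the header
        elif record_type == "1":
            details.append(fields)   # ordinary records
        else:
            rejects.append(line)     # anything else is set aside for review
    return header, details, rejects


# Invented sample data, only to show the shape of the result.
sample = ["0;MASS;20090405", "1;A;10", "1;B;20", "9;unexpected"]
header, details, rejects = split_records(sample)
```

In a Transformer the same idea would be two output links, each with a constraint on the record-type column.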

Posted: Tue May 12, 2009 7:45 am
by chulett
Then you'll need some sort of 'pre-processing' step to validate filenames and only pass along ones that match your mysterious criteria. How exactly does one validate your filenames? What are the business rules? :?

Posted: Tue May 12, 2009 9:35 am
by santhoshrao.kuttuva
Yes. The filename is validated against the format first; if it satisfies the format, processing should proceed.





Posted: Tue May 12, 2009 3:39 pm
by ray.wurlod
WHAT format ?!!!

In a job sequence you might use a Nested Condition activity with a Matches operator.

Code:

#FileName# Matches "4A'.'1A3N'.'3N'.'12N"
If you need to check that the date and time components are also valid date and time, then create a server routine and invoke from a Routine activity.
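
The pattern only checks that the last field is twelve digits, so a value such as 200913451800 (month 13) would still pass it. The extra date and time check could look roughly like the following. This is a Python sketch of the logic, not a DataStage server routine; the function name `is_valid_timestamp` and the YYYYMMDDHHMM layout are assumptions read off the example filename.

```python
from datetime import datetime


def is_valid_timestamp(text):
    """Return True if text is a real date and time in YYYYMMDDHHMM form.

    Assumption: the 12-digit field is year, month, day, hour, minute.
    """
    # Exactly twelve ASCII digits, matching the 12N part of the pattern.
    if len(text) != 12 or not (text.isascii() and text.isdigit()):
        return False
    try:
        # strptime rejects impossible values such as month 13 or hour 25.
        datetime.strptime(text, "%Y%m%d%H%M")
    except ValueError:
        return False
    return True
```

A server routine would do the equivalent by cutting the field into its date and time parts and testing each one.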

Posted: Thu May 14, 2009 4:16 am
by santhoshrao.kuttuva
Thanks for your response.

My format for the file name is as follows, e.g. MASS.F0XX.YYYY.200904051800:

MASS - mandatory
XX - it can be 0, 1 or 2, like "F021"
YYYY - the value will be from 0000 to 9999
200904051800 - the timestamp should be in this format

Posted: Thu May 14, 2009 5:39 am
by chulett
So then, as noted, create a Server routine that takes the filename and validates that all of the components are there and good, then invoke it in a Routine Activity stage or from inside a job. Or validate them all and land a list of 'validated' filenames that you then process in a loop or all at once.
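
As a rough illustration of the "validate them all and land a list" approach, here is a sketch in Python rather than a DataStage routine. It rests on one reading of the stated rules, which may not be the intended one: each X is one of the digits 0, 1 or 2, YYYY is exactly four digits, the timestamp is YYYYMMDDHHMM, and there is no trailing period. Under that reading the earlier example name with "034" in the third field would be rejected. The names `NAME_PATTERN`, `is_valid_name` and `partition_names` and the candidate filenames are made up for the example.

```python
import re
from datetime import datetime

# Assumed reading of the rules: MASS is fixed, each X is 0, 1 or 2,
# YYYY is four digits, and the last field is a 12-digit timestamp.
NAME_PATTERN = re.compile(r"MASS\.F0[0-2]{2}\.[0-9]{4}\.([0-9]{12})")


def is_valid_name(name):
    """Check the layout first, then check that the timestamp is a real date."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        return False
    try:
        datetime.strptime(match.group(1), "%Y%m%d%H%M")
    except ValueError:
        return False
    return True


def partition_names(names):
    """Split filenames into a validated list and a rejected list."""
    valid, rejected = [], []
    for name in names:
        (valid if is_valid_name(name) else rejected).append(name)
    return valid, rejected


# Invented candidates, one per kind of failure.
candidates = [
    "MASS.F012.0034.200904051800",
    "MASS.F039.0034.200904051800",  # XX contains digits outside 0-2
    "MASS.F012.034.200904051800",   # third field has only three digits
    "MASS.F012.0034.200913451800",  # month 13 is not a real date
]
valid, rejected = partition_names(candidates)
```

The validated list could then be written to a file and fed to the loop in the job sequence, with the rejected list kept for follow-up.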