Processing Flat files

santhoshrao.kuttuva
Participant
Posts: 13
Joined: Tue May 12, 2009 6:22 am
Location: Chennai

Post by santhoshrao.kuttuva »

Hi all,

I have a semicolon-delimited flat file that needs to be processed. Before extraction, the file name should be validated.

My file name is, for example, MASS.F012.034.200904051800. Each field of the name should be validated against a defined format.

Once the file name satisfies the naming pattern, the file should be extracted. In the file, the first record is a header record, identified by '0'.

From the second line onward I have the detail records, identified by '1'.

Please suggest how to validate the file names and how to use a file stage for this in DataStage.

Thanks in advance.
nagarjuna
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

Read each line as a single varchar column and use constraints in a Transformer to separate the header and detail records. You mentioned:

"my file name for eg.MASS.F012.034.200904051800. each field should be validated with some default format .."

Do we need to validate the filename?
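
To illustrate the idea of reading each line whole and routing it by its record type, here is a minimal Python sketch outside DataStage. It is not a Transformer configuration; the function name `split_records` and the sample rows are hypothetical, and it assumes the record type is the first semicolon-delimited field ('0' for header, '1' for detail), as described in the original post.

```python
import csv
import io

def split_records(lines, delimiter=";"):
    """Route delimited rows by their first field: '0' header, '1' detail."""
    headers, details, rejects = [], [], []
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        if row[0] == "0":
            headers.append(row)
        elif row[0] == "1":
            details.append(row)
        else:
            rejects.append(row)  # unknown record type
    return headers, details, rejects

# Hypothetical sample content; the real column layout is not given in the thread.
sample = io.StringIO("0;MASS;20090405\n1;A;100\n1;B;200\n")
headers, details, rejects = split_records(sample)
```

In a parallel job the same routing would be expressed as one Transformer output link per record type, each with a constraint on the leading field.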
Nag
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Then you'll need some sort of 'pre-processing' step to validate filenames and only pass along ones that match your mysterious criteria. How exactly does one validate your filenames? What are the business rules? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
santhoshrao.kuttuva
Participant
Posts: 13
Joined: Tue May 12, 2009 6:22 am
Location: Chennai

Post by santhoshrao.kuttuva »

Yes. The file name must first be validated against the format; only if it satisfies the format should processing proceed.

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

WHAT format ?!!!

In a job sequence you might use a Nested Condition activity with a Matches operator.

Code:

#FileName# Matches "4A'.'1A3N'.'3N'.'12N"
If you need to check that the date and time components are also valid date and time, then create a server routine and invoke from a Routine activity.
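
For readers who want to prototype the check outside DataStage, here is a rough Python equivalent of the pattern above plus the date/time check. It is only a sketch, not a server routine: it assumes `4A` means four alphabetic characters and `nN` means n digits, that the timestamp layout is YYYYMMDDHHMM, and the function name `matches_pattern` is hypothetical.

```python
import re
from datetime import datetime

# Assumed translation of 4A'.'1A3N'.'3N'.'12N into a regular expression.
PATTERN = re.compile(r"[A-Za-z]{4}\.[A-Za-z][0-9]{3}\.[0-9]{3}\.([0-9]{12})")

def matches_pattern(file_name):
    """Return True if the name fits the pattern and its last part is a real timestamp."""
    m = PATTERN.fullmatch(file_name)
    if m is None:
        return False
    try:
        # Assumed layout: YYYYMMDDHHMM
        datetime.strptime(m.group(1), "%Y%m%d%H%M")
    except ValueError:
        return False
    return True

ok = matches_pattern("MASS.F012.034.200904051800")
bad_month = matches_pattern("MASS.F012.034.200913051800")
bad_shape = matches_pattern("MASS.F012.34.200904051800")
```

The pattern match alone accepts any twelve digits, which is why the separate date/time parse is needed to reject values such as month 13.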
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
santhoshrao.kuttuva
Participant
Posts: 13
Joined: Tue May 12, 2009 6:22 am
Location: Chennai

Post by santhoshrao.kuttuva »

Thanks for your response.

The format for my file name is as follows, e.g. MASS.F0XX.YYYY.200904051800

MASS - mandatory
XX - it can be 0, 1 or 2, like "F021"
YYYY - the value will be from 0000 to 9999
200904051800 - the timestamp should be in this format
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So then, as noted, create a Server routine that takes the filename and validates all of the components are there and good, then invoke it in a Routine Activity stage or from inside a job. Or validate them all and land a list of 'validated' filenames that you then process in a loop or all at once.
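
As a sketch of the "validate them all and land a list" approach, the following Python shows the logic such a routine might implement. It is not DataStage BASIC, and it rests on assumptions: each X in F0XX is read as a digit 0, 1 or 2, YYYY as four digits, and the timestamp as YYYYMMDDHHMM. The names `is_valid_name` and `partition_names` and the candidate list are hypothetical.

```python
import re
from datetime import datetime

# Assumed reading of the stated rules: MASS . F0 + two digits in 0-2 . 4 digits . 12 digits
NAME_RE = re.compile(r"MASS\.F0[0-2]{2}\.[0-9]{4}\.([0-9]{12})")

def is_valid_name(name):
    """Check every component of the file name, including the timestamp."""
    m = NAME_RE.fullmatch(name)
    if m is None:
        return False
    try:
        datetime.strptime(m.group(1), "%Y%m%d%H%M")  # assumed YYYYMMDDHHMM
    except ValueError:
        return False
    return True

def partition_names(names):
    """Split candidate names into those to process and those to reject."""
    valid = [n for n in names if is_valid_name(n)]
    rejected = [n for n in names if not is_valid_name(n)]
    return valid, rejected

# Hypothetical candidates for illustration only.
candidates = [
    "MASS.F021.0034.200904051800",
    "MASS.F031.0034.200904051800",   # XX digit outside 0-2
    "MASS.F021.0034.200902301800",   # 30 February is not a real date
    "LESS.F021.0034.200904051800",   # wrong mandatory prefix
]
valid, rejected = partition_names(candidates)
```

The `valid` list plays the role of the landed list of validated filenames, which could then be processed in a loop or all at once.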
-craig

"You can never have too many knives" -- Logan Nine Fingers