how to check whether source file is correct or not

praveenk
Participant
Posts: 18
Joined: Sat Jan 15, 2011 11:31 am
Location: HYDERABAD

how to check whether source file is correct or not

Post by praveenk »

Hi,

I receive a source file every day named src_yyyymmdd, for example src_20110218 one day and src_20110219 the next.
Before reading the records in the file, I need to check that I am picking up the right file each day.

Can anyone help me with this?

Thanks a lot in advance,
Praveen
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Is the file consistently named as in your example? src_YYYYMMDD

If so, you could try this: in your Sequential File stage, set the Source option to File Pattern instead of File and specify the following as your file pattern:

ls /pathname/src_*|tail -n 1

File Pattern allows you to specify a shell command which returns a list of files to process. The command above should return the latest file based on the example filenames you provided.
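
For anyone who wants to confirm the date on the file before the job reads it, here is a minimal sketch of the same idea as a standalone shell check. The function names `latest_src_file` and `is_expected_file` are made up for illustration, and it assumes the files always carry a fixed-width YYYYMMDD suffix so that sorted order matches date order.

```shell
#!/bin/sh
# Hypothetical helper: print the newest src_YYYYMMDD file in a directory.
latest_src_file() {
    dir=$1
    # A fixed-width YYYYMMDD suffix sorts chronologically, so the last
    # name in sorted order carries the most recent date.
    ls "$dir"/src_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] 2>/dev/null | sort | tail -n 1
}

# Hypothetical check: succeed only if the file name carries the expected date.
is_expected_file() {
    file=$1
    expected=$2    # date as YYYYMMDD, e.g. 20110219
    [ "$(basename "$file")" = "src_$expected" ]
}

# Example use (the /pathname directory is taken from the post above):
#   latest=$(latest_src_file /pathname)
#   if is_expected_file "$latest" "$(date +%Y%m%d)"; then
#       echo "OK to run: $latest"
#   else
#       echo "Latest file is not today's: $latest" >&2
#   fi
```

This only compares names. It says nothing about whether the contents are complete or whether the file was already processed.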

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Define for us in words what "the right file" means here... what are your rules to validate the "rightness" of it?
-craig

"You can never have too many knives" -- Logan Nine Fingers
praveenk
Participant
Posts: 18
Joined: Sat Jan 15, 2011 11:31 am
Location: HYDERABAD

Post by praveenk »

chulett: "right file" means exactly the latest file.
I get a file named in the format 'src_yyyymmdd' every day, for example src_20110219, and the next day the filename will be src_20110220. I need to run the DataStage job every day; it reads the source file and does some transformation on it. So before I read the file, I need to check whether it is the latest file or not. I hope that explains my requirement.

jwiles: I'll try your code and let you know. Thanks for your answer.
Praveen
kkalyanrao@gmail.com
Participant
Posts: 11
Joined: Thu Feb 10, 2011 1:09 am
Location: Kuala Lumpur

Post by kkalyanrao@gmail.com »

Consider moving the source file to some other directory (archiving it) once it is processed by your job.
- Kalyan
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Saying the "latest" file doesn't really narrow things down. I was asking how strict you need to be in ensuring you have just received the "right" file. Sure, one simple way to make detection easier is to move processed files to another area, an archive area for example, once you're done, so you don't have to worry about processing them again accidentally.

However, what if the next day they send you the same file again? Is it ok to reprocess it or do you need to be smart enough to recognize the problem? What if they skip a day? Do you need to recognize there is a problem if you get these two files a day apart?

src_20110218
src_20110220

Those are the kind of things that I was asking you to think about and lay out for us.
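
As a rough sketch of what such rules could look like in shell: the function below compares the date on an incoming file name with the last date successfully processed and reports a repeat or a skipped day. The name `classify_src_file` is hypothetical, the caller is assumed to keep track of the last processed date somewhere, and the date arithmetic relies on GNU `date -d`, which is not available on every platform.

```shell
#!/bin/sh
# Hypothetical check: compare an incoming file's date with the last date processed.
# Prints one of: ok, duplicate, gap.
classify_src_file() {
    last=$1                  # last processed date, YYYYMMDD
    file=$2                  # incoming file name, e.g. src_20110220
    incoming=${file##*_}     # strip everything up to the last underscore
    # Assumption: GNU date is available for the "+1 day" arithmetic.
    expected=$(date -u -d "$last +1 day" +%Y%m%d)
    if [ "$incoming" -le "$last" ]; then
        echo duplicate       # same day, or an older one, sent again
    elif [ "$incoming" -eq "$expected" ]; then
        echo ok              # exactly one day after the last run
    else
        echo gap             # at least one day was skipped
    fi
}

# Example use with the two file names above:
#   classify_src_file 20110218 src_20110220
```

This only catches problems visible in the file name. A resend of the same data under a new date would need a content check such as a checksum or record count.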
-craig
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

ps. I moved this to the General forum as it really is a general topic and not specific to any of the job types.
-craig
greggknight
Premium Member
Posts: 120
Joined: Thu Oct 28, 2004 4:24 pm

Post by greggknight »

I agree with kkalyanrao.
I receive trigger files every day from an external process when it has completed its endday processing. Each file contains an endday number that I use in a WHERE clause to select my data.

I use a wait for file, and after the file arrives and I have read its contents, I move the file to an archive directory. If there is a file in the landing directory, it is the one I want to process.
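
A minimal sketch of that landing/archive pattern in shell, for illustration only. `process_and_archive` and `run_job` are hypothetical names, and `run_job` is just a stand-in for however you actually start the DataStage job. A file is moved only after its job step succeeds, so a failed run leaves it in the landing directory for the next attempt.

```shell
#!/bin/sh
# Stand-in for the real job invocation (assumption: replace with your own command).
run_job() {
    echo "processing $1"
}

# Hypothetical helper: process every src_* file in the landing directory,
# then move each one to the archive directory once its job has succeeded.
process_and_archive() {
    landing=$1
    archive=$2
    mkdir -p "$archive" || return 1
    for f in "$landing"/src_*; do
        [ -f "$f" ] || continue        # nothing matched the pattern
        run_job "$f" || return 1       # on failure, leave the file where it is
        mv "$f" "$archive"/ || return 1
    done
}

# Example use (directory names are placeholders):
#   process_and_archive /pathname/landing /pathname/archive
```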

You're just asking for issues by leaving already processed files lying around.

:o
"Don't let the bull between you and the fence"

Thanks
Gregg J Knight

"Never Never Never Quit"
Winston Churchill