File processing question
Posted: Tue Jan 13, 2009 9:55 am
by mydsworld
I have a file with a number of records and a control record (containing the record count) at the end. I would like to check that the number of records actually matches the control record count; only then will I take the file for further processing. What would be the most efficient design for matching the counts?
Thanks.
Posted: Tue Jan 13, 2009 10:06 am
by chulett
Probably a UNIX script before you ever get into a job. Tail off the control record, cut out the number, and compare it to a count of lines from the file itself, minus 1 or 2 (minus 2 if there's also a header record with column names).
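A minimal shell sketch of that pre-check; the file name and record layout here are illustrative assumptions (pipe-delimited, one header line, trailer `T|<count>` as the last line):

```shell
#!/bin/sh
# Hypothetical layout: one header line, pipe-delimited detail records,
# and a trailer "T|<count>" as the last line.
file=/tmp/input.dat

# Build a small sample file so the sketch runs end to end (3 detail records).
printf 'H|20090113\n1|rec1\n1|rec2\n1|rec3\nT|3\n' > "$file"

# Count claimed by the trailer: last line, second pipe-delimited field.
reported=$(tail -1 "$file" | cut -d'|' -f2)

# Actual detail records: total lines minus header (1) and trailer (1).
actual=$(( $(wc -l < "$file") - 2 ))

if [ "$reported" -eq "$actual" ]; then
    echo "counts match ($actual records)"
else
    echo "count mismatch: trailer says $reported, file has $actual" >&2
    exit 1
fi
```

A job sequence could run a script like this from an Execute Command activity and branch on the exit status.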
Posted: Tue Jan 13, 2009 10:21 am
by mydsworld
Any suggestion if I had to do it in DataStage?
Posted: Tue Jan 13, 2009 10:39 am
by chulett
Thought you only wanted to 'process' the file if the counts matched?
Would two jobs work, one to validate counts only and one to process it if the counts are ok?
Posted: Tue Jan 13, 2009 11:05 am
by mydsworld
Two jobs would work. Please let me know how you plan to validate the count in the first job.
Posted: Tue Jan 13, 2009 3:08 pm
by ray.wurlod
Lots of possible ways: a server job using an Aggregator, a server job using a Transformer stage writing @INROWNUM into a hashed file with a constant key, or an Execute Command activity running a wc -l command, among others.
Posted: Tue Jan 13, 2009 3:55 pm
by kandyshandy
mydsworld,
In general, files like this will have some kind of identification for the header record, detail records and footer record. Do you have something like the layout shown below?
0$filename
1$detailrecord
1$detailrecord
...
...
2$footerrecord
where a record starting with 0 is the header, 1 is a detail record and 2 is the footer. Basically, you keep a running count while reading the detail records and compare it with the count carried in the footer record. Ray has given some ideas too.
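A shell/awk sketch of that running count, assuming (purely for illustration) that the footer carries the detail count in its second '$'-delimited field:

```shell
#!/bin/sh
# Hypothetical 0/1/2 layout: '0' header, '1' detail records, and a footer
# "2$<count>" whose second field is assumed to hold the detail count.
file=/tmp/typed_input.dat
printf '0$myfile\n1$detailrecord\n1$detailrecord\n2$2\n' > "$file"

# Keep the count increasing on detail records, then compare with the footer.
result=$(awk -F'$' '
    /^1/ { details++ }          # running count of detail records
    /^2/ { footer = $2 }        # count claimed by the footer record
    END  {
        if (details == footer) print "OK: " details " detail records"
        else                   print "MISMATCH: counted " details ", footer says " footer
    }' "$file")
echo "$result"
```

The same running-count-versus-footer comparison is what a Transformer with @INROWNUM, or an Aggregator, would do inside a DataStage job.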
Just an FYI: I worked with files like this in the past, where we checked each file and loaded a table with a flag indicating whether the file met the expectations to be processed. Then we read the table and processed only the files whose entry had flag Y. This also helped us troubleshoot.