
File processing question

Posted: Tue Jan 13, 2009 9:55 am
by mydsworld
I have a file with a number of records and a control record at the end containing the record count. I would like to check that the number of records actually matches the control record count; only then will I take the file for further processing. What would be the most efficient design for matching the counts?

Thanks.

Posted: Tue Jan 13, 2009 10:06 am
by chulett
Probably a UNIX script before you ever get into a job. Tail off the control record, cut out the number and compare it to a count of lines from the file itself, minus 1 for the control record (minus 2 if there's also a header record with column names).
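Something along these lines would do it, assuming the control record is the last line and the count sits in its first pipe-delimited field - adjust the delimiter, field position and header allowance for your actual layout:

#!/bin/sh
# Hypothetical helper: validate_counts.sh <datafile>
# Assumes the control record is the last line, with the record count
# in its first pipe-delimited field, and no header record (use -2 if
# the file also has a header line).
FILE=$1
EXPECTED=$(tail -1 "$FILE" | cut -d'|' -f1)
ACTUAL=$(( $(wc -l < "$FILE") - 1 ))
if [ "$EXPECTED" -eq "$ACTUAL" ]; then
  exit 0                                   # counts match, OK to process
else
  echo "Count mismatch: control=$EXPECTED actual=$ACTUAL" >&2
  exit 1                                   # reject the file
fi

You could call something like that from a before-job subroutine or an Execute Command activity and only carry on when it returns 0.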

Posted: Tue Jan 13, 2009 10:21 am
by mydsworld
Any suggestions if I had to do it in DataStage?

Posted: Tue Jan 13, 2009 10:39 am
by chulett
Thought you only wanted to 'process' the file if the counts matched? :wink: Would two jobs work, one to validate counts only and one to process it if the counts are ok?

Posted: Tue Jan 13, 2009 11:05 am
by mydsworld
Two jobs would work. Please let me know how you would validate the count in the first job.

Posted: Tue Jan 13, 2009 3:08 pm
by ray.wurlod
Lots of possible ways. Among them: a server job using an Aggregator, a server job using a Transformer stage reporting @INROWNUM into a hashed file with a constant key, or an Execute Command activity executing a wc -l command.

Posted: Tue Jan 13, 2009 3:55 pm
by kandyshandy
mydsworld,
In general, these kinds of files have some way of identifying the header record, detail records and footer record. Do you have something like the layout shown below?

0$filename
1$detailrecord
1$detailrecord
...
...
2$footerrecord

where a record starting with 0 is the header, 1 is a detail record and 2 is the footer. Basically, you keep a running count while reading the detail records and compare it with the footer record's count. Ray has given some ideas too.
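A rough sketch of that comparison as a shell command, assuming the $-delimited layout above with the expected count in the footer's second field (that field position is only a guess - adjust it to your actual footer format):

awk -F'$' '
  $1 == "1" { details++ }                   # running count of detail records
  $1 == "2" { footer = $2 + 0; seen = 1 }   # footer carries the expected count
  END { exit (seen && details == footer ? 0 : 1) }   # 0 = counts match
' yourfile

The exit status can then drive the same two-job / validate-then-process approach discussed above.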

Just as an FYI: I used this kind of file in the past, where we checked the files and loaded a table with a flag indicating whether each file met the expectations to be processed. Then we read the table and processed only the files whose entry had the flag set to Y. This also helped us troubleshoot.