File processing question

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

mydsworld
Participant
Posts: 321
Joined: Thu Sep 07, 2006 3:55 am

File processing question

Post by mydsworld »

I have a file with a number of records and a control record at the end containing the record count. I would like to check that the number of records actually matches the control record count; only then will I take the file for further processing. What would be the most efficient design for matching the counts?

Thanks.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Probably a UNIX script before you ever get into a job. Tail off the control record, cut out the number and compare it to a count of lines from the file itself, minus 1 or 2 (minus 2 if there's a header record with column names).
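A minimal sketch of that script approach. The function name and the file layout are assumptions for illustration (a pipe-delimited file whose last line is the control record, with the expected count in its first field) — adjust the `cut` delimiter and field to match your actual control record.

```shell
#!/bin/sh
# Hypothetical validate_counts helper (name and layout are assumptions):
# expects a pipe-delimited file whose LAST line is the control record,
# with the expected record count in that line's first field.
validate_counts() {
    file="$1"
    # pull the expected count out of the trailing control record
    expected=$(tail -1 "$file" | cut -d'|' -f1)
    # total lines minus the control record itself; subtract one more
    # if the file also carries a header row of column names
    actual=$(( $(wc -l < "$file") - 1 ))
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $actual records"
    else
        echo "MISMATCH: control=$expected actual=$actual" >&2
        return 1
    fi
}
```

The script's exit status can then gate the rest of the processing (e.g. only invoke `dsjob` when it returns 0).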
-craig

"You can never have too many knives" -- Logan Nine Fingers
mydsworld
Participant
Posts: 321
Joined: Thu Sep 07, 2006 3:55 am

Post by mydsworld »

Any suggestions if I had to do it in DataStage?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Thought you only wanted to 'process' the file if the counts matched? :wink: Would two jobs work, one to validate counts only and one to process it if the counts are ok?
-craig

"You can never have too many knives" -- Logan Nine Fingers
mydsworld
Participant
Posts: 321
Joined: Thu Sep 07, 2006 3:55 am

Post by mydsworld »

Two jobs would work. Please let me know how you would validate the count in the first job.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Lots of possible ways. Among them: a server job using an Aggregator stage, a server job using a Transformer stage writing @INROWNUM into a hashed file with a constant key, or an Execute Command activity running a wc -l command.
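The wc -l idea could be sketched as a small command for an Execute Command activity. The function name is mine, and it assumes the control record is the file's last line; the activity's output (the data-record count) would then be compared against the control count in a sequence trigger expression.

```shell
#!/bin/sh
# Hypothetical helper for an Execute Command activity: emit only the
# data-record count (total lines minus the trailing control record),
# assuming the control record is the last line of the file.
data_count() {
    echo $(( $(wc -l < "$1") - 1 ))
}
```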
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

mydsworld,
In general, files like this have some kind of identification for header, detail and footer records. Do you have something like what's shown below?

0$filename
1$detailrecord
1$detailrecord
...
...
2$footerrecord

where a record starting with 0 is the header, 1 is a detail record and 2 is the footer. Basically, you keep a running count while reading the detail records and compare it with the footer record's count. Ray has given some ideas too.
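That running-count-versus-footer check could be sketched in one awk pass. The function name is mine, and I'm assuming the footer's second '$'-delimited field holds the expected detail count (your footer layout may differ):

```shell
#!/bin/sh
# Hypothetical check_file helper: single pass over a '$'-delimited file
# where field 1 is the record type (0=header, 1=detail, 2=footer) and
# the footer's second field is assumed to carry the expected count.
check_file() {
    awk -F'\$' '
        $1 == "1" { detail++ }        # running count of detail records
        $1 == "2" { expected = $2 }   # footer carries the expected count
        END {
            if (detail == expected) { print "OK" }
            else { print "MISMATCH: footer=" expected " counted=" detail; exit 1 }
        }
    ' "$1"
}
```

The same compare-in-one-pass logic is what a Transformer accumulating a count (or an Aggregator) would do inside a job.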

Just as an FYI: I used files like this in the past, where we checked each file and loaded a table with a flag indicating whether the file met the expectations for processing. We then read the table and processed only the files whose entries had flag Y. This also helped us troubleshoot.
Kandy
_________________
Try and try again… You will succeed at last!!