Extracting data from a File in the Mainframe system

vnspn · Post by **vnspn** » Thu Mar 01, 2007 3:23 pm

Hi,

We are working with DataStage 7.1 Server jobs, and the server OS is Windows.

We have a scenario where we need to read a huge file from the Mainframe system and then process it.

- We cannot use a CFF stage to read this Mainframe file because it is on a remote Mainframe machine. So we have to use an FTP stage to fetch it from the remote machine.

- But what metadata do we need to use in the FTP stage? We have the COBOL copybook for the Mainframe file that we need to fetch, but the metadata imported from the copybook has a 'Level number' attribute for each column, so it cannot be used in an FTP stage. It can be used only in a CFF stage.

- The file that we need to process is very large (about 40 million records), so we did not want to transfer it to the DS server and then do the processing.

Please give us your thoughts on how we could proceed in this scenario.

Those who have worked on reading files directly from a Mainframe system, please share your experience of how you achieved this.

Thank you.

kumar_s · Post by **kumar_s** » Fri Mar 02, 2007 3:42 am

A USS deployment on the Mainframe server will improve performance.
Or use ConnectDirect, as mentioned in this post.

ray.wurlod · Post by **ray.wurlod** » Fri Mar 02, 2007 5:03 am

No matter what you do, you're going to need to move the file from the mainframe to your DataStage server. Why not have the mainframe push the file across and then read it with a Complex Flat File stage? That stage is intended for use with COBOL file definitions and handles level numbers according to your requirements (for example, flatten or normalize).

Have the mainframe push a second, empty file after the first. Wait for the second file to appear. You will know, by this fact, that the first has fully arrived.
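The "second, empty file" handshake described above can be sketched outside DataStage as a simple polling loop. This is only a minimal illustration, not a DataStage feature: the function name `wait_for_marker`, the marker file name, and the timeout and polling values are all hypothetical.

```python
import os
import time


def wait_for_marker(marker_path, timeout_s=3600.0, poll_s=5.0):
    """Return True once marker_path exists, or False if timeout_s elapses first.

    The marker is the empty file pushed *after* the data file, so its
    presence implies that the data file has been fully transferred.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(marker_path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


if __name__ == "__main__":
    # Hypothetical landing path; adjust to wherever the mainframe pushes files.
    if wait_for_marker("extract.done", timeout_s=10.0, poll_s=1.0):
        print("Data file has fully arrived; safe to start the job.")
    else:
        print("Timed out waiting for the marker file.")
```

The design point is that the job never inspects the data file itself to decide whether the transfer is complete; it only checks for the marker, which is sent last.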

vnspn · Post by **vnspn** » Fri Mar 02, 2007 9:23 am

Ray,

I do not actually require all those millions of records for my processing. I only need some of those records based on a condition.

So I thought that if there were some way to extract the records directly from the Mainframe system, I could keep only the ones I need by applying a constraint. That way, I could minimize the record count by the time the data first lands on disk on the DS server.
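As a rough illustration of the idea (filter while the bytes arrive, so that only matching records reach the disk), here is a sketch in plain Python rather than DataStage. It assumes fixed-length records with no delimiters, transferred in binary mode, and text fields encoded in EBCDIC code page 037. The record layout, the two-byte region code, and the names `filter_fixed_records` and `region_is` are hypothetical. Note that the whole file still crosses the network; only the write to disk is reduced.

```python
import io


def filter_fixed_records(stream, record_length, predicate):
    """Yield each fixed-length record for which predicate(record) is true.

    `stream` must be a buffered binary stream whose read(n) returns exactly
    n bytes until end of data (for example io.BytesIO or an opened file).
    """
    while True:
        record = stream.read(record_length)
        if not record:
            return
        if len(record) != record_length:
            raise ValueError("trailing partial record of %d bytes" % len(record))
        if predicate(record):
            yield record


def region_is(code, start=0, length=2):
    """Build a predicate matching records whose EBCDIC field equals `code`."""
    wanted = code.encode("cp037")
    return lambda record: record[start:start + length] == wanted


if __name__ == "__main__":
    # Two made-up 10-byte records: a 2-byte region code plus an 8-byte name.
    sample = "NYALICE   ".encode("cp037") + "CABOB     ".encode("cp037")
    for rec in filter_fixed_records(io.BytesIO(sample), 10, region_is("NY")):
        print(rec.decode("cp037"))
```

Comparing the raw EBCDIC bytes avoids decoding every record. Packed-decimal (COMP-3) or binary fields from the copybook would need their own decoding, which this sketch does not attempt.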

DSguru2B · Post by **DSguru2B** » Fri Mar 02, 2007 9:29 am

Give that condition to the mainframe folks. They can apply that condition while doing the extract.

vnspn · Post by **vnspn** » Fri Mar 02, 2007 9:47 am

DSguru2B,

The condition I mean here is not a simple one; it is a set of complex conditions that also involve a join with another table. This condition also has to be implemented in DS as part of my application, so I cannot ask the Mainframe folks to do it. That's the scenario.

DSguru2B · Post by **DSguru2B** » Fri Mar 02, 2007 9:49 am

Well, in that case, as others have noted, you need to get the file onto the DataStage server first. Then read it with the CFF stage and do your manipulations.

ray.wurlod · Post by **ray.wurlod** » Fri Mar 02, 2007 3:12 pm

You're working for someone who expects you to process mainframe volumes on a Windows server?!! Have they purchased the mainframe edition of DataStage, so that at least you can have DataStage generate code that can run on the mainframe? (Didn't think so.)

There is no mechanism whatsoever that means server jobs can initiate processing on a mainframe. Sure you can initiate an FTP transfer, but that's it. You transfer whatever's there.

kumar_s · Post by **kumar_s** » Fri Mar 02, 2007 8:32 pm

You need to compromise on one or the other. The FTP stage is not tuned to read mainframe files. At least as far as I have seen, it doesn't support all the functionality available in the CFF stage. So either you transfer the whole file to the DataStage server, read it using CFF and apply your rules, or you ask for USS, with which you can read and process the files on the remote server and transfer only the required data to the DataStage server.