Reading using FTP Stage - Need to improve performance

vnspn
Participant
Posts: 165
Joined: Mon Feb 12, 2007 11:42 am

Post by vnspn »

Hi,

We are using DataStage Server Edition, version 7.1. Here is what the job does.

The job has just three stages: an FTP stage, a Transformer and a Sequential File stage. The FTP stage reads a file from a remote mainframe server, the Transformer uses DS transforms to convert the data from EBCDIC to ASCII, and the result is written to a sequential file.

The number of records to be extracted from the mainframe file is huge (around 40 million). I test-ran the job with 0.5 million records and it took 1 hour to complete. At this speed we cannot process the 40 million records. We think the major cause of the slow run is the FTP stage, which runs at only 120 rows/sec. :(
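
To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. The function name is made up for illustration, and it assumes the throughput stays constant for the whole run, which may not hold in practice.

```python
def hours_to_process(total_rows, rows_per_second):
    """Estimated run time in hours at a constant row rate (an assumption)."""
    return total_rows / rows_per_second / 3600


# Rate reported by the FTP stage.
print(round(hours_to_process(40_000_000, 120), 1))      # 92.6

# Rate implied by the test run: 0.5 million rows in one hour.
test_rate = 500_000 / 3600
print(round(hours_to_process(40_000_000, test_rate), 1))  # 80.0
```

Either way the full extract would take somewhere between three and four days at the current speed.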

Please let me know if there is any way to improve the performance of this scenario, such as changing settings. Also let us know if a better alternative method would be preferable for this task.

Thanks.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I doubt it. I suspect the EBCDIC to ASCII conversion is the bottleneck. To test, FTP the file without any transformations, just a straight load to a flat file.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ugh. Or just use a tool designed to move Mainframe/UNIX files at high speed and do the EBCDIC to ASCII conversion on the fly during the transfer.
Last edited by chulett on Tue Mar 13, 2007 4:43 pm, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

Write an FTP script to move the file onto your DataStage server first.
Now that you have the file locally, do your transformation from EBCDIC to ASCII.

The other factor affecting performance could be network delays.
Try to identify where the bottleneck is: the FTP stage, the transformation, network traffic...
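
For the local conversion step, a minimal Python sketch could look like the following. It is only an illustration, not how the Transformer does it. It assumes the file was transferred in binary mode, that every record has the same fixed length, that all fields are plain character data (packed decimal or binary fields would be corrupted by a straight character conversion), and that the mainframe code page is 037. The function name and parameters are hypothetical.

```python
def convert_fixed_records(src_path, dst_path, record_length, codec="cp037"):
    """Convert fixed-length EBCDIC records to ASCII lines; return the record count.

    Assumes character-only records and code page 037 (both are assumptions).
    """
    count = 0
    with open(src_path, "rb") as src, \
         open(dst_path, "w", encoding="ascii", errors="replace", newline="\n") as dst:
        while True:
            record = src.read(record_length)
            if not record:
                break  # end of file
            if len(record) != record_length:
                raise ValueError("trailing partial record; check the record length")
            # Decode one EBCDIC record and terminate it with a newline.
            dst.write(record.decode(codec) + "\n")
            count += 1
    return count
```

Because it reads one record at a time, memory use stays flat regardless of file size. Characters with no ASCII equivalent are written as "?".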
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Don't you have the CFF stage plug-in? If so, as noted, FTP the file using a script and use CFF to extract the data as well as convert it into ASCII.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

PUSH the file (in binary mode) from the mainframe, rather than pulling it from the DataStage server. Use the CFF stage to read it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vnspn
Participant
Posts: 165
Joined: Mon Feb 12, 2007 11:42 am

Post by vnspn »

DSguru2B,

As you suggested, I split this process into two jobs: the first one only FTPs the file, and the second converts the data from EBCDIC to ASCII. I found that the bottleneck is the transfer through the FTP stage.

The first job, which uses the FTP stage just to transfer the data, runs for the same 1 hour (as mentioned in the original post). But the second job, transforming EBCDIC to ASCII for the same number of records, ran for only 6 minutes.

So it's the FTP stage that is slowing the processing. Does anyone have any idea how to improve the performance of the FTP stage?

The reason I didn't want to move the file completely is that the full file is going to be about 300 GB. That's why I wanted to process it directly by reading it from the remote server instead of moving such a big file (which would consume a huge amount of disk space).

Please point me to a sample FTP batch script to fetch a file from the mainframe to Windows.
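
As a starting point, a binary-mode download could be sketched with Python's standard ftplib module, as below. This is an untested illustration rather than a recommended production script. The helper name, host, credentials, dataset name and local path are all placeholders, and quoting the dataset name in single quotes is an assumption about how the mainframe FTP server expects fully qualified names.

```python
from ftplib import FTP


def fetch_binary(ftp, remote_name, local_path, blocksize=1 << 16):
    """Download remote_name in binary (image) mode; return the bytes written."""
    total = 0
    with open(local_path, "wb") as out:
        def write_chunk(chunk):
            nonlocal total
            out.write(chunk)
            total += len(chunk)
        # retrbinary switches to binary mode, so no EBCDIC/ASCII translation
        # happens during the transfer.
        ftp.retrbinary(f"RETR {remote_name}", write_chunk, blocksize=blocksize)
    return total


# Example usage (all values are placeholders):
# ftp = FTP("mainframe.example.com")
# ftp.login("user", "password")
# fetch_binary(ftp, "'PROD.EXTRACT.DATA'", r"D:\landing\extract.ebc")
# ftp.quit()
```

Note that this still lands the full file on local disk, so it does not address the 300 GB space concern by itself.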

Thanks.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Search for FTP scripts on Google or, as Craig mentioned, use whatever tool the mainframe guys use to transfer files; I think it's NDM, not sure. Get them to push the file wherever you want. Connect:Direct (the product formerly known as NDM) is used for fast and secure file transfers.
Also, while doing the FTP, monitor the network to find out how heavily loaded it is. Clearly the bottleneck is the FTP and not the transformation itself.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And as an added bonus, those file transfer utilities can typically do the EBCDIC to ASCII conversion on the fly so you don't have to.
-craig

"You can never have too many knives" -- Logan Nine Fingers