File FTP in Datastage vs UNIX

Posted: Thu Mar 03, 2016 10:04 am
by sharmabhavesh
Hi,
I want to FTP a file from a DataStage UNIX server to a different server.
Will the DataStage FTP stage be better than the UNIX FTP command?
Also, what kind of configuration is needed between the two servers to enable FTP?

Posted: Thu Mar 03, 2016 10:28 am
by asorrell
It has been a long time since I used DataStage FTP. The main reason is that, the last time I checked, it transferred individual records one at a time instead of doing a bulk file transfer. It was SLOW.

If anyone has more recent experience showing that this has changed, please chime in...

Posted: Thu Mar 03, 2016 11:34 am
by PaulVL
All of the application teams I've dealt with have always preferred shell scripts to handle their (S)FTP needs.

Posted: Thu Mar 03, 2016 11:51 am
by sharmabhavesh
Also, what all set up or configuration is required to set up FTP between 2 servers?

Posted: Thu Mar 03, 2016 11:53 am
by FranklinE
The main advantage of using an FTP stage is eliminating the need to land a physical file on the server. Performance is usually the trade-off, with disk space being less expensive than CPU components vs. an essential doubling of the I/O -- once to land the file, once more to FTP it.

For Unix-to-Unix, if you are using sshv2, the better choice in my experience is secure copy (scp) from a script.
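For anyone who wants to see what "scp from a script" could look like, here is a minimal sketch in Python. It assumes key-based SSH authentication is already set up between the two servers (your System Administrator would need to confirm that), and the user, host, and paths in the usage note are made-up placeholders. The function names are just for illustration.

```python
import subprocess


def build_scp_command(local_path, remote_user, remote_host, remote_path):
    """Build the scp argument list for pushing one file to a remote server."""
    target = f"{remote_user}@{remote_host}:{remote_path}"
    # -B: batch mode, so scp never stops to prompt for a password
    # -q: quiet, no progress meter in job logs
    return ["scp", "-B", "-q", local_path, target]


def push_file(local_path, remote_user, remote_host, remote_path):
    """Run scp and raise CalledProcessError if it exits non-zero."""
    cmd = build_scp_command(local_path, remote_user, remote_host, remote_path)
    subprocess.run(cmd, check=True)
```

Hypothetical usage: `push_file("/data/out.csv", "etl", "target.example.com", "/incoming/out.csv")`. With `check=True`, a failed copy raises an exception, so the calling job can fail instead of carrying on silently.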

Posted: Thu Mar 03, 2016 11:58 am
by chulett
sharmabhavesh wrote: Also, what all set up or configuration is required to set up FTP between 2 servers?
That's a question for your System Administrator(s). We don't know what is or isn't allowed in your organization.

Posted: Thu Mar 03, 2016 12:02 pm
by sharmabhavesh
Hi Franklin, thanks for the reply. I couldn't really understand the point where you said that we need to land a physical file on the server.
During FTP, the file needs to be physically present on the server from which the file is being FTP'd, right?

Posted: Thu Mar 03, 2016 12:13 pm
by asorrell
If you use the DataStage FTP stage, the remote file (which is NOT on your server - it is somewhere else) is used as a data source and read one record at a time. It can then be processed (albeit slowly) and sent to its destination, which may be a database or a file location on another server.

Hence - "it isn't landed"...

Though DataStage FTP is slow (per-record transfers), it will at least notify you if the connection fails. One thing that most people don't realize is that almost all "freebie" FTP commands on UNIX servers do not report error codes (they always return "0" after execution). This means you might get a partial file and not know it.

That's why I usually design anything that uses UNIX FTP commands to validate the size of the file that was received, ensuring "we got it all".
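The size check can be sketched roughly as follows, using Python's standard `ftplib` rather than the UNIX `ftp` command. This assumes the remote server supports the FTP `SIZE` command, which not every server does; the function names are hypothetical, and the host, credentials, and file names are whatever your environment uses.

```python
import os
from ftplib import FTP


def sizes_match(local_path, expected_size):
    """Return True if the local file has exactly the expected byte count."""
    return os.path.getsize(local_path) == expected_size


def fetch_and_verify(host, user, password, remote_name, local_path):
    """Download a file in binary mode, then compare local and remote sizes."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.voidcmd("TYPE I")  # binary mode, so SIZE reports raw bytes
        expected = ftp.size(remote_name)  # None if the server gives no size
        with open(local_path, "wb") as out:
            ftp.retrbinary("RETR " + remote_name, out.write)
    if expected is None or not sizes_match(local_path, expected):
        raise IOError(f"size check failed for {remote_name}")
```

A matching size does not prove the content is identical, but it does catch the partial-file case described above.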

Posted: Thu Mar 03, 2016 12:17 pm
by FranklinE
Yes, what Andy said, with this:

FTP "get" as first stage streams the transferred data directly to the output link of the stage. No file is created.

FTP "put" as final stage takes rows from the input link and transfers them directly to the destination server. No file is created on the local server.
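As a rough analogy for the "get" case (this is not how the DataStage stage is implemented, only an illustration of the same idea), Python's `ftplib` can hand each line of a remote file to a callback as it arrives, so nothing is written to local disk. The function name `stream_records` and the handler are hypothetical; `ftp` is assumed to be an already logged-in `ftplib.FTP` connection.

```python
def stream_records(ftp, remote_name, handle_record):
    """Pass each line of a remote text file to handle_record without landing it.

    Returns the number of records processed.
    """
    count = 0

    def on_line(line):
        nonlocal count
        handle_record(line)  # e.g. parse the row and send it downstream
        count += 1

    # retrlines calls on_line once per line, with the line ending stripped
    ftp.retrlines("RETR " + remote_name, on_line)
    return count
```

Each record is processed as it is received, which is the "no file is created" behaviour described above, with the same per-record overhead.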