File FTP in Datastage vs UNIX

sharmabhavesh
Premium Member
Posts: 38
Joined: Tue Jun 19, 2012 11:03 pm
Location: India

Post by sharmabhavesh »

Hi,
I want to FTP a file from the DataStage UNIX server to a different server.
Would the DataStage FTP stage be a better choice than the UNIX ftp command?
Also, what configuration is needed between the two servers to enable FTP?
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

It has been a long time since I used the DataStage FTP stage. The main reason is that, the last time I checked, it transferred records one at a time instead of doing a bulk file transfer. It was SLOW.

If anyone has more recent experience showing that this has changed, please chime in...
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

All of the application teams I've dealt with have always preferred shell scripts to handle their (S)FTP needs.
sharmabhavesh
Premium Member
Posts: 38
Joined: Tue Jun 19, 2012 11:03 pm
Location: India

Post by sharmabhavesh »

Also, what setup or configuration is required to enable FTP between two servers?
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

The main advantage of the FTP stage is that it eliminates the need to land a physical file on the server. Performance is usually the trade-off: disk space is less expensive than CPU, but landing the file essentially doubles the I/O (once to land the file, once more to FTP it).

For Unix-to-Unix transfers, if you are using SSH v2, the better choice in my experience is secure copy (scp) from a script.
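
As a rough illustration of the "scp from a script" approach, here is a minimal Python sketch. It assumes key-based SSH authentication is already set up between the two servers; the user, host, and paths in the usage comment are hypothetical placeholders, and `build_scp_command` and `push_file` are names invented for this example.

```python
import subprocess


def build_scp_command(local_path, user, host, remote_path):
    """Build the scp argument list.

    -B selects batch mode, so scp fails instead of prompting for a
    password when key-based login is not available.
    """
    return ["scp", "-B", local_path, f"{user}@{host}:{remote_path}"]


def push_file(local_path, user, host, remote_path):
    """Run scp; check=True raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_scp_command(local_path, user, host, remote_path), check=True)


# Hypothetical usage (placeholder host and paths):
# push_file("/data/out/daily.txt", "etl", "target.example.com", "/incoming/daily.txt")
```

Because `check=True` turns a non-zero exit status into an exception, a failed copy stops the calling script rather than passing silently.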
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

sharmabhavesh wrote: Also, what setup or configuration is required to enable FTP between two servers?
That's a question for your System Administrator(s). We don't know what is or isn't allowed in your organization.
-craig

"You can never have too many knives" -- Logan Nine Fingers
sharmabhavesh
Premium Member
Posts: 38
Joined: Tue Jun 19, 2012 11:03 pm
Location: India

Post by sharmabhavesh »

Hi Franklin, thanks for the reply. I didn't quite understand your point about needing to land a physical file on the server.
During FTP, the file needs to be physically present on the server it is being sent from, right?
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

If you use the DataStage FTP stage, the remote file (which is NOT on your server; it is somewhere else) is used as a data source and read one record at a time. It can then be processed (albeit slowly) and sent to its destination, which may be a database or a file location on another server.

Hence - "it isn't landed"...

Though the DataStage FTP stage is slow (per-record transfers), it will at least notify you if the connection fails. One thing most people don't realize is that almost all "freebie" FTP commands on UNIX servers do not report error codes (they always return "0" after execution). This means you might get a partial file and not know it.

That's why I usually design anything that uses UNIX FTP commands to validate the size of the file that was received, ensuring "we got it all".
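
A minimal sketch of that size check, written in Python rather than shell. It assumes the expected size is known, for example by asking the remote server through `ftplib`'s `FTP.size()` (which relies on the SIZE command, something not every server supports). The function names are made up for this illustration.

```python
import os


def remote_size(ftp, remote_name):
    """Ask the server for the file size in bytes.

    `ftp` is a connected ftplib.FTP object. Binary mode is set first
    because many servers only report an accurate size in TYPE I.
    """
    ftp.voidcmd("TYPE I")
    return ftp.size(remote_name)


def transfer_is_complete(local_path, expected_size):
    """Return True only if the received file has exactly the expected size."""
    return os.path.getsize(local_path) == expected_size
```

A matching size does not prove the content is intact, so a checksum comparison is a stronger test where both servers can compute one.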
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Yes, what Andy said, with this:

An FTP "get" as the first stage streams the transferred data directly to the stage's output link. No file is created.

An FTP "put" as the final stage takes rows from the input link and transfers them directly to the destination server. No file is created on the local server.
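
To make the "no file is landed" idea concrete outside DataStage, here is a small Python sketch using the standard `ftplib` module. It is only an analogy for the "get" case, not a description of how the FTP stage is implemented; `stream_records` and the names in the usage comment are hypothetical.

```python
from ftplib import FTP


def stream_records(ftp, remote_name, handle_record):
    """Hand each line of the remote file to handle_record as it arrives.

    ftplib's retrlines() calls the callback once per line with the
    trailing line ending removed; nothing is written to local disk.
    """
    ftp.retrlines(f"RETR {remote_name}", handle_record)


# Hypothetical usage (placeholder host, credentials, and file name):
# with FTP("source.example.com") as ftp:
#     ftp.login("etl", "secret")
#     stream_records(ftp, "daily.txt", print)
```

Each record is processed in memory as it arrives, which is what saves the disk space and the second pass of I/O, at the cost of per-record handling.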