Accessing files on a remote server

raj4756
Participant
Posts: 17
Joined: Thu Feb 26, 2004 9:07 am

Post by raj4756 »

Hi All,

I have DataStage installed on server A. How can I read a dataset on server B, which does not have DataStage installed? I also want to be able to write to server B.

Thanks.

Raj
xcb
Premium Member
Posts: 66
Joined: Wed Mar 05, 2003 6:03 pm
Location: Brisbane, Australia

Post by xcb »

If you are running on Windows, you can map a drive to the remote server and read/write the dataset over your network. I don't know how to do it on UNIX, but I'm sure it has the same functionality.
Cameron Boog
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

NFS mount or an equivalent mechanism. Or work locally and use FTP to get the files back and forth.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Deepak_J
Participant
Posts: 7
Joined: Tue Mar 23, 2004 9:28 am

Post by Deepak_J »

You can write an FTP script to bring your file to server A and execute it as a before-job routine for your job. You can also write the output on server A and have another FTP script, run as an after-job subroutine, send the file to server B.
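
A minimal sketch of such a pair of scripts, written in Python with the standard ftplib module. The host name, credentials and file names in the usage note are placeholders, not values from this thread, and the functions take an already connected session so they can be tried without a real server.

```python
from ftplib import FTP


def open_session(host, user, password):
    """Connect to the FTP server on host and log in (hypothetical credentials)."""
    return FTP(host, user, password)


def fetch_file(ftp, remote_name, local_path):
    """Before the job: copy remote_name from the FTP server into local_path."""
    with open(local_path, "wb") as out:
        ftp.retrbinary("RETR " + remote_name, out.write)


def push_file(ftp, local_path, remote_name):
    """After the job: copy local_path to the FTP server as remote_name."""
    with open(local_path, "rb") as src:
        ftp.storbinary("STOR " + remote_name, src)
```

A before-job call might look like fetch_file(open_session("serverB", "user", "secret"), "source.dat", "/tmp/source.dat"), with the matching push_file call made after the job. One way to hook a script like this into a job is the built-in ExecSH before/after-job subroutine, assuming Python is available on server A.
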
Hope this helps.

Deepak
raj4756
Participant
Posts: 17
Joined: Thu Feb 26, 2004 9:07 am

Post by raj4756 »

Could you please explain how an NFS mount works? Any examples will be appreciated.
Also, does FTP on a .ds file work the same way as on a .dat file?

Let me know.

Thanks.

Raj
Deepak_J
Participant
Posts: 7
Joined: Tue Mar 23, 2004 9:28 am

Post by Deepak_J »

Please clarify: what do you mean by "does the FTP on a .ds file work the same way as a .dat file"?
Also, FTP would be a better solution in terms of speed and reliability.

Deepak
raj4756
Participant
Posts: 17
Joined: Thu Feb 26, 2004 9:07 am

Post by raj4756 »

What I mean is that the .ds file is just a pointer to the underlying data files in the DataFiles directory. Do I have to FTP all the underlying files as well?

Thanks.

Raj
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

raj4756 wrote: What I mean is that the .ds file is just a pointer to the underlying data files in the DataFiles directory. Do I have to FTP all the underlying files as well?

A .ds file is a specific work file structure that PX uses. It cannot be used by anything other than PX, so why would you put it anywhere else? If you need to create a data file to send to another server, then create a sequential file and FTP it. An alternative is to share a filesystem on the DS server that is visible to the remote server (the remote server mounts it over NFS, the Network File System; I suggest you google it and learn about it, as it has been around for quite a long time). Either way, your premise of distributing a .ds file is invalid.
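
Whichever side exports the filesystem, the machine that mounts it sees an ordinary directory path, so a sequential file there is read and written like any local file. A small Python sketch of the one check worth making first, assuming a POSIX system and a hypothetical mount point such as /mnt/serverB: if the mount is down, the directory still exists locally and files would silently land on the wrong disk.

```python
import os


def remote_path(mount_point, relative_name):
    """Build a path under mount_point, refusing to continue if nothing is mounted there."""
    if not os.path.ismount(mount_point):
        # An unmounted mount point is just an empty local directory.
        raise RuntimeError(mount_point + " is not a mounted filesystem")
    return os.path.join(mount_point, relative_name)
```

For example, remote_path("/mnt/serverB", "extracts/customers.dat") returns the full path only while the share is actually mounted; both names are illustrative.
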
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
raj4756
Participant
Posts: 17
Joined: Thu Feb 26, 2004 9:07 am

Post by raj4756 »

Ken,

We are in the process of comparing the performance of reads from partitioned Oracle tables versus reads from .ds files. The Oracle database is on a remote server, so we want to mimic this with the .ds by reading from the remote server. Moreover, we don't want to be testing on the production server and meddling with the production files.

Any suggestions?

Thanks.

Raj
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Your comparison is Oracle references versus a .ds Merge or Lookup operation. In the words of Tom Kyte, you're comparing apples to toaster ovens. The whole problem with database reference lookup calls is saturation on the database server side, combined with repetitive queries, network traffic, query queuing, and shifting degrees of parallelism based on instantaneous parallel query slave usage on the server.

Both flavors of DataStage, Server and PX, encourage the use of localized reference structures (hash files, .ds, etc) to remove the database from the equation and put it into an optimized place, the DataStage transformation server.
Kenneth Bland

raj4756
Participant
Posts: 17
Joined: Thu Feb 26, 2004 9:07 am

Post by raj4756 »

Kenneth,

We are basically an Oracle shop (90%), and are thinking of using Oracle tables not just as reference, but as source data as well. So you don't think this is a great idea?

Raj
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Databases are sources and targets. As reference objects they don't do well for a host of reasons. I'm not talking about joins, I'm talking about references, a totally different concept. As sources and targets, obviously, they are fine. But, when you have to manipulate, massage, or enrich data then temporary scratch pad objects not in a database have the highest performance.
Kenneth Bland

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The question was about datasets in the PX environment.

All you should need to do, provided that the machines are in the same cluster, is to define the hostname along with other definitions of each resource, in the PX configuration file associated with the job. Information on editing configuration files is in the DataStage Manager Guide (man_gde.pdf) in your Docs folder. In particular, see Chapter 11 (The Parallel Engine Configuration File).
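
As a rough illustration of what such a file looks like, the Python sketch below prints a two-node configuration in which each node entry names its host through fastname and lists its disk and scratch disk resources. The node names, host names (serverA, serverB) and directory paths are made up for the example; check the real values and any extra options against the guide mentioned above.

```python
def render_node(name, fastname, disk, scratch):
    """Return one node entry of a parallel configuration file as text."""
    return (
        f'  node "{name}"\n'
        "  {\n"
        f'    fastname "{fastname}"\n'
        '    pools ""\n'
        f'    resource disk "{disk}" {{pools ""}}\n'
        f'    resource scratchdisk "{scratch}" {{pools ""}}\n'
        "  }\n"
    )


def render_config(nodes):
    """Wrap the node entries in the outer braces of the configuration file."""
    return "{\n" + "".join(render_node(*node) for node in nodes) + "}\n"


# Hypothetical hosts and paths, one node per machine.
config_text = render_config([
    ("node1", "serverA", "/data/datasets", "/data/scratch"),
    ("node2", "serverB", "/data/datasets", "/data/scratch"),
])
print(config_text)
```

The job picks up the file through the APT_CONFIG_FILE environment variable, so the generated text would be saved to a file and that variable pointed at it.
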
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.