Input Source for PX job - FTP vs Named pipes

dsuser1
Participant
Posts: 14
Joined: Thu Oct 16, 2003 5:58 pm

Post by dsuser1 »

Hi,

I have a requirement to bring a flat file from a mainframe to the DataStage ETL server on Unix and use it as the input source for an ETL job. I am thinking of using an FTP process to bring the file across to the ETL Unix server and then reading it as a flat-file input source.
But in this case I will need to wait until the FTP process is over. Is there a way other than FTP to read the data as it comes through, using something like named pipes? Does DataStage support this? What are the pros and cons of the two approaches?

Please advise.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Yes, but you had the better design the first time: you're better off landing the file first. Once you have a physical file locally, a DataStage PX job or an instantiated Server job can take advantage of parallel processing and have multiple readers going after the file. If you are waiting on a serialized transfer process, you limit your ability to process the file in parallel.
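
To illustrate the "multiple readers" point, here is a minimal Python sketch (not DataStage code) of how a landed file can be cut into byte ranges that start on record boundaries, so that several readers can each take one range. It assumes newline-terminated records; the names `split_ranges` and `count_records` are made up for this example. The same trick is not available on a stream that is still arriving, because you cannot seek ahead in it.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def split_ranges(path, readers):
    """Return contiguous (start, end) byte ranges that begin on record boundaries."""
    size = os.path.getsize(path)
    cuts = [0]
    with open(path, "rb") as f:
        for i in range(1, readers):
            f.seek(size * i // readers)   # jump near the i-th split point
            f.readline()                  # move forward to the start of the next record
            pos = f.tell()
            if cuts[-1] < pos < size:     # skip duplicate or empty ranges
                cuts.append(pos)
    cuts.append(size)
    return list(zip(cuts[:-1], cuts[1:]))


def count_records(path, start, end):
    """Count newline-terminated records inside one byte range."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start).count(b"\n")


if __name__ == "__main__":
    # Build a small stand-in for the landed file (hypothetical data).
    with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
        tmp.write(b"".join(b"rec%05d\n" % i for i in range(1000)))
    ranges = split_ranges(tmp.name, 4)
    # Each range is handled by its own reader.
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        counts = list(pool.map(lambda r: count_records(tmp.name, *r), ranges))
    print(ranges, sum(counts))
    os.remove(tmp.name)
```

Because the ranges are contiguous and each starts on a record boundary, every record is read by exactly one reader.
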
Kenneth Bland

bigpoppa
Participant
Posts: 190
Joined: Fri Feb 28, 2003 11:39 am

Post by bigpoppa »

I have seen a PX FTP stage in the works that does exactly what you want. You might want to ask Ascential engineering for it. However, I agree with Ken that your best bet is to land the files on your server before your ETL process starts. What happens if the FTP connection is suddenly cut? PX would then process whatever data it had received, even though it should have waited for the entire file to come down through FTP. You could end up with a successful PX run even though it didn't process all of the data.
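
One way to guard against a silently truncated transfer is to check the landed file before the job starts. The Python sketch below assumes the sender supplies an expected record count somewhere (for example in a control file, which is an assumption and not something described in this thread) and that records are newline-terminated. The function names are hypothetical.

```python
def landed_record_count(path):
    """Count newline-delimited records in the landed file."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def is_complete(path, expected_records):
    """True only if the landed file holds exactly the expected number of records."""
    return landed_record_count(path) == expected_records
```

The ETL job would be launched only when `is_complete(...)` returns `True`; otherwise the transfer is retried or flagged.
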

- BP
dsuser1
Participant
Posts: 14
Joined: Thu Oct 16, 2003 5:58 pm

Post by dsuser1 »

Can you please shed some light on the named pipe support in DataStage? Is there a stage for this? If I use it, will I lose parallelism?

Also, how does the FTP stage in PX work? Can I use it as an input stage, and will DataStage manage the FTP connection and the data pooling so that parallelism is achieved?
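
On the named-pipe question, it may help to see what a Unix named pipe (FIFO) does at the operating-system level. The Python sketch below is not DataStage code and says nothing about which stages exist; it only shows a reader consuming records as a writer produces them, and that the pipe is not seekable, which is why a single pipe is consumed by one sequential reader. The function name and the pipe file name are made up, and it assumes a Unix system where `os.mkfifo` is available.

```python
import os
import tempfile
import threading


def stream_through_fifo(lines):
    """Send lines through a named pipe; return what the reader saw and whether it could seek."""
    workdir = tempfile.mkdtemp()
    fifo = os.path.join(workdir, "feed.pipe")   # hypothetical pipe name
    os.mkfifo(fifo)

    def writer():
        # Stands in for the transfer process feeding the pipe.
        with open(fifo, "w") as w:
            for line in lines:
                w.write(line + "\n")

    t = threading.Thread(target=writer)
    t.start()

    received = []
    with open(fifo) as r:            # blocks until the writer opens its end
        seekable = r.seekable()      # a pipe cannot be repositioned
        for line in r:               # records are consumed in arrival order
            received.append(line.rstrip("\n"))

    t.join()
    os.remove(fifo)
    os.rmdir(workdir)
    return received, seekable
```

The reader loop ends when the writer closes its end of the pipe, whether or not the whole file was sent, so the reader alone cannot tell a complete transfer from one that was cut short.
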