Joining Sequential Files

Posted: Mon Nov 15, 2004 2:43 pm
by donlank
I have 3 files with the same column definitions coming from 3 different jobs. I want to put all three files into one file in order to perform one lookup on a hashed file instead of three separate identical lookups.

Is there a way to put these files together?

Thanks,
Kevin

Posted: Mon Nov 15, 2004 2:46 pm
by crouse
You could "cat" the files together at the command line in a Before Job routine, then read the resultant file and create the hash file.

-Craig

Posted: Mon Nov 15, 2004 2:49 pm
by crouse
Or, read all three files with three SeqFile stages into a Link Collector stage, and then into the hash file.

I like the "cat" option better, though.

-Craig

Posted: Mon Nov 15, 2004 3:06 pm
by donlank
I tried the Link Collector already. It says, "Link Collector Stage does not support in-process active-to-active inputs or outputs."

So the only option to put three files into one in a Server Job is to write a Before Job Routine? Isn't there a stage to accomplish this?

Thanks,
Kevin

Posted: Mon Nov 15, 2004 3:08 pm
by crouse
Nope, no stage to do it; hence the Before Job routine.

You can make the Link Collector work by adding in IPC stages and such. That's why I favor the "cat" method.

-Craig

Posted: Mon Nov 15, 2004 3:45 pm
by changming
donlank wrote: I tried the Link Collector already. It says, "Link Collector Stage does not support in-process active-to-active inputs or outputs."

So the only option to put three files into one in a Server Job is to write a Before Job Routine? Isn't there a stage to accomplish this?

Thanks,
Kevin
I met the same problem before; you can do it like this: open Job Properties, go to the Performance tab, enable row buffering, and select "Inter process".
Another suggestion for your job is to use multiple instances; I believe your job is a typical multi-instance job.

Posted: Mon Nov 15, 2004 4:27 pm
by tonystark622
Why not run three sequential file stages into a single hash file stage, with all of them feeding the same hash file?

Tony

Posted: Mon Nov 15, 2004 4:47 pm
by crouse
The nice thing about DataStage is that there are several ways to do the same thing.

The bad thing about DataStage is that there are several ways to do the same thing.

:D

Posted: Mon Nov 15, 2004 5:50 pm
by jreddy
If you need just the hash file, and not the 3 individual sequential files, could you modify your 3 initial jobs to write to a hash file rather than a sequential file?

That way, all 3 jobs write to the same hash file, and when they are done the next set of jobs can look up against just this single hash file.

Posted: Mon Nov 15, 2004 6:36 pm
by rasi
Hi jreddy

When doing that, make sure the first job that writes to the hash file clears its contents before writing the records. Otherwise you will have the old records stacked in it.

Thanks

Posted: Tue Nov 16, 2004 11:39 am
by jreddy
Kevin,

Actually, you might want to clear previous contents only in the first job that creates this hash file. The other two jobs that write to this hash file should insert data in append mode.

Posted: Thu Nov 18, 2004 8:53 am
by donlank
Thanks for all your input.

I decided to use an after-job routine: after the 3rd job finishes, it cats all three files together, and then the next job does the hash file lookup.
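
In case it helps anyone later, the routine is roughly this (DataStage BASIC; the paths, file names, and routine name are placeholders, not my actual values):

   * After-job routine: concatenate the three extract files so the
   * lookup job can read one combined file to build the hash file.
   Cmd = "cat /data/extract1.txt /data/extract2.txt /data/extract3.txt > /data/combined.txt"
   Call DSExecute("UNIX", Cmd, Output, SysRet)
   If SysRet <> 0 Then
      * Abort if the concatenation fails, logging the shell output.
      Call DSLogFatal("cat of extract files failed: " : Output, "ConcatExtracts")
   End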