How does filter command in sequential file stage work?

Posted: Tue Apr 05, 2005 3:53 am
by dhiraj
Hello,
How does filter command in sequential file stage work?
Does it read each line of the specified input file, pass it to the filter command using redirection, and then process whatever the filter command outputs, including any informational messages?

I am currently trying to run a sort tool (syncsort) using the filter command and am getting the following error:

sorttest..Sequential_File_1.DSLink2: ds_seqopen() - Error in filter command "/tools/syncsort/bin/syncsort /silent /noprompt /end" -
[SyncSort HP-UX/LFS Rel. 3.6.0 Copyright (c) 2002 Syncsort Inc.]

When I use the same filter command at the UNIX shell, with a redirection operator followed by the input file name, it correctly prints the sorted data on the screen.

What am I doing wrong in the filter command?

Thanks
Dhiraj

Posted: Tue Apr 05, 2005 6:43 am
by chulett
Not really sure what might be wrong. All the Filter command does is execute the command as entered and the stage then reads 'standard out' as if it were a flat file (so to speak). So, yes, any informational or extraneous messages would be a problem. :?
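The point about extraneous messages can be tried at the shell. This is only a rough sketch: `noisy_sort` is a made-up stand-in for a tool that prints a banner on standard output, and whether syncsort writes its banner to standard output or standard error is not established in this thread.

```shell
# Hypothetical stand-in for a sort tool that prints a banner line on
# standard output before its data (name and banner text are invented).
noisy_sort() {
  echo "[Some Sort Tool banner]"
  sort
}

# Anything that reads the command's standard output sees the banner
# as if it were the first data row.
rows=$(printf 'b\na\n' | noisy_sort)
printf '%s\n' "$rows"

# One way to discard a single-line banner: start output at line 2.
clean=$(printf 'b\na\n' | noisy_sort | tail -n +2)
printf '%s\n' "$clean"
```

If the tool has an option that suppresses the banner, that would be cleaner than trimming lines after the fact.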

Posted: Tue Apr 05, 2005 6:57 am
by dhiraj
chulett wrote: All the Filter command does is execute the command as entered and the stage then reads 'standard out' as if it were a flat file
So do I also need to specify the input data file in the filter command? Does DataStage not redirect the file specified in the Sequential File stage to the filter command?

Thanks

Dhiraj

Posted: Tue Apr 05, 2005 7:18 am
by chulett
dhiraj wrote: So do I also need to specify the input data file in the filter command? Does DataStage not redirect the file specified in the Sequential File stage to the filter command?
Nope, it doesn't do anything automatic with the normal 'Filename' in the stage. You are required to put something in there, but it's just to shut it up. :wink: I usually use "/dev/null" so it's obvious.

You need the complete command in the Filter option, including any redirection - just as if you were running it from the command line.

Posted: Wed Apr 06, 2005 2:26 am
by dhiraj
I just specified "sort -u" in the filter command and the input file name in the file name property of the Sequential File stage, and it worked just fine, without specifying the file to sort in the filter command.
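For comparison, here is what the two styles look like at a plain shell. This is a sketch only: the temporary file and its contents are made up, and it says nothing certain about what the stage does internally. It just shows that "sort -u" gives the same result whether the file arrives on standard input or is named in the command.

```shell
# Made-up sample data in a temporary file.
tmp=$(mktemp)
printf 'pear\napple\npear\n' > "$tmp"

# Style 1: the file is supplied on standard input; the command names no file.
via_stdin=$(sort -u < "$tmp")

# Style 2: the command names the file itself.
via_arg=$(sort -u "$tmp")

printf '%s\n' "$via_stdin"
printf '%s\n' "$via_arg"

rm -f "$tmp"
```

If a filter with no file named in it produces sorted rows, that would be consistent with the stage feeding the named file to the command on standard input, as in style 1.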

Is there something that I am missing?

Thanks
Dhiraj

Posted: Wed Apr 06, 2005 9:45 am
by chulett
Interesting... that's not my understanding as to how it works or how I've been using it. :? I'd need to double-check the docs and play around a bit to have any more feedback on the subject.

If you've got it working that way, then great. Are you sure it actually did the sort?

Posted: Wed Apr 06, 2005 11:09 am
by kduke
ADN has an example of the filter command using dsjob -report XML. You can download it and see how it works.

Posted: Wed Apr 06, 2005 1:27 pm
by kollurianu
Hi,

I just checked the example posted on ADN. That means a sort can be done in the Sequential File stage, so why do we need a separate Sort stage?

Thank you,

Posted: Wed Apr 06, 2005 5:09 pm
by ray.wurlod
Sort stage is primarily there as a marketing exercise; once upon a time Informatica beat up DataStage by observing that DataStage lacked a sort capability. So they put one in.

UNIX sort is faster than the DataStage one.

CoSort and SyncSort are faster still (faster, too, at emptying your budget).

Posted: Thu Apr 07, 2005 7:49 am
by kollurianu
Thank you very much, Ray. Can you clarify one thing for me? When I checked the example on ADN, I saw the sorted output from the input stage itself. Do we really need the output stage to capture the sorted output, or was the design like that in ADN only for illustration purposes?

Thank you very much.

Posted: Thu Apr 07, 2005 8:44 am
by ArndW
Ray,

I seem to remember that version one of the SORT stage used a bubble sort algorithm written in BASIC, then it went to a bubble sort in C, and then graduated to a hash bucket sort; I don't know what algorithm it uses now. The ones I've used in Px are certainly very, very fast and don't compare to the server version.

Posted: Thu Apr 07, 2005 9:34 am
by kollurianu
Hi all,

Can you clarify one thing for me? When I checked the example on ADN, I saw the sorted output from the input stage itself. Do we really need the output stage to capture the sorted output, or was the design like that in ADN only for illustration purposes?

:?

Thank you,

Posted: Thu Apr 07, 2005 4:31 pm
by ray.wurlod
There should be no reason not to use sort as a filter in a Sequential File stage's Input link. You may need to specify stdin as the source file (as "-") for the command, and any key specification ("-k") must correctly match the metadata from your job design.
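As a rough illustration of the "-" and "-k" points, here is a shell sketch. The three-column comma-delimited layout is invented for the example; in a real job the delimiter and key position would have to match the column definitions on the link.

```shell
# Made-up comma-delimited rows: id,name,amount
data='3,carol,20
1,alice,5
2,bob,12'

# "-" names standard input as the file to sort.
# -t, sets the field delimiter; -k1,1n sorts numerically on field 1 only.
sorted=$(printf '%s\n' "$data" | sort -t, -k1,1n -)
printf '%s\n' "$sorted"
```

If the key were the third column instead, the specification would become -k3,3n, which is why it has to track the metadata.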