How does filter command in sequential file stage work?

dhiraj
Participant
Posts: 68
Joined: Sat Dec 06, 2003 7:03 am

How does filter command in sequential file stage work?

Post by dhiraj »

Hello,
How does the filter command in the Sequential File stage work?
Does it read each line of the specified input file, pass it to the filter command through redirection, and then process whatever the filter command outputs, including any informational messages?

I am currently trying to run a sort tool (SyncSort) using the filter command and am getting the following error:

sorttest..Sequential_File_1.DSLink2: ds_seqopen() - Error in filter command "/tools/syncsort/bin/syncsort /silent /noprompt /end" -
[SyncSort HP-UX/LFS Rel. 3.6.0 Copyright (c) 2002 Syncsort Inc.]

When I use the same filter command at the UNIX shell, followed by a redirection operator and the input file name, it correctly prints the sorted data to the screen.

What am I doing wrong in the filter command?

Thanks
Dhiraj
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'm not really sure what might be wrong. All the Filter command does is execute the command as entered; the stage then reads its standard output as if it were a flat file (so to speak). So yes, any informational or extraneous messages written there would be a problem.
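To illustrate why stray messages matter, here is a minimal shell sketch. The two functions are hypothetical stand-ins, not SyncSort itself, and nothing here confirms where SyncSort actually writes its banner; it only shows that anything printed on standard output is counted as data by whatever reads that stream.

```shell
#!/bin/sh
# Hypothetical stand-in for a tool that prints a banner on stdout before its data.
noisy_sort() {
    echo "[Example Sort Tool Rel. 0.0]"      # informational banner on stdout
    sort
}

# Hypothetical variant that sends the same banner to stderr instead.
quiet_sort() {
    echo "[Example Sort Tool Rel. 0.0]" >&2  # banner kept off stdout
    sort
}

# Count the lines a reader of standard output would receive in each case.
noisy_rows=$(printf 'b\na\n' | noisy_sort | wc -l | tr -d ' ')
quiet_rows=$(printf 'b\na\n' | quiet_sort 2>/dev/null | wc -l | tr -d ' ')

echo "rows seen with banner on stdout: $noisy_rows"
echo "rows seen with banner on stderr: $quiet_rows"
```

With two input rows, the first count includes the banner as an extra "row", while the second contains only the data.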
-craig

"You can never have too many knives" -- Logan Nine Fingers
dhiraj
Participant
Posts: 68
Joined: Sat Dec 06, 2003 7:03 am

Post by dhiraj »

chulett wrote: All the Filter command does is execute the command as entered and the stage then reads 'standard out' as if it were a flat file
So do I also need to specify the input data file in the filter command? Does DataStage not redirect the file specified in the Sequential File stage to the filter command?

Thanks

Dhiraj
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

dhiraj wrote: So do I also need to specify the input data file in the filter command? Does DataStage not redirect the file specified in the Sequential File stage to the filter command?
Nope, it doesn't do anything automatic with the normal 'Filename' in the stage. You are required to put something in there, but it's just to shut it up. I usually use "/dev/null" so it's obvious.

You need the complete command in the Filter option, including any redirection - just as if you were running it from the command line.
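As a rough sketch of what "the complete command, including any redirection" could look like, assuming the Filter text is handed to a shell unchanged. The temporary directory and the input file name are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical input file created only for this illustration.
tmpdir=$(mktemp -d)
input_file="$tmpdir/input.txt"
printf 'pear\napple\npear\n' > "$input_file"

# The whole command, redirection included, exactly as it would be typed at the prompt.
filter_cmd="sort -u < $input_file"

# Run it through a shell and capture what arrives on standard output.
filter_output=$(sh -c "$filter_cmd")
echo "$filter_output"

rm -rf "$tmpdir"
```

The captured output is the sorted, de-duplicated content of the file, which is what a reader of the command's standard output would be handed.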
dhiraj
Participant
Posts: 68
Joined: Sat Dec 06, 2003 7:03 am

Post by dhiraj »

I just specified "sort -u" in the filter command and the input file name in the File name property of the Sequential File stage, and it worked just fine, without specifying the file to sort in the filter command.

Is there something that I am missing?

Thanks
Dhiraj
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Interesting... that's not my understanding of how it works or how I've been using it. I'd need to double-check the docs and experiment a bit before giving any more feedback on the subject.

If you've got it working that way, then great. Are you sure it actually did the sort?
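One way to check whether a result really is sorted, sketched with the standard sort utility's check mode. The file names are hypothetical; in practice you would point the check at whatever file the job wrote.

```shell
#!/bin/sh
# Hypothetical sample files standing in for job output.
tmpdir=$(mktemp -d)
printf 'apple\npear\n' > "$tmpdir/good.txt"        # sorted, no duplicates
printf 'pear\napple\npear\n' > "$tmpdir/bad.txt"   # unsorted, contains a duplicate

check_sorted_unique() {
    # -c checks the order without writing any output; combined with -u it
    # also fails when two adjacent lines are equal.
    if sort -c -u "$1" 2>/dev/null; then
        echo "sorted"
    else
        echo "not sorted"
    fi
}

good_result=$(check_sorted_unique "$tmpdir/good.txt")
bad_result=$(check_sorted_unique "$tmpdir/bad.txt")
echo "good.txt: $good_result"
echo "bad.txt: $bad_result"

rm -rf "$tmpdir"
```

The check relies only on the exit status of sort, so it can be dropped into any script that needs a yes/no answer.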
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

ADN has an example of the filter command using dsjob -report XML. You can download it and see how it works.
Mamu Kim
kollurianu
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

Post by kollurianu »

Hi,

I just checked the example posted on ADN. That means a sort can be done in the Sequential File stage, so why do we need a separate Sort stage?

Thank you,
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Sort stage is primarily there as a marketing exercise; once upon a time Informatica beat up DataStage by observing that DataStage lacked a sort capability. So they put one in.

UNIX sort is faster than the DataStage one.

CoSort and SyncSort are faster still (faster, too, at emptying your budget).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kollurianu
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

Post by kollurianu »

Thank you very much, Ray. Can you clarify one thing for me? When I checked the example on ADN, I saw the sorted output from the input stage itself. Do we really need the output stage to capture the sorted output, or was the design like that on ADN only for illustration purposes?

Thank you very much.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Ray,

I seem to remember that version one of the Sort stage used a bubble sort algorithm written in BASIC, then moved to a bubble sort in C, and then graduated to a hash bucket sort; I don't know what algorithm it uses now. The ones I've used in PX are certainly very, very fast and don't compare to the Server version.
kollurianu
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

Post by kollurianu »

Hi all,

Can you clarify one thing for me? When I checked the example on ADN, I saw the sorted output from the input stage itself. Do we really need the output stage to capture the sorted output, or was the design like that on ADN only for illustration purposes?

Thank you,
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

There should be no reason not to use sort as a filter in a Sequential File stage's Input link. You may need to specify stdin as the source file (as "-") for the command, and any key specification ("-k") must correctly match the metadata from your job design.
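A small sketch of that form of the command, using the standard UNIX sort. The comma-delimited rows and the two-column layout (id,name) are invented for the example; the delimiter and key position would have to match your own column definitions.

```shell
#!/bin/sh
# Hypothetical comma-delimited rows: id,name
rows='3,carol
1,alice
2,bob'

# "-" names standard input explicitly; -t sets the field delimiter and
# -k 1,1n sorts numerically on the first field only.
sorted_rows=$(printf '%s\n' "$rows" | sort -t , -k 1,1n -)
echo "$sorted_rows"
```

If the key were given as "-k 2,2" instead, the rows would be ordered by the name column, which is why the key specification has to agree with the metadata in the job.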