Page 1 of 1

Sorting in server jobs

Posted: Mon Sep 24, 2007 10:56 am
by kimbal
Hi
Could any body say ,How to sort the data in Server side. By using of Transformer?

Posted: Mon Sep 24, 2007 11:25 am
by DSguru2B
Specifically using the transformer?, any particular reason you want to do it that way?
You can use stage variables to do it, but then again you will have to really sit down and design it to know how many stage variables you need.
If I were you, I would simply go for an os level sort, much faster and easier.

Posted: Mon Sep 24, 2007 11:48 am
by chulett
You can't "sort" in a Transformer. You can, hower, use the oddly named Sort stage in a Server job.

Posted: Mon Sep 24, 2007 12:02 pm
by kimbal
Could you give some tips , how to sort by using of sort stage...in DS server side.

Posted: Mon Sep 24, 2007 12:15 pm
by chulett
Sorting is sorting, so there's nothing all that mysterious going on except for two things:

1) Always provide a 'real' temporary file location in the stage, like "/tmp" for it to use. By default it will use the job's Project directory and you don't want to do that.

2) The 'Sort Specifications' can be a little confusing the first time you see it. Put your cursor in that field and click on Property Help for the official explanation.

Best to also click on the Help button in the stage and read through all of the available online help as well.

Posted: Mon Sep 24, 2007 2:00 pm
by ray.wurlod
By default on some variants of UNIX, /tmp is not big enough. Make sure that you specify a directory where there is plenty of free disk space.

Hi

Posted: Mon Oct 08, 2007 4:39 am
by kumardesu
By using the Sort Stage in DataStage Server Jobs, we should specify the "Sort Specifications" in Stage page. For ex: If we want to sort the incoming data based upon the particular column like Order Id then syntax is like in Sort Specifications:Order Id Asc/Dsc.
And if u want to sort based upon more than one fields then syntax will be the same but column names will be separated by the comma(,)operator.

Thanks&Regards,
Desu

Posted: Mon Oct 08, 2007 4:50 am
by ray.wurlod
The Sort stage in server jobs is not particularly efficient. It is often more efficient to have the UNIX sort command sort the data. If even more speed is required, you can use products such as SyncSort or CoSort.

Re: Sorting in server jobs

Posted: Thu Oct 18, 2007 1:43 am
by baglasumit21
kimbal wrote:Hi
Could any body say ,How to sort the data in Server side. By using of Transformer?
You can use the sort stage which is the simples way to sort data in a server editon.

But in our case some times sort stage gives issues when the amount of data to be sorted is huge (around approx 3 million), so we have replaced the sort stage with the unix sort command. Though it requires a additional overhead of writing and reading from a sequential file still its more faster and efficient than the sort stage supplied by datastage.

To use the sort stage you can write the data to a delimitted sequential file and then supply the file as input to the next stage (whether transformer or agregator ) and in the before stage routine sort the file using the ExecSH to run the unix sort command.

Re: Sorting in server jobs

Posted: Thu Oct 18, 2007 1:44 am
by baglasumit21
kimbal wrote:Hi
Could any body say ,How to sort the data in Server side. By using of Transformer?
You can use the sort stage which is the simples way to sort data in a server editon.

But in our case some times sort stage gives issues when the amount of data to be sorted is huge (around approx 3 million), so we have replaced the sort stage with the unix sort command. Though it requires a additional overhead of writing and reading from a sequential file still its more faster and efficient than the sort stage supplied by datastage.

To use the sort stage you can write the data to a delimitted sequential file and then supply the file as input to the next stage (whether transformer or agregator ) and in the before stage routine sort the file using the ExecSH to run the unix sort command.

Posted: Fri Oct 19, 2007 1:00 pm
by jdmiceli
Of course, if you like Perl (as I do), you could put something like this in your path somewhere (maybe common scripts directory) and be able to run it from anywhere:

Code: Select all

#!/user/local/bin/perl -w
#
#       program name: FileSorter.pl
#       Purpose     : To sort a flat file taking into account the header
#                       row if it exists. It also allows the use to sort
#                       in either ascending or descending order.
#       Usage       : perl FileSorter.pl Y/N A/D filename
#                     ie: perl FileSorter.pl Y A bob.txt - this would
#                       sort a file named bob.txt that includes a header
#                       in ascending order.
#       Limitations : This program is brain dead simple and goes in order
#                       of the lines as they are defined and it assumes
#                       this is a text sort. It does not allow
#                       you to pick a field to sort on.
#
#       This program will not overwrite your file!  It will make a new file
#       using your original filename with "sorted_" prepended.

use strict;

my $header = uc($ARGV[0]); # Y = header, N = No header
my $order = uc($ARGV[1]); # D = descending, A = ascending
my $filename = $ARGV[2];
my $tmp = "$filename" . "_sorted";
my @array = '';
my @sorted = '';

open (F, "< $filename") || die "Could not open $filename for reading\n";
open (R, "> $tmp") || die "Could not open $tmp for writing.\n";

while (<F>)
{
    chomp $_;
    if ($header eq "Y")
    {
        print R "$_\n";
        $header = "N";
        next;
    }
    push(@array,"$_\n");
}

# making the assumption this is a numeric sort
if ($order eq "A")
    {@sorted = sort {$a cmp $b} @array;}
else
    {@sorted = sort {$b cmp $a} @array;}

foreach(@sorted)
{print R "$_";}

close F;
close R;
Usage is: perl FileSorter.pl Y/N A/D filename
where Y/N is Yes-there is a header row or N- there isn't and A=Ascending order/D=Descending order and then the filename you want sorted.

It is pretty simple and pretty fast and written to work anywhere. Of course, your mileage may vary.