UNIX sort

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
lokesh_dhiran
Participant
Posts: 9
Joined: Fri Nov 18, 2005 2:02 am

UNIX sort

Post by lokesh_dhiran »

hi,
While using the sort stage i have to come to know that the unix sort option always sorts Character wise. That is between 10 and 7 10 comes before 7 in ascending order.

Please tell me if this patch has been rectified or is there any other problem.
Regards,
Lokesh Dhiran
Associate Consultant
Kanbay Inc.
novice_pgr
Participant
Posts: 8
Joined: Thu Jan 19, 2006 11:45 pm

Re: UNIX sort

Post by novice_pgr »

unix sort is by default character wise . to sort it on numeric value ..it has to be sort -n
lokesh_dhiran wrote:hi,
While using the sort stage i have to come to know that the unix sort option always sorts Character wise. That is between 10 and 7 10 comes before 7 in ascending order.

Please tell me if this patch has been rectified or is there any other problem.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

It depends on how you define the column data type.
If you assign the column as Integer you will get as expected.
If you assign as varchar or character, you will get the result from datastage as you instructed the datastage to do.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
lokesh_dhiran
Participant
Posts: 9
Joined: Fri Nov 18, 2005 2:02 am

Post by lokesh_dhiran »

I will differ with you on this. My key is integer but stil character sort is taking place

kumar_s wrote:It depends on how you define the column data type.
If you assign the column as Integer you will get as expected.
If you assign as varchar or character, you will get the result from datastage as you instructed the datastage to do.
Regards,
Lokesh Dhiran
Associate Consultant
Kanbay Inc.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Not sure whats wrong in your case :roll:
Is your output and input is integer?
Is it a composite key?
Are you sure that your output is not messed up the partition provided?
For testing run you job in sequentiol mode.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

What exact syntax of sort are you using. Also give us a bit of sample data. We can only tell you whats going wrong by looking at the syntax you are providing.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
lokesh_dhiran
Participant
Posts: 9
Joined: Fri Nov 18, 2005 2:02 am

Post by lokesh_dhiran »

I am calling the unix sort from the Datastage SORT stage. and setting an integer field as the key. Still the sorting is being done character wise. To tell you the example My key is DeptNo it values are 30,100,200,300,600
now the sorted order coming is
100
200
30
300
600

Does anyone have a reason for this. What i can make out is that it is sorting character wise from the first character
Regards,
Lokesh Dhiran
Associate Consultant
Kanbay Inc.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Can you post your UNIX sort command line options?
lokesh_dhiran
Participant
Posts: 9
Joined: Fri Nov 18, 2005 2:02 am

Post by lokesh_dhiran »

hi,
am pasting the osh script generated by datastage

-- Here it goes

#################################################################
#### STAGE: Sort
## Operator
psort -sorter unix
## Operator options
-key 'DeptNo'
-asc
-key 'EmpNo'
-asc

## General options
[ident('Sort'); jobmon_ident('Sort'); seq]
## Inputs
0< 'Employees:Employees_Info.v'
## Outputs
0> [] 'Sort:After_Sort.v'
;

#################################################################
#### STAGE: Employees
## Operator
import
## Operator options
-schema record
{final_delim=end, delim='|', quote=none}
(
EmpNo:int32 {quote=none, max_width=3};
FName:string[max=255];
LName:string[max=255];
Gender:string[max=255];
DeptNo:int32 {quote=none};
Salary:int32 {quote=none};
DOB:date;
)
-keepPartitions
-rejects continue
-reportProgress yes
-firstLineColumnNames
-file '/home/kdude/DS_ex/input/Employee_sort.txt'

## General options
[ident('Employees'); jobmon_ident('Employees')]
## Outputs
0> [] 'Employees:Employees_Info.v'
;

#################################################################
#### STAGE: DS_Sort05_Output
## Operator
export
## Operator options
-schema record
{final_delim=end, delim=',', quote=none}
(
EmpNo:int32 {quote=none};
FName:string[max=255];
LName:string[max=255];
Gender:string[max=255];
DeptNo:int32 {quote=none};
Salary:int32 {quote=none};
DOB:date;
)
-file '/home/kdude/DS_ex/output_ds/Sort05_Output.txt'
-overwrite
-rejects continue

## General options
[ident('DS_Sort05_Output'); jobmon_ident('DS_Sort05_Output')]
## Inputs
0< 'Sort:After_Sort.v'
;
--End
Regards,
Lokesh Dhiran
Associate Consultant
Kanbay Inc.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

-key 'DeptNo'
-asc
-key 'EmpNo'
-asc
But you told it was just single field...
Check with the combination of DeptNo along with DeptNo.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I don't see any -n options in the sort specifications.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
lokesh_dhiran
Participant
Posts: 9
Joined: Fri Nov 18, 2005 2:02 am

Post by lokesh_dhiran »

Thats the problem.I am not able to set the parameters for the unix sort so that it sorts the field as number. It by default calls the normal sort function.
Regards,
Lokesh Dhiran
Associate Consultant
Kanbay Inc.
lokesh_dhiran
Participant
Posts: 9
Joined: Fri Nov 18, 2005 2:02 am

Post by lokesh_dhiran »

Actually i have tried with single key also. this is the osh of the job am working on. It does not matter whether you have single or two sorting keys.
Regards,
Lokesh Dhiran
Associate Consultant
Kanbay Inc.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi Lokesh,
Do you mean to say that you got the same OSH even if you have single key?
I make quick test job, sequential file--->sort---->sequential file.
I make the key as varchar and sor the similar result as you got. When i turn back to integer it gives in the order.
So Perhaps you can look on something else like, if some space or special character present in the field, which makes the sort unpredictable.
And you havnt provided the partition information yet.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Post Reply