How to sort

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Post by somu_june »

Hi,


I want to split my source file into smaller input files. The file contains material data, so if I split it as it is, the same material may end up in different files. How can I sort the file on several fields first, so that I can then split the sorted file into smaller ones and apply my logic to remove duplicates?


Thanks,
Somaraju
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Somaraju,

The UNIX "sort" command lets you sort on different fields, with or without removing duplicates. The same applies to the DataStage Sort stage; you choose the columns to sort on. As you've already stated, you need to sort prior to splitting the file into smaller ones.
What exactly are you asking?
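
To illustrate the "sort first, then split" order, here is a minimal Python sketch rather than a UNIX or DataStage solution. It assumes a comma-separated file with the material number in the first column; the sample rows, the function name sort_and_split and the max_rows parameter are all made up for the example. It sorts everything on the key columns and only starts a new chunk where the key changes, so rows for the same material never land in different chunks.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def sort_and_split(rows, key_columns, max_rows):
    """Sort all rows on key_columns, then cut them into chunks of roughly
    max_rows rows without separating rows that share the same key."""
    key = itemgetter(*key_columns)
    ordered = sorted(rows, key=key)

    chunks, current = [], []
    for _, group in groupby(ordered, key=key):
        group = list(group)
        # Start a new chunk if this key group would overflow the current one.
        if current and len(current) + len(group) > max_rows:
            chunks.append(current)
            current = []
        current.extend(group)
    if current:
        chunks.append(current)
    return chunks


# Hypothetical sample data: material number, plant.
sample = io.StringIO(
    "M300,P1\n"
    "M100,P1\n"
    "M200,P2\n"
    "M100,P1\n"
    "M300,P2\n"
)
rows = list(csv.reader(sample))
chunks = sort_and_split(rows, key_columns=[0], max_rows=2)
for number, chunk in enumerate(chunks, start=1):
    print(number, chunk)
```

Each chunk could then be written to its own file and de-duplicated independently. A key group larger than max_rows stays together in one oversized chunk, which is why the limit is only approximate.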
somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Post by somu_june »

Hi Arnd,


I have a job that contains a Sort stage. The problem is that it sorts only within one file. For example, if the same material appears in the first file and again in the 10th file, will the Sort stage sort across all files, or will it sort the columns in the first file, then the second, and so on? If it sorts each file individually, I will still have duplicate materials.


thanks,
somaraju
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,
"As you've already stated, you need to sort prior to splitting the file into smaller ones."
Please look at Arnd's statement.
Sort the whole file first, then split it.

-Kumar
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Begin with a plan.

Exactly what is in the single source file? Exactly what do you want in the smaller files? How would you achieve this (in plain language, not in UNIX/DataStage terms)?

Then you can convert this to appropriate commands and/or job designs.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.