fileset Stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

arvind
Participant
Posts: 17
Joined: Sun Aug 07, 2005 7:57 am

Post by arvind »

Hello Everybody,
I have a sequential source file that includes a country column. I need to create a job that writes an individual file for each country.

I know we can use the FileSet stage for this.
I want to know how to define partitions for each country.
How can we specify our own target path and file name?
Please let me know.

Thanks in Advance
Arvind
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

I am not sure this will produce what you are looking for. You can use fileset to create one file per partition, but you may have multiple values per partition. For example, if I have 10 partitions but have 20 countries, then I'll have 2 countries per partition.
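To make the partition arithmetic concrete, here is a minimal Python sketch. It assumes a simple stable hash (CRC32 modulo the partition count), which is only a stand-in and not the hash DataStage actually uses; the country names are placeholders. With 20 keys and 10 partitions, at least one partition must hold more than one country, however the keys are hashed.

```python
import zlib
from collections import defaultdict


def assign_partitions(keys, n_partitions):
    """Map each key to a partition with a stable hash.

    Illustrative only: CRC32 is an assumption, not DataStage's hash function.
    """
    partitions = defaultdict(list)
    for key in keys:
        partition = zlib.crc32(key.encode("utf-8")) % n_partitions
        partitions[partition].append(key)
    return dict(partitions)


# Hypothetical key values standing in for 20 distinct countries.
countries = [f"COUNTRY_{i:02d}" for i in range(20)]
layout = assign_partitions(countries, 10)

for partition, members in sorted(layout.items()):
    print(partition, members)
```

So one file per partition only gives one file per country if the number of partitions is at least the number of countries and no two countries hash to the same partition.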

On the other hand, I think (someone please correct me if I am wrong), not specifying 'single file per partition' means the data is round-robined to each output file specified in the fileset descriptor.

Here's a different method you could try. Preload a dataset with your sequential source file, so you don't have to read it more than once. Then create a list of the unique country values (either a hard-coded list you create once, or a DS job that builds the list dynamically). Then iterate through this list, passing each value as a parameter to a job that filters out the matching rows and exports them to a file. If you set up the job to run with multiple instances, you could run several filters concurrently.
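Outside DataStage, the logic of that approach can be sketched in a few lines of Python. This is only an illustration of the flow (load once, build the distinct list, then filter and export per value); the sample rows, the `customer` column, and the function names are hypothetical, and a temporary directory stands in for the real target path.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample rows standing in for the preloaded dataset.
ROWS = [
    {"country": "IN", "customer": "A"},
    {"country": "US", "customer": "B"},
    {"country": "IN", "customer": "C"},
    {"country": "AU", "customer": "D"},
]


def unique_countries(rows):
    """Build the list of distinct country values (the driver list)."""
    return sorted({row["country"] for row in rows})


def export_country(rows, country, target_dir):
    """Filter one country's rows and write them to <target_dir>/<country>.csv."""
    path = Path(target_dir) / f"{country}.csv"
    selected = [row for row in rows if row["country"] == country]
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["country", "customer"])
        writer.writeheader()
        writer.writerows(selected)
    return path, len(selected)


with tempfile.TemporaryDirectory() as target_dir:
    # One filter-and-export pass per country, like one run of the parameterised job.
    results = {
        country: export_country(ROWS, country, target_dir)[1]
        for country in unique_countries(ROWS)
    }
    written = sorted(path.name for path in Path(target_dir).iterdir())

print(results)
print(written)
```

Each call to `export_country` plays the role of one job run with the country passed as a parameter, so the calls are independent and could be run concurrently.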

HTH,

Brad.
DEVESHASTHANA
Participant
Posts: 47
Joined: Thu Sep 16, 2004 5:26 am
Location: India

Post by DEVESHASTHANA »

Arvind,

The easiest solution to your problem is to use the Filter stage directly:
1: If you want to run multiple instances of the job, parameterise it. In the Filter stage, reference the country column in the where clause and supply the country name as a job parameter, then specify the output link. Parameterise the output file name as well, so that each run writes a country-specific file.

2: If you want to run the job once and produce all the output in one go, use the same Filter stage with multiple output links. Give one where clause per country on the country column, and map each where clause to its own output link, named after the country.
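As a rough picture of option 2, here is a small Python sketch of a single pass that tests each row against one where clause per output link. The link names, sample rows, and predicates are hypothetical, and in this sketch rows that match no clause are simply dropped.

```python
from collections import defaultdict

# Hypothetical where clauses, one per output link.
WHERE_CLAUSES = {
    "india_out": lambda row: row["country"] == "IN",
    "us_out": lambda row: row["country"] == "US",
}


def route(rows, where_clauses):
    """Send each row to every output link whose where clause it satisfies."""
    outputs = defaultdict(list)
    for row in rows:
        for link, predicate in where_clauses.items():
            if predicate(row):
                outputs[link].append(row)
    return dict(outputs)


# Hypothetical input rows.
rows = [
    {"country": "IN", "customer": "A"},
    {"country": "US", "customer": "B"},
    {"country": "IN", "customer": "C"},
    {"country": "AU", "customer": "D"},
]
routed = route(rows, WHERE_CLAUSES)

for link, link_rows in routed.items():
    print(link, [r["customer"] for r in link_rows])
```

The trade-off is that every country needs its own clause and output link defined at design time, so this suits a short, fixed list of countries.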
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Isn't this requirement exactly fulfilled by a Switch stage?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.