Splitting to multiple files at runtime

Django
Premium Member
Posts: 20
Joined: Fri Apr 18, 2008 12:16 am

Post by Django »

I have a source text file as follows:

TargetFileName  DataCol1  DataCol2
--------------  --------  --------
ABC.txt         1         2
ABC.txt         5         6
ABC.txt         11        12
ABC.txt         41        24
DAMMY.txt       12        23
DAMMY.txt       34        45
DAMMY.txt       24        99
... and so on...

The number of distinct TargetFileName values is variable.

Now I want to create output files according to the TargetFileName column, i.e. one file named ABC.txt, one file named DAMMY.txt, and so on, each of which will contain only the columns DataCol1 and DataCol2.
Note: the number of distinct TargetFileName values is variable, and that is what has brought me to this forum.
Django
If a simple dsx project requires such intelligence how much more is required to create this Cosmic System. Who dares to say there is no intelligence behind this creation i.e. no GOD .....
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Ok, then. Go for it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Django

Help!!

Post by Django »

I need help here, please.

When reading, the Sequential File stage can read from multiple files using a file pattern. Is there an equivalent when writing?
ray.wurlod

Post by ray.wurlod »

Create a server job that reads your file and invokes UtilityRunJob() to run a generic job that takes the file name (and anything else needed) as a job parameter.

You could do the same with a job sequence (instead of a server job) if you read the file and then convert the line terminators into a delimiter that the StartLoop activity can use. Within the loop is your generic job.
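For the job sequence variant, the delimited list could be prepared with a small shell step before the loop. The following is only a sketch, assuming the source file is whitespace-delimited with the target file name in the first column, as in the sample from the original question. The function name `target_list` and the file name `source.txt` are hypothetical and do not come from this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the distinct target
# file names found in column 1 as a single comma-separated line.
# Assumes whitespace-delimited rows of exactly three fields; ruler lines
# and the column-heading row are skipped.
target_list() {
    awk 'NF == 3 && $1 != "TargetFileName" { print $1 }' "$1" |
        sort -u |        # keep each file name once
        paste -sd, -     # join the lines with commas
}

# Example call (source.txt is a hypothetical file name):
#   target_list source.txt
```

The resulting string is the kind of delimited list a loop can iterate over, passing one file name per iteration to the generic job as a job parameter.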
Django

Unix Awk does it all

Post by Django »

After-job subroutine: ExecSH

cd #SWorksOutputFilePath# ; awk -F, '{print > $2}' Sales.Imp ; awk -F, '{print > $2}' HistData.Imp

did it all :lol:

Column 2 of each file carries the target file name.

Thanks to my colleague Gary ...
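For anyone whose data looks like the sample in the original question (whitespace-delimited, file name in column 1, only the two data columns wanted in the output), the same awk idea might be adapted roughly as follows. This is a sketch under those assumptions, not the command that was actually run above. The function name `split_by_target` and its two arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): split a whitespace-delimited
# file into one output file per distinct value in column 1, writing only
# columns 2 and 3 to each output file.
split_by_target() {
    src=$1       # source file to split
    outdir=$2    # existing directory that receives the output files
    awk -v dir="$outdir" '
        # Skip ruler lines and the column-heading row.
        NF == 3 && $1 != "TargetFileName" {
            out = dir "/" $1
            # Append, then close, so that only one output file is open at
            # a time. Note that files left over from an earlier run are
            # appended to, so clear the directory first if that matters.
            print $2, $3 >> out
            close(out)
        }
    ' "$src"
}

# Example call (both names are hypothetical):
#   split_by_target source.txt /path/to/output
```

Closing each file after every write is slower, but some awk implementations limit how many files can be open at once, and the number of target files here is not known in advance.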
Django

Explored options

Post by Django »

Thanks to Ray for his post; it made me explore various options...