splitting into multiple files

kavuri
Premium Member
Posts: 161
Joined: Mon Apr 16, 2007 2:56 pm

Post by kavuri »

Hi,
I have a DB2 table with a very large number of rows. After passing the data through various stages I need to write it to a flat file, so I am writing to a .csv file.
Because the number of records is very large (in the billions), I want to split the result across multiple files that share the same name and differ only in a 2- or 3-character suffix, like this:

flatfile_000.csv
flatfile_001.csv
flatfile_002.csv
.
.
.
.
That is how I want to create my target files. Each file is supposed to contain 50,000 records.

Can anybody tell me how I can achieve this?

Thanks
Kavuri
Maveric
Participant
Posts: 388
Joined: Tue Mar 13, 2007 1:28 am

Post by Maveric »

With a billion records and 50,000 records per file, you will need 20,000 files (1,000,000,000 / 50,000). :)
Why not store the data in a DataSet? It will take much less space and will be easier for further processing as well.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Or perhaps explain why you 'need' to write to a flat file.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kavuri
Premium Member
Posts: 161
Joined: Mon Apr 16, 2007 2:56 pm

Post by kavuri »

Hi,
The target flat files are used by another product, which is written in Orchestrate, so I am writing them in .csv format. If you have any other ideas, please let me know; otherwise, please tell me how I can achieve this. If it takes more than 20,000 flat files, we will prepare that many. That is the requirement from the team.

Thanks
Kavuri
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Create one large file using DataStage then use the UNIX command split to break it into many smaller files.
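
For anyone who wants to see the split step spelled out, here is a rough Python sketch of what it amounts to; it is only an illustration, not the `split` command itself and not anything DataStage provides. The function name `split_lines` and its parameters are made up for this example. It assumes one record per line (no line breaks inside quoted fields) and no header row in the big file. Note that a 3-digit suffix only covers 1,000 files, so 20,000 files would need `width=5`.

```python
import itertools
from pathlib import Path


def split_lines(source, out_dir, lines_per_file=50_000,
                prefix="flatfile_", width=3):
    """Split `source` into numbered .csv files of at most
    `lines_per_file` lines each and return the paths written.

    Hypothetical helper for illustration only. Assumes one record
    per line and no header row.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # newline="" keeps the original line endings untouched.
    with open(source, "r", newline="") as src:
        for index in itertools.count():
            # Read the next chunk lazily so the whole file is never
            # held in memory at once.
            chunk = list(itertools.islice(src, lines_per_file))
            if not chunk:
                break
            # Zero-padded suffix: flatfile_000.csv, flatfile_001.csv, ...
            target = out_dir / f"{prefix}{index:0{width}d}.csv"
            with open(target, "w", newline="") as dst:
                dst.writelines(chunk)
            written.append(target)
    return written
```

Called as `split_lines("flatfile.csv", "out")`, with both names being placeholders, this would write `out/flatfile_000.csv`, `out/flatfile_001.csv` and so on, with the last file holding whatever records are left over.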
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

kavuri wrote:...Target flat files are utilised by another product which is written in Orchestrate...
Are you sure? Orchestrate is the old name of the PX/EE product so you might as well use DataSets.