Splitting One File to Multiple Files

120267
Participant
Posts: 30
Joined: Tue Jun 07, 2005 12:27 am

Post by 120267 »

Hi,

I want to split the Product Sales file, which contains 100 products, into 100 flat files, one per product, with the product name as the file name. We are not supposed to use a loop activity to trigger the same job 100 times. Is there any other way to do it within a single DataStage job?
With Love,

«·´`·.(*·.¸(`·.¸ ¸.·´)¸.·*).·´`·»
«.......>>>> Siva.G<<<<......»
«·´`·.(¸.·*(¸.·´ `·.¸)*·.¸).·´`·»
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Heck, yes. A Filter stage or a Transformer stage with 100 output links. Easy.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

A sample of your Product Sales file would help.
Also give a sample of how your output files should look.
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.

Post by 120267 »

Ray,

The number of products is not fixed; it is dynamic. If we get 60 products, we have to split the file into 60 files, with the product name as the file name. We may also get more than 100 products. Is there any solution that does not use a loop?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You could always code something up in BASIC to read the input and write the output to different filenames... or you could use the Folder stage. :wink:

Read the Server Job Developer's Guide section on it to see how it can dynamically write to many files in a directory.
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by 120267 »

narasimha,

It should be like this

Input File: Product.txt

Product_name Region Level Sales
A Adc 1 2300$
A Adb 1 2300$
A Ad1 1 2300$
A Ad2 1 2300$
B Adc 1 2300$
B Adb 1 2300$
B Ad1 1 2300$
B Ad2 1 2300$
C Adc 1 2300$
C Adb 1 2300$
C Ad1 1 2300$
C Ad2 1 2300$

I want the output as 3 files:

Output Files:

A.txt

A Adc 1 2300$
A Adb 1 2300$
A Ad1 1 2300$
A Ad2 1 2300$

B.txt

B Adc 1 2300$
B Adb 1 2300$
B Ad1 1 2300$
B Ad2 1 2300$

C.txt

C Adc 1 2300$
C Adb 1 2300$
C Ad1 1 2300$
C Ad2 1 2300$

Post by chulett »

(pssst... Folder stage)

Post by chulett »

Did you look in the Server Jobs Developer's Guide pdf as I mentioned?

In my 7.5.1A version, Chapter 11 is dedicated to the Folder stage and the Folder Stage Input Data section tells you what you need to know to use the folder stage to write multiple files into a directory and how you can control the names of those files.

I'm not about to transcribe that chapter into this forum, so... go give it a read, try to use it in your job and if you've done that and you have some specific questions - come on back with them.

Post by 120267 »

Many thanks, chulett.

I have tried it and the job runs, but each file contains only the latest record. Is there a property to set, such as "Append to file"?

If I give the input file as...

Input File: Product.txt
Product_name Region Level Sales
A Adc 1 2300$
A Adb 1 2300$
A Ad1 1 2300$
A Ad2 1 2300$
B Adc 1 2300$
B Adb 1 2300$
B Ad1 1 2300$
B Ad2 1 2300$
C Adc 1 2300$
C Adb 1 2300$
C Ad1 1 2300$
C Ad2 1 2300$

The output is 3 files:

Output Files:

A.txt

A Ad2 1 2300$

B.txt

B Ad2 1 2300$

C.txt

C Ad2 1 2300$

But it should not be like this, it should contain all the records.
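
The symptom above, where each file holds only the last row for its product, is what you would expect if the target file is reopened and truncated for every incoming row. The following is only a shell sketch of that overwrite-versus-append difference, not a DataStage setting; the directory and file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo: the directory and file names are illustrative only.
demo_dir=$(mktemp -d)

for region in Adc Adb Ad1 Ad2
do
  # '>' truncates the file on every write, so only the last row survives.
  echo "A $region 1 2300\$" > "$demo_dir/A_overwrite.txt"
  # '>>' appends, so every row is kept.
  echo "A $region 1 2300\$" >> "$demo_dir/A_append.txt"
done

# Count the rows left in each file (tr strips padding that some wc versions add).
overwrite_rows=$(wc -l < "$demo_dir/A_overwrite.txt" | tr -d ' ')
append_rows=$(wc -l < "$demo_dir/A_append.txt" | tr -d ' ')
echo "overwrite: $overwrite_rows row(s), append: $append_rows row(s)"
```

Whatever writes the per-product files has to behave like the second redirection: open each file once, or append on every write.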

Post by ray.wurlod »

The Sequential File stage does have append and overwrite as write methods. But the question remains: how are you parsing the results transmitted by the Folder stage into separate files?

Post by chulett »

120267 wrote: I have tried it. It's working fine. But...

In other words, it's not working fine. :wink:

Let me be the first to admit I've never actually used the Folder stage as a target, never had a need to. But I seem to recall others using it here and reporting success, hence the recommendation.

The docs say the first column must be marked as a Key and contain the filename. Ah... they then go on to say that the remaining columns:

"are written to the named file, each column separated by a newline. Data to be written to a directory would normally be delivered in a single column."

And the example shows a single LongVarchar field. So it sounds like to use that stage you'd have to reverse what it does when it reads a file - put everything in one field for each Product. :(

To anyone who has done this in the past - is that correct?

Worst case you could bone up on the sequential file processing functions that BASIC has (OPENSEQ,CLOSESEQ,etc) and write up some custom job control code to do this... wouldn't be all *that* hard.
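
For anyone who wants to see the shape of that read-and-route logic before writing it in BASIC, here is a rough stand-in in shell. It is not BASIC and does not use OPENSEQ or any DataStage function; the working directory and the sample rows are hypothetical.

```shell
#!/bin/sh
# Shell stand-in for the read-and-route idea, not DataStage BASIC.
# All paths and the sample rows below are hypothetical.
work_dir=$(mktemp -d)
cat > "$work_dir/Product.txt" <<'EOF'
A Adc 1 2300$
A Adb 1 2300$
B Adc 1 2300$
C Adc 1 2300$
C Adb 1 2300$
EOF

mkdir "$work_dir/out"
# Read one row at a time; the first field names the target file.
while read -r product rest
do
  # Append so that earlier rows for the same product are preserved.
  printf '%s %s\n' "$product" "$rest" >> "$work_dir/out/$product.txt"
done < "$work_dir/Product.txt"

ls "$work_dir/out"
```

Because every write appends, the output directory needs to be emptied before a rerun, otherwise the new rows land behind the old ones.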
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Let's think outside of DataStage, shall we? You can do this via a Unix script. Here is what I can offer:

#!/usr/bin/ksh
# Split a space-delimited file into one file per distinct value of column 1.

export filepath=/Data/SFDCDEV/scripts/dsx.txt
export tempFile=/Data/SFDCDEV/scripts/my.tmp
export newFileDir=/Data/SFDCDEV/scripts

# Collect the distinct values of the first column.
awk '{print $1}' "$filepath" | sort -u > "$tempFile"

# Write each product's rows to its own file. Match on column 1 only;
# grep -w would also pick up rows that contain the name in another column.
while read -r filename
do
  awk -v key="$filename" '$1 == key' "$filepath" > "$newFileDir/$filename.txt"
done < "$tempFile"
rm -f "$tempFile"
echo "All done"

Change the variables according to your environment:
filepath is the text file that you want to split.
tempFile is a temporary file that holds the distinct product names. It is deleted at the end of the script.
newFileDir is where you want your new files to be created.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

DSguru2B wrote: Let's think outside of DataStage, shall we?

Ha! What do you think this is... AwkXchange? :P

Post by DSguru2B »

Ha Ha Ha. Three Ha's from me :wink:
This is more like GettingMyWorkDoneNoMatterWhatXchange. How about that?
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan
Contact:

Post by Ultramundane »

Awk keeps track of open files.

awk '{ print $0 > ("/outfilepath/" $1 ".txt") }' infile.txt
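
One caveat with keeping every output file open: awk implementations differ in how many files they can hold open at once, which may matter when the product count runs into the hundreds. A possible variation, sketched below with hypothetical paths and sample rows, sorts on the first column and closes each file as soon as the product changes, so only one output file is open at a time.

```shell
#!/bin/sh
# Hypothetical paths and sample data, for illustration only.
split_dir=$(mktemp -d)
cat > "$split_dir/infile.txt" <<'EOF'
B Adc 1 2300$
A Adc 1 2300$
C Adc 1 2300$
A Adb 1 2300$
B Adb 1 2300$
EOF

mkdir "$split_dir/out"
# Sort on the first field so each product's rows are contiguous, then keep
# only one output file open at a time.
sort -k1,1 "$split_dir/infile.txt" |
awk -v dir="$split_dir/out" '
  NF == 0 { next }                   # skip blank lines
  { key = $1 "" }                    # force a string comparison on the key
  key != prev {                      # product changed
    if (out != "") close(out)        # release the previous file
    prev = key
    out = dir "/" key ".txt"
  }
  { print > out }                    # first print truncates, later prints append
'

ls "$split_dir/out"
```

Note that sort may reorder the rows within a product, so this only suits cases where the row order inside each output file does not matter.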