DSXchange

Posted: **Wed May 26, 2010 3:33 pm**

Hello all

I have data like

col1 col2 col3

aaa 21 22
aaa 31 23
bbb 12 12
ccc 32 12
bbb 12 34

Now i have to generate separate files for each distinct value of col1.
In this case i will generate 3 files.
1st file will be like
aaa 21 22
aaa 31 23

2nd file will be like
bbb 12 12
bbb 12 34

3rd file will be like
ccc 32 12.

This task is easy if the distinct values in col1 are fixed. But in my case once i run the job, distinct values in col1 can be 12,20,3,4 .. i dont know.
Its all dynamic. Still we can do this using the looping activity in Datastage sequence. we can loop the job for each distinct value of col1. But i do not want to do using looping activity. I want to run the process once and multiple files should be generated each having one distinct value of col1.
Is this possible? Can we use any of the DataStage stages and implement this? Please reply.

Posted: **Wed May 26, 2010 3:36 pm**

Unix solution

Code: Select all

awk ' { print > $1 } ' file

Posted: **Wed May 26, 2010 4:05 pm**

anbu wrote:Unix solution
Code: Select all
awk ' { print > $1 } ' file

I did not understand. Can you please elaborate your response.

Posted: **Wed May 26, 2010 4:14 pm**

In a Sort stage, turn key change column to True. This key change column stores 1 if the key value of the current row is not equal to the previous row's value or 0 if the key value of the current row is equal to the previous row's value. So whenever the key change column is 1, write to a new file.

As a newbie myself, I have not figured out how to write to a new file based on a condition. Maybe someone else can pitch in here.

Posted: **Wed May 26, 2010 5:48 pm**

A Server job writing to a Folder stage can handle that. As (technically) can the XML Output stage, it doesn't even need to be XML for the Trigger Column there to help you out.

Posted: **Wed May 26, 2010 6:58 pm**

Code: Select all

$ ls
file
$ cat file
aaa 21 22
aaa 31 23
bbb 12 12
ccc 32 12
bbb 12 34
$ awk ' { print > $1 } ' file
$ ls
aaa  bbb  ccc  file
$ cat aaa
aaa 21 22
aaa 31 23
$ cat bbb
bbb 12 12
bbb 12 34
$ cat ccc
ccc 32 12

print > $1 Prints line to filename based on the value in first field

Posted: **Thu May 27, 2010 9:49 am**

anbu wrote:
Code: Select all
$ ls
file
$ cat file
aaa 21 22
aaa 31 23
bbb 12 12
ccc 32 12
bbb 12 34
$ awk ' { print > $1 } ' file
$ ls
aaa  bbb  ccc  file
$ cat aaa
aaa 21 22
aaa 31 23
$ cat bbb
bbb 12 12
bbb 12 34
$ cat ccc
ccc 32 12
print > $1 Prints line to filename based on the value in first field

This didnt work. It created multiple files namely
aaa 21 22
aaa 31 23
bbb 12 12
bbb 12 34
ccc 32 12

What happened is each record was written t o a different file and the file name is same as record.

Posted: **Thu May 27, 2010 10:33 am**

What is the field delimiter in your input file?

Posted: **Thu May 27, 2010 11:32 am**

Use a server job with a Folder stage.

Posted: **Thu May 27, 2010 12:22 pm**

What! Server job! No freak'n way!

<wink>

Posted: **Thu May 27, 2010 6:05 pm**

anbu wrote:Unix solution
Code: Select all
awk ' { print > $1 } ' file

nice example, anbu

Also a Windows solution, since DataStage includes MKS, which also has an awk.

Posted: **Fri May 28, 2010 10:51 am**

You can use external target stage ... example here viewtopic.php?t=126634&highlight=external+target

DSXchange

Create multiple files without looping

Create multiple files without looping

nice - awk example