Dynamic Files Creation
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 20
- Joined: Thu Oct 06, 2005 12:23 pm
Hi,
I have a file with data in the following way
Employee Sal
10 100
10 200
10 40
20 400
20 20
20 10
20 100
I want to create files dynamically based on the employee number. In the above case the output should be directed to two different files, one for each employee.
I do not know the employee numbers in advance, so I cannot use constraints to direct the output.
Please let me know how this can be achieved in DataStage. If not in DataStage, can this be done in Unix or some other way?
Thanks Much in Advance
Sam
I know we've had this conversation before here, so a search may turn something up. I'm sure you've already figured out that the Sequential File stage isn't up to this particular task.
One way would be to bone up on your BASIC and write a job control routine to accomplish this. There you can control the naming, closing and opening of your output files based on the data.
Odd thought, but I wonder if the XML Output stage could do this as well? It doesn't need to output XML per se, as it allows 'pass through' columns and can automatically switch to a new output filename when the value in a particular column changes. Of course, you wouldn't have fine-grained control over the actual name used, but it wouldn't require any hand coding. I wonder...
I'm sure there are other ways to approach this.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Use the power of Unix. Sort the file by the first column, get the unique values of the first column with the uniq command to build the file names, then use grep to pull all the rows for a particular employee. Something like:
sort myfile.txt | uniq | awk '{print $1}' > filenames.txt
cat filenames.txt | while read FileNames
do
cat myfile.txt | grep $FileNames > ${FileNames}.txt
done
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Hi DSGuru!!!
Thanks, it worked great.
There were, however, some minor problems that I would like clarified.
Here is the file myfile.txt that I created:
ab cd
10 100
10 50
20 30
10 40
10 50
20 60
srt.sh
sort myfile.txt | uniq | awk '{print $1}' > filenames.txt
cat filenames.txt | while read FileNames
do
cat myfile.txt | grep $FileNames > ${FileNames}.txt
done
I execute it as follows:
srt.sh ab
It creates four different files:
10.txt
20.txt
filenames.txt
ab.txt
filenames.txt should ideally contain only unique values, but it contains the following:
10
10
10
20
20
ab
Is there a way to fix this?
Thanks once again!!
Sam
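The duplicates in filenames.txt follow from the order of the pipeline: uniq compares whole lines, so "10 100" and "10 40" both survive, and only afterwards does awk cut each line down to its first column. The header row "ab cd" is treated as data, which is where ab.txt comes from. What follows is one possible fix, a sketch rather than anything posted in the thread. It assumes the first line is always a header and that fields are separated by whitespace; the variable names FileName and emp are illustrative.

```shell
# Recreate the sample input from the post above.
printf '%s\n' 'ab cd' '10 100' '10 50' '20 30' '10 40' '10 50' '20 60' > myfile.txt

# Drop the header, keep only the first column, THEN de-duplicate.
tail -n +2 myfile.txt | awk '{print $1}' | sort -u > filenames.txt

while read -r FileName
do
    # Compare against the first field only, so employee 10 cannot pick up
    # a row that merely contains "10" somewhere else on the line.
    awk -v emp="$FileName" 'NR > 1 && $1 == emp' myfile.txt > "${FileName}.txt"
done < filenames.txt
```

With the sample data this leaves filenames.txt holding just 10 and 20, and no ab.txt is written. The plain grep in the original loop would also match an employee number appearing in the salary column, which is why the sketch compares the first field instead.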
-
- Participant
- Posts: 407
- Joined: Mon Jun 27, 2005 8:54 am
- Location: Walker, Michigan
- Contact:
Ultra,
Ultramundane wrote:
If you want all columns:
sort yourfile.txt | awk '{print $0 > ($1 ".txt")}'
If you just want sal in each file:
sort yourfile.txt | awk '{print $2 > ($1 ".txt")}'
You actually don't need to do the sort. awk keeps track of which files it has opened.
If you want all columns:
awk '{print $0 > ($1 ".txt")}' yourfile.txt
If you just want sal in each file:
awk '{print $2 > ($1 ".txt")}' yourfile.txt
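As a rough illustration of the single-pass awk approach against the data from the first post, the sketch below also skips the "Employee Sal" header so that no Employee.txt is produced. The header skip (NR > 1) is an assumption about the input and was not part of the commands posted above.

```shell
# Sample input taken from the original question.
printf '%s\n' 'Employee Sal' '10 100' '10 200' '10 40' '20 400' '20 20' '20 10' '20 100' > yourfile.txt

# NR > 1 skips the header row. Each remaining row is written to a file named
# after its first field; awk keeps every output file open until it exits, so
# later rows for the same employee are appended to the same file.
awk 'NR > 1 { print $0 > ($1 ".txt") }' yourfile.txt
```

This writes three rows to 10.txt and four rows to 20.txt. Some awk implementations limit how many files can be open at once, so with a very large number of distinct employee numbers it may be worth sorting first and calling close() on the previous file whenever the key changes.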
DSguru2B wrote:
Sort is needed for the uniq, and uniq is needed for the file names. And yes, the entire record pertaining to that particular employee will then be loaded to that file, which is taken care of by the grep.
I absolutely agree with you that your algorithm must do this. However, with just awk you don't even need to sort. Just another solution. They both work well.