Can we change the parameter value at run time?
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 10
- Joined: Wed Sep 30, 2009 11:55 pm
- Location: Pune,India
- Contact:
Hi All,
The requirement is to split a file into multiple files such that:
1. Each file contains at most a specified number of records
and
2. All records with the same key value are in a single file
Eg:
source file :
col1 col2
1 a
1 b
1 c
2 d
3 k
3 g
4 b
5 x
5 b
If the number of records in each output file should be <= 4,
then the output files would be:
file1:
col1 col2
1 a
1 b
1 c
2 d
file2:
col1 col2
3 k
3 g
4 b
file3:
col1 col2
5 x
5 b
How can this be implemented in DataStage?
I appreciate the help in advance.
Warm Regards,
-Dawar
"Just do it"
-
- Participant
- Posts: 437
- Joined: Fri Oct 15, 2004 6:13 am
- Location: Pune, India
Try the below. I think it would work.
Aggregate the data on the key column to get the record count for each key. In your case that would be:
1 - 3
2 - 1
3 - 2
4 - 1
5 - 2
Sort this data and feed it to a Transformer without partitioning, so that the stage variables see every row in order. The stage variables, evaluated top to bottom, should be:
RecordCount = If CurrFile = PrevFile or PrevFile = file0 Then RecordCount + CurrRecCount Else CurrRecCount
CurrRecCount = Incoming RecCount
PrevFile = CurrFile
CurrFile = If (RecordCount + CurrRecCount) <= 4 Then CurrFile Else CurrFile + 1
(When RecordCount is evaluated, CurrRecCount still holds the previous row's count, because it is reassigned on the next line.)
It would work as below. The first line, with RowNum 0, shows the initial values of the stage variables:
RowNum  RecordCount  CurrRecCount  PrevFile  CurrFile
0       0            0             file0     file1
1       0            3             file1     file1
2       3            1             file1     file1
3       4            2             file1     file2
4       2            1             file2     file2
5       3            2             file2     file3
Now join this with your original data on the key column, and each record carries the file it should go to.
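For anyone who wants to check the assignment logic outside DataStage, here is a minimal Python sketch of the same idea (count per key, walk the keys in sorted order, start a new file when the next key would not fit). It is not a literal port of the stage variables above; the names assign_files and limit are made up for illustration, and a key with more records than the limit simply gets a file of its own that exceeds the limit.

```python
from collections import Counter

def assign_files(keys, limit):
    """Map each key to a file number so that all records of a key share
    one file and, where possible, no file holds more than `limit` records."""
    counts = Counter(keys)        # the aggregation step: records per key
    assignment = {}
    file_no = 1                   # current output file
    used = 0                      # records already assigned to file_no
    for key in sorted(counts):    # the sort step
        n = counts[key]
        if used > 0 and used + n > limit:
            file_no += 1          # this key does not fit: start a new file
            used = 0
        assignment[key] = file_no
        used += n
    return assignment

# col1 of the sample source file from the first post
keys = [1, 1, 1, 2, 3, 3, 4, 5, 5]
print(assign_files(keys, 4))
```

With the sample data and a limit of 4, keys 1 and 2 go to file 1, keys 3 and 4 to file 2, and key 5 to file 3, which agrees with the CurrFile column in the table.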
Regards,
S. Kirtikumar.
-
- Participant
- Posts: 10
- Joined: Wed Sep 30, 2009 11:55 pm
- Location: Pune,India
- Contact:
chulett wrote: I only asked to make sure we're answering the right question, that we know everything that was behind bringing you here. Off the bat, I don't see the connection between them... but maybe that's just me.
The requirement is to split a file into multiple files based on a fixed number of records (4 in my example), provided that all records with the same key end up in the same file.
If there is any doubt, please let me know and I will clarify further.
"Just do it"
-
- Participant
- Posts: 437
- Joined: Fri Oct 15, 2004 6:13 am
- Location: Pune, India
Everything I described above is to be done in the first job. Then call a second job multiple times, once for each output file identified by the first job.
During each call, pass in one filename from the file created by the first job. Inside that job, join the original file with the file created by the first job to get the desired result.
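As a rough illustration of the second job (in Python, not DataStage), each call can be thought of as a join on the key followed by a filter on one file number. The key_to_file mapping below is hard-coded to match the expected output in the first post, and rows_for_file is a hypothetical helper name.

```python
# Rows of the sample source file: (col1, col2)
rows = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (3, "k"),
        (3, "g"), (4, "b"), (5, "x"), (5, "b")]

# Assumed output of the first job: key -> file number
key_to_file = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3}

def rows_for_file(rows, key_to_file, file_no):
    """One run of the second job: look up each row's key, keep one file's rows."""
    return [row for row in rows if key_to_file[row[0]] == file_no]

# One call per distinct file number, like the repeated job invocations
for file_no in sorted(set(key_to_file.values())):
    print(f"file{file_no}:", rows_for_file(rows, key_to_file, file_no))
```

Each pass re-reads the whole source, so this trades extra reads for a very simple job design.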
Regards,
S. Kirtikumar.
Md Dawar Mughni wrote: Requirement is to split a file into multiple files based on a fixed number of records (that is 4 in my example), provided all the same keys are in the same file. If there is any doubt, please let me know and I will clarify more.
Sorry, but this still doesn't do anything to answer my question: what does this requirement (which you are clarifying here) have to do with your subject of "Can we change the parameter value at run time?". However, I'm just going to let that go and stop worrying about it now.
Carry on.
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 10
- Joined: Wed Sep 30, 2009 11:55 pm
- Location: Pune,India
- Contact:
Well let me clarify that,
I was thinking of changing the parameter value at run time:
1. Have two parameters:
a. FileName (value "SplitedFile")
b. Suffix (value 1)
and use them in the File stage where all the files should get created:
#File_Path#/#FileName##Suffix#.txt
2. The value of the Suffix parameter should be changed inside the Transformer according to the number of records, e.g. for the first 4 records it will be 1, for the next 5 it will be 2, and so on.
(But I don't know whether we can do this or not, because I didn't find anything related in the Transformer.)
Input File Stage ----> Transformer ----> Output File Stage
But I am not very sure how we can do this...
Please assist.
You can't do this the way you are currently envisioning for two main reasons:
1) You can't change job parameters on the fly while the job is running (they are resolved at job submission time only)
2) SeqFile doesn't support closing and opening multiple files during a job run.
As the number of files/number of records per file may change from run to run, one potential option is to use a BuildOp, custom operator or external target to handle the file writes. Your transformer could pass the filename as a column. I would envision something like this:
Input File->Transformer->Column Export->[BuildOp or CustomOp or ExtTarget]
The purpose of the Column Export would be to create your final output record and place it in a single column to the file-handling stage.
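To make the idea of "filename as a column" concrete, here is a small Python sketch of what the file-handling step would do with (filename, record) pairs. This is only an illustration of the logic, not BuildOp or custom operator code; write_routed is a made-up name, and the sketch assumes the output directory starts out empty because it opens files in append mode.

```python
import os
import tempfile

def write_routed(records, out_dir):
    """Write each (filename, line) pair to out_dir/filename.
    Only one file handle is open at a time; it is switched when the name changes."""
    current_name, handle = None, None
    written = []                      # filenames in order of first use
    try:
        for name, line in records:
            if name != current_name:
                if handle:
                    handle.close()
                # append mode, so returning to an earlier name does not truncate it
                handle = open(os.path.join(out_dir, name), "a", encoding="utf-8")
                current_name = name
                if name not in written:
                    written.append(name)
            handle.write(line + "\n")
    finally:
        if handle:
            handle.close()
    return written

# Hypothetical usage with a throwaway directory
records = [("SplitedFile1.txt", "1 a"), ("SplitedFile1.txt", "2 d"),
           ("SplitedFile2.txt", "3 k")]
with tempfile.TemporaryDirectory() as out_dir:
    print(write_routed(records, out_dir))
```

If the rows arrive grouped by filename, each file is opened exactly once, which is the behaviour the Sequential File stage cannot provide within a single run.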
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.