Create multiple Records from single record
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 17
- Joined: Mon Aug 15, 2005 12:57 am
Create multiple Records from single record
Hi All Need a urgent help.
I have records like this
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534923|050903010013535322
Now i want it like this
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534923
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534924
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534925
And increment will go on till the last range reaches that means 050903010013535322
And then i will have one more records like
33979|21014589|1536|06/13/2005|V00022|4.1|100|100|050303010008325718|050303010008325817
Then again same above steps have to repeated.
Now my question is how to do duplicate rows taking last two columns in that first column as starting point and then increment the no with all other columns same as it is and till it gets the last column no.
Hope every body will get clear cut idea about the problem now.
So can any body help me.
I need it urgently
Thanks in advance
Shiv
I have records like this
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534923|050903010013535322
Now i want it like this
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534923
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534924
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534925
And increment will go on till the last range reaches that means 050903010013535322
And then i will have one more records like
33979|21014589|1536|06/13/2005|V00022|4.1|100|100|050303010008325718|050303010008325817
Then again same above steps have to repeated.
Now my question is how to do duplicate rows taking last two columns in that first column as starting point and then increment the no with all other columns same as it is and till it gets the last column no.
Hope every body will get clear cut idea about the problem now.
So can any body help me.
I need it urgently
Thanks in advance
Shiv
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
search for looping a record. you will get nice examples.
this is tony's idea, it will work for you with some additions according to your requirement.
Another thing you might consider:
Write the data to a hash file. Add a unique key field (use @Outrownum as the value) and another field, F2. The derivation for F2 is a function (pass in Count as a parameter) (that you have to write) that returns a string of values from 1 to count, separated by @VM chars. When reading from this hash file, in the metadata for F2, set "Type" to "MV" (Association should go to F2), then, at the top, set "Normalize On" to F2. Now when you read the data from the hash file you'll get one row for each value in the F2 field. You can drop the key and F2 after reading from the hash file, if you so desire.
Quick, easy code and easy to implement.
Thanks
Nag
this is tony's idea, it will work for you with some additions according to your requirement.
Another thing you might consider:
Write the data to a hash file. Add a unique key field (use @Outrownum as the value) and another field, F2. The derivation for F2 is a function (pass in Count as a parameter) (that you have to write) that returns a string of values from 1 to count, separated by @VM chars. When reading from this hash file, in the metadata for F2, set "Type" to "MV" (Association should go to F2), then, at the top, set "Normalize On" to F2. Now when you read the data from the hash file you'll get one row for each value in the F2 field. You can drop the key and F2 after reading from the hash file, if you so desire.
Quick, easy code and easy to implement.
Thanks
Nag
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 17
- Joined: Mon Aug 15, 2005 12:57 am
-
- Participant
- Posts: 17
- Joined: Mon Aug 15, 2005 12:57 am
THis doesnt work sir, since my ranges are dynamic and based on last two fields. last but one field is starting range and last field is ending no.
now when u generate the nos with those two fields as ranges u will get all other fields dumplicated....
So hope you are clear now.
Thanks
Shiv
now when u generate the nos with those two fields as ranges u will get all other fields dumplicated....
So hope you are clear now.
Thanks
Shiv
nag0143 wrote:search for looping a record. you will get nice examples.
this is tony's idea, it will work for you with some additions according to your requirement.
Another thing you might consider:
Write the data to a hash file. Add a unique key field (use @Outrownum as the value) and another field, F2. The derivation for F2 is a function (pass in Count as a parameter) (that you have to write) that returns a string of values from 1 to count, separated by @VM chars. When reading from this hash file, in the metadata for F2, set "Type" to "MV" (Association should go to F2), then, at the top, set "Normalize On" to F2. Now when you read the data from the hash file you'll get one row for each value in the F2 field. You can drop the key and F2 after reading from the hash file, if you so desire.
Quick, easy code and easy to implement.
Thanks
Nag
First things first. Your ability to supply a result is your responsibility. If you want fast and accurate answers, consider the price you are paying. $0 US. That's pretty cheap. We don't bill for time or answers, we just ask that you don't request URGENCY, because we have lives too.
Plus, as is the case sometimes, the URGENCY comes from someone who is unqualified for the contract position they are filling which came at the expense of a better qualified person who was asking for higher compensation. The URGENCY tends to mean that we on this forum have to help the less qualified person avoid being tossed out on their rear from the customer site when the customer realizes they are ill-prepared to do the work.
Now, to your problem.
PURE DS SOLUTION:
Your last two columns indicate two integer values that are the starting and ending range. You will need to write a simple function that is passed three arguments. The first argument is the entire row of data as you need it duplicated across the repeating rows. The second argument is the beginning range, the third argument the ending range.
Now, in this function, initialize the Ans variable to blank, that is "". You will start a loop from your beginning range to your ending range. For each pass thru the loop, append to the Ans variable the argument holding the entire row, your delimiter, the loop variable, your delimiter, and the end variable. If you are not on the last pass thru the loop, also append the linefeed (and carriage-return if doing this for DOS files) character.
When this function is finished, it will have populated the Ans variable with all of the expanded rows, delimited by linefeed(+carriage return). In your DS transformer stage, call this function, AND THIS IS IMPORTANT, output the result as the only column to a Sequential file stage. Now read that Sequential file with an output link and continue processing. You will see that when the job runs, N number of rows will process to the Sequential stage, but when reading from the Sequential file N + M rows will read out.
The magic here is that the "row" written to the Sequential file is actually a big text string, but because the string contains LF+CR, it will physically become separate rows in the text file. DO NOT TRY TO CONTINUE TO PROCESS THE ROW WITHOUT FIRST WRITING TO A FILE, that's the trick.
DATABASE SOLUTION:
Bulk load your data into a relational database work table. Create another work table holding a single integer column and populate that table with values 0 to as large a range between your beginning and ending values as you will expect. Maybe be safe and use 1 million. Something like "create table numtable (x integer)".
Join the two tables together, but use sql like this:
One more aside, these are tricks of the data integration trade. If you have an urgent problem in the future, we can provide premium support that includes a phone number, or even us actually doing the work, but it comes at a price.
Plus, as is the case sometimes, the URGENCY comes from someone who is unqualified for the contract position they are filling which came at the expense of a better qualified person who was asking for higher compensation. The URGENCY tends to mean that we on this forum have to help the less qualified person avoid being tossed out on their rear from the customer site when the customer realizes they are ill-prepared to do the work.
Now, to your problem.
PURE DS SOLUTION:
Your last two columns indicate two integer values that are the starting and ending range. You will need to write a simple function that is passed three arguments. The first argument is the entire row of data as you need it duplicated across the repeating rows. The second argument is the beginning range, the third argument the ending range.
Now, in this function, initialize the Ans variable to blank, that is "". You will start a loop from your beginning range to your ending range. For each pass thru the loop, append to the Ans variable the argument holding the entire row, your delimiter, the loop variable, your delimiter, and the end variable. If you are not on the last pass thru the loop, also append the linefeed (and carriage-return if doing this for DOS files) character.
When this function is finished, it will have populated the Ans variable with all of the expanded rows, delimited by linefeed(+carriage return). In your DS transformer stage, call this function, AND THIS IS IMPORTANT, output the result as the only column to a Sequential file stage. Now read that Sequential file with an output link and continue processing. You will see that when the job runs, N number of rows will process to the Sequential stage, but when reading from the Sequential file N + M rows will read out.
The magic here is that the "row" written to the Sequential file is actually a big text string, but because the string contains LF+CR, it will physically become separate rows in the text file. DO NOT TRY TO CONTINUE TO PROCESS THE ROW WITHOUT FIRST WRITING TO A FILE, that's the trick.
DATABASE SOLUTION:
Bulk load your data into a relational database work table. Create another work table holding a single integer column and populate that table with values 0 to as large a range between your beginning and ending values as you will expect. Maybe be safe and use 1 million. Something like "create table numtable (x integer)".
Join the two tables together, but use sql like this:
This should give you a product of the two tables, returning all rows from the datatable, as well as a cartesian product result row where range1 plus that x value is less than or equal to the range 2 value.select a,b,c,d,e,range1+numtable.x,range2 from datatable, numtable
where datatable.range1+numtable.x <= datatable.range2
One more aside, these are tricks of the data integration trade. If you have an urgent problem in the future, we can provide premium support that includes a phone number, or even us actually doing the work, but it comes at a price.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Hi Shiv,
This can be achieved reading the input file (hope this input can be written into seq file) in the job control code and reading the records in the loop.
OpenSeq/ReadSeq/WriteSeq would help you reading/writing from/into the file. And last value to be incremented till you reach the target range.
HTWH.
Regards
Saravanan
This can be achieved reading the input file (hope this input can be written into seq file) in the job control code and reading the records in the loop.
OpenSeq/ReadSeq/WriteSeq would help you reading/writing from/into the file. And last value to be incremented till you reach the target range.
HTWH.
Regards
Saravanan
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Your range values are too large for integer arithmetic, and the leading zeroes need to be preserved. Search the forum for the code for my OpenSequentialFile routine and implement same. Then pre-process your file with the following before-stage subroutine. In its input value field supply the pathname of the original file and the pathname of another file that DataStage will subsequently process.
Code: Select all
SUBROUTINE GenerateMultipleRecords(InputArg, ErrorCode)
$COPYRIGHT "Copyright (c) 2005, Ray Wurlod. All rights reserved. May be used with this copyright notice intact."
* Generates N rows per input row, depending on values in final two fields.
DEFFUN OpenSequentialFile(FilePath, OpenMode, WriteMode, Logging) Calling "DSU.OpenSequentialFile"
Equate RoutineName To "GenerateMultipleRows"
Equate FieldDelimiter To "|"
ErrorCode = 0 ; * set this to a non-zero value to halt execution
ReadCount = 0
WriteCount = 0
ErrorCount = 0
* Process InputArg to determine file pathnames
Msg = ""
TrimInputArg = Trim(InputArg)
InputFilePath = Field(TrimInputArg, " ", 1, 1)
OutputFilePath = Field(TrimInputArg, " ", 2, 1)
If Len(InputFilePath) = 0
Then
Msg<-1> = "No input file path supplied."
End
If Len(OutputFilePath) = 0
Then
Msg<-1> = "No output file path supplied."
End
If Len(Msg) > 0
Then
Call DSLogWarn(Msg, RoutineName)
ErrorCode = 1 ; * halt execution of job on return
RETURN
End
* Open both files, one for reading, one for (over)writing.
fvInputFile = OpenSequentialFile(InputFilePath, "R", "A", "Y")
fvOutputFile = OpenSequentialFile(OutputFilePath, "W", "O", "Y")
If FileInfo(fvInputFile,0) And FileInfo(fvOutputFile,0)
Then
* Outer loop reads input file one line at a time.
Loop
While ReadSeq InputLine From fvInputFile
ReadCount += 1
* Parse generating values from input line.
Delimiters = Count(InputLine, FieldDelimiter )
ConstantFields = Field(Inputline, FieldDelimiter , 1, Delimiters - 1)
StartValue = Field(InputLine, FieldDelimiter , Delimiters, 1)
FinalValue = Field(InputLine, FieldDelimiter , Delimiters + 1, 1)
* Make sure each is numeric.
If Not(Num(StartValue) And Num(FinalValue))
Then
Call DSLogWarn("Non-numeric values for loop in line " : ReadCount " :. Row skipped." : @FM : InputLine, RoutineName)
ErrorCount += 1
Exit ; * exit from this iteration of outer (read) loop.
End
If FinalValue >= StartValue
Then
* Inner loop generates output lines.
FinalFieldValue = StartValue
Loop
While FinalFieldValue <= FinalValue
* Generate and write line to output file.
Line = ConstantFields : FieldDelimiter : FinalFieldValue
WriteSeq Line To fvOutputFile
On Error
Call DSLogWarn("Error writing to " : OutputFilePath : ", status = " : Status(), RoutineName)
ErrorCode = 2
ErrorCount += 1
GoTo MainExit ; * no point continuing
End
Then
WriteCount += 1
End
Else
Call DSLogWarn("Unable to write to " : OutputFilePath : ", status = " : Status(), RoutineName)
ErrorCode = 2
ErrorCount += 1
GoTo MainExit ; * no point continuing
End
* Use string arithmetic to preserve precision, Fmt() to preserve leading zero(es).
FinalFieldValue = Fmt(SAdd(FinalFieldValue, "1"), "R%":Len(FinalFieldValue))
Repeat
End
Else
Call DSLogWarn("Final value less than start value in line " : ReadCount : @FM : InputLine, RoutineName)
ErrorCount += 1
End
Repeat
MainExit:
* Pass back counters through system variables
@USER0 = ReadCount
@USER1 = WriteCount
@USER2 = ErrorCount
If FileInfo(fvOutputFile, 0) Then CloseSeq fvOutputFile
If FileInfo(fvInputFile, 0) Then CloseSeq fvInputFile
End
RETURN
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
So, did you get it working?????shivaprasad wrote:Urgent means i need it badly sir,
hope you help me, i have monday a delivery for this.
So pls help me.
Thanks
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
kcbland wrote:So, did you get it working?????shivaprasad wrote:Urgent means i need it badly sir,
hope you help me, i have monday a delivery for this.
So pls help me.
Thanks
In your previous message, you have told that the records can be read in a single row using the routine and process that into a file. But I've one question on this.
1. Lets say there are 4 cols. AA|BB|1|5
Once we process the record into a single row using Char(10) - line feed, it appears like this "AA|BB|1
AA|BB2
AA|BB|3
AA|BB|4
AA|BB|5"
2. How do we read this into another file with only 3 cols as different rows.
Hope the question is clear??
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Welcome aboard. :D
Re-read the thread from the top, where the original line had ten columns of which the final two delimited fields specified what should occur in the output.
Your question substitutes four for ten and can be solved in exactly the same ways. Not even any requirement to change the routine (though with such small numbers you could get by with integer arithmetic rather than needing the string math function SAdd()).
Olap. That's a Scandinavian name, no? Please edit your Profile and let us know a little about yourself.
Re-read the thread from the top, where the original line had ten columns of which the final two delimited fields specified what should occur in the output.
Your question substitutes four for ten and can be solved in exactly the same ways. Not even any requirement to change the routine (though with such small numbers you could get by with integer arithmetic rather than needing the string math function SAdd()).
Olap. That's a Scandinavian name, no? Please edit your Profile and let us know a little about yourself.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
The job looks like this:Olap wrote: 2. How do we read this into another file with only 3 cols as different rows.
Code: Select all
SEQ#1 --> XFM#1 w/routine --> SEQ#2 --> XFM#2 --> whatever
The first sequential file has to make the entire row available. The XFM#1 parses the row using FIELD(inlink.col, "|", #) functions where # is the element in the delimited col that contains the data you need to pass to the function. Since the last two columns are the ranging values, you can get them with the FIELD function.
A single hour of experimentation should be all that's required to get this to work. This is a rather simple task in DataStage.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle