how to sum the number on the csv file
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 149
- Joined: Fri Oct 15, 2004 12:13 am
- Location: Toronto,divine_auro@yahoo.co.in
- Contact:
how to sum the number on the csv file
I have a .csv file.
1,10
2,10,20
3,10,20,30
4,10,20,25,35
5,25,35,25,15,40
.
.
.
45,12,13,14,25,...........
I need to insert the records into a table. It has 2 columns. One is Trans_No and another is Total_Amt.
The values should be
Trans_No,Total_Amt
1,10
2,30
3,60
4,90
So on.
Can ay one help me how to sum the numbers on the basis of the data present in the 1st column in the .csv file.
1,10
2,10,20
3,10,20,30
4,10,20,25,35
5,25,35,25,15,40
.
.
.
45,12,13,14,25,...........
I need to insert the records into a table. It has 2 columns. One is Trans_No and another is Total_Amt.
The values should be
Trans_No,Total_Amt
1,10
2,30
3,60
4,90
So on.
Can ay one help me how to sum the numbers on the basis of the data present in the 1st column in the .csv file.
With Regards
Biswajit
Biswajit
-
- Premium Member
- Posts: 149
- Joined: Fri Oct 15, 2004 12:13 am
- Location: Toronto,divine_auro@yahoo.co.in
- Contact:
Hi,
It is a dynamic file. Everytime I used to get different number of column. It depends upaon the no. of transaction . If the transaction is 2 than I only get 3 records ie.
2, 10,20.
If multiple transaction than accordingly file will be generate ie.
2.10,20
.
.
.
99, 2,45,23,67,.........
It is a dynamic file. Everytime I used to get different number of column. It depends upaon the no. of transaction . If the transaction is 2 than I only get 3 records ie.
2, 10,20.
If multiple transaction than accordingly file will be generate ie.
2.10,20
.
.
.
99, 2,45,23,67,.........
With Regards
Biswajit
Biswajit
As can be seen from the file, the first column gives the number of values to be added. Call a routine. A for loop which counts for the number of times of this first column value should work.
If we want to do this in a Transformer, as Craig suggested if we know the maximum value in the first column, can seperate each value and can store each individual value in that many stage variables and then finally use a SvTotal stage variable to sum up these values. Field function should be enough in this case. Read the file in a single column and then seperate the other column values as suggested.
If we want to do this in a Transformer, as Craig suggested if we know the maximum value in the first column, can seperate each value and can store each individual value in that many stage variables and then finally use a SvTotal stage variable to sum up these values. Field function should be enough in this case. Read the file in a single column and then seperate the other column values as suggested.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Re: how to sum the number on the csv file
Hi
It may not be the straight forward way. But you can try something like below!
Use the sequence for looping. Using the DCOUNT function to find how many iteration it need to loop and based on that you can try to do something.
DS User
It may not be the straight forward way. But you can try something like below!
Use the sequence for looping. Using the DCOUNT function to find how many iteration it need to loop and based on that you can try to do something.
DS User
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Re: how to sum the number on the csv file
That would be woefully inefficient. The task is to process the contents of a file, which DataStage does (without sequence) very efficiently.SURA wrote:Hi
It may not be the straight forward way. But you can try something like below!
Use the sequence for looping. Using the DCOUNT function to find how many iteration it need to loop and based on that you can try to do something.
DS User
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I've done similar logic in 7.5 without a pivot in the past. Can't remember the exact logic. If you leave each CSV record as a string, you can use a column import to pack each field into an array (or was it a vector column). Once in this format, you can use a combination of he restructure stages (including sub record) to perform a pivot wihout knowing the column count.
I can post the info later if your still stuck
I can post the info later if your still stuck
-
- Premium Member
- Posts: 149
- Joined: Fri Oct 15, 2004 12:13 am
- Location: Toronto,divine_auro@yahoo.co.in
- Contact:
Hi ,
Thanks to all. I have tried to do it without using any loop or routine. But DS can not read the file as the number of coulmn is not same for each row. But as suggested by all of you, I'll try to implement all the suggestion one at a time.Thanks a lot to all of you for the valuable suggestion. I'll keep update everybody.
Thanks to all. I have tried to do it without using any loop or routine. But DS can not read the file as the number of coulmn is not same for each row. But as suggested by all of you, I'll try to implement all the suggestion one at a time.Thanks a lot to all of you for the valuable suggestion. I'll keep update everybody.
With Regards
Biswajit
Biswajit
Its a fairly well known problem in datastage - this is an example of horizontal variable pivot ( number of fields to be pivoted cannot be predicted in advance, but of you have a fair idea of the maximum - its pretty simple. Another step i'd recommend is to replace the first Comma with another delimiter , so that the valueset can be read as a seperate column from the code.
use Dcount to get count of delimiters from valueset column , and add 1 to this to get the number of fields in valueset column , derive another column witha hardcoded value of 1 or A to lookup with the output of a row generator that has 2 columns ( one is constant 1 or A to force the lookup to happen , and another is a column that cycles from 1 to MaxDelimiterCount , in increments of 1 ) . FOr Eg - if you assume the valueset will have a maximum of 10 values , MaxDelimiterCount =10 . force lookup to happen with A= A or 1=1, For every row from input file, you will now get 10 rows , with a column having values 1, 2, 3, 4, 5, etc upto 10.
Compare this column with the delimiter count derived in previous step and pass to output only thsoe rows where rownum ( 1,2,3,) etc <= Delimiter COunt for that input record, next use a field() function with valueset column as primary argument , and comma as delimiter , rownum as occurence and 1 as instance to parse the field out . USe suitable constraints on transformer to get only rows in the out put where output of field function is not null , then sort by CODe column and group and aggregate to get your output (this works on older editions of datastage where you cannot loop within a transformer) . I can also send a dsx file if poster requires it
- cost is negotiable ![Twisted Evil :twisted:](./images/smilies/icon_twisted.gif)
use Dcount to get count of delimiters from valueset column , and add 1 to this to get the number of fields in valueset column , derive another column witha hardcoded value of 1 or A to lookup with the output of a row generator that has 2 columns ( one is constant 1 or A to force the lookup to happen , and another is a column that cycles from 1 to MaxDelimiterCount , in increments of 1 ) . FOr Eg - if you assume the valueset will have a maximum of 10 values , MaxDelimiterCount =10 . force lookup to happen with A= A or 1=1, For every row from input file, you will now get 10 rows , with a column having values 1, 2, 3, 4, 5, etc upto 10.
Compare this column with the delimiter count derived in previous step and pass to output only thsoe rows where rownum ( 1,2,3,) etc <= Delimiter COunt for that input record, next use a field() function with valueset column as primary argument , and comma as delimiter , rownum as occurence and 1 as instance to parse the field out . USe suitable constraints on transformer to get only rows in the out put where output of field function is not null , then sort by CODe column and group and aggregate to get your output (this works on older editions of datastage where you cannot loop within a transformer) . I can also send a dsx file if poster requires it
![Smile :)](./images/smilies/icon_smile.gif)
![Twisted Evil :twisted:](./images/smilies/icon_twisted.gif)