Page 1 of 1

how to sum the number on the csv file

Posted: Fri Aug 05, 2011 9:35 am
by divine
I have a .csv file.
1,10
2,10,20
3,10,20,30
4,10,20,25,35
5,25,35,25,15,40
.
.
.
45,12,13,14,25,...........

I need to insert the records into a table. It has 2 columns. One is Trans_No and another is Total_Amt.

The values should be
Trans_No,Total_Amt
1,10
2,30
3,60
4,90
So on.
Can ay one help me how to sum the numbers on the basis of the data present in the 1st column in the .csv file.

Posted: Fri Aug 05, 2011 11:15 am
by chulett
Horizontal pivot followed by Aggregation.

Posted: Fri Aug 05, 2011 11:38 am
by divine
Hi,
I'll try for that. But here is only problem we do not predict the number of records.
Thanks for the suggestion and I'll update accordingly.

Posted: Fri Aug 05, 2011 12:09 pm
by chulett
The number of records don't matter... do you know the maximum number of columns you'll have?

Posted: Fri Aug 05, 2011 12:26 pm
by divine
Hi,
It is a dynamic file. Everytime I used to get different number of column. It depends upaon the no. of transaction . If the transaction is 2 than I only get 3 records ie.
2, 10,20.
If multiple transaction than accordingly file will be generate ie.
2.10,20
.
.
.
99, 2,45,23,67,.........

Posted: Fri Aug 05, 2011 1:21 pm
by chulett
Still... any idea what the maximum number could be?

Posted: Fri Aug 05, 2011 1:41 pm
by divine
HI,
Till now I have seen the file with max. of 26 column. First coulmn contains 25 and rest is the 25 values, all together 26 column.

Posted: Sun Aug 07, 2011 1:57 pm
by harshada
As can be seen from the file, the first column gives the number of values to be added. Call a routine. A for loop which counts for the number of times of this first column value should work.

If we want to do this in a Transformer, as Craig suggested if we know the maximum value in the first column, can seperate each value and can store each individual value in that many stage variables and then finally use a SvTotal stage variable to sum up these values. Field function should be enough in this case. Read the file in a single column and then seperate the other column values as suggested.

Posted: Sun Aug 07, 2011 3:19 pm
by ray.wurlod
This is an ideal application for the looping capability that exists in the version 8.5 Transformer stage.

Re: how to sum the number on the csv file

Posted: Sun Aug 07, 2011 5:45 pm
by SURA
Hi

It may not be the straight forward way. But you can try something like below!

Use the sequence for looping. Using the DCOUNT function to find how many iteration it need to loop and based on that you can try to do something.

DS User

Re: how to sum the number on the csv file

Posted: Sun Aug 07, 2011 9:25 pm
by ray.wurlod
SURA wrote:Hi

It may not be the straight forward way. But you can try something like below!

Use the sequence for looping. Using the DCOUNT function to find how many iteration it need to loop and based on that you can try to do something.

DS User
That would be woefully inefficient. The task is to process the contents of a file, which DataStage does (without sequence) very efficiently.

Posted: Mon Aug 08, 2011 2:40 am
by bmsq
I've done similar logic in 7.5 without a pivot in the past. Can't remember the exact logic. If you leave each CSV record as a string, you can use a column import to pack each field into an array (or was it a vector column). Once in this format, you can use a combination of he restructure stages (including sub record) to perform a pivot wihout knowing the column count.

I can post the info later if your still stuck

Posted: Mon Aug 08, 2011 8:55 am
by divine
Hi ,
Thanks to all. I have tried to do it without using any loop or routine. But DS can not read the file as the number of coulmn is not same for each row. But as suggested by all of you, I'll try to implement all the suggestion one at a time.Thanks a lot to all of you for the valuable suggestion. I'll keep update everybody.

Posted: Mon Aug 08, 2011 9:27 am
by chulett
Read the records as one long string field, you can parse out the individual elements from there especially seeing as how the first 'column' has the total number of columns in it.

Posted: Mon Aug 08, 2011 12:29 pm
by rameshrr3
Its a fairly well known problem in datastage - this is an example of horizontal variable pivot ( number of fields to be pivoted cannot be predicted in advance, but of you have a fair idea of the maximum - its pretty simple. Another step i'd recommend is to replace the first Comma with another delimiter , so that the valueset can be read as a seperate column from the code.

use Dcount to get count of delimiters from valueset column , and add 1 to this to get the number of fields in valueset column , derive another column witha hardcoded value of 1 or A to lookup with the output of a row generator that has 2 columns ( one is constant 1 or A to force the lookup to happen , and another is a column that cycles from 1 to MaxDelimiterCount , in increments of 1 ) . FOr Eg - if you assume the valueset will have a maximum of 10 values , MaxDelimiterCount =10 . force lookup to happen with A= A or 1=1, For every row from input file, you will now get 10 rows , with a column having values 1, 2, 3, 4, 5, etc upto 10.
Compare this column with the delimiter count derived in previous step and pass to output only thsoe rows where rownum ( 1,2,3,) etc <= Delimiter COunt for that input record, next use a field() function with valueset column as primary argument , and comma as delimiter , rownum as occurence and 1 as instance to parse the field out . USe suitable constraints on transformer to get only rows in the out put where output of field function is not null , then sort by CODe column and group and aggregate to get your output (this works on older editions of datastage where you cannot loop within a transformer) . I can also send a dsx file if poster requires it :)- cost is negotiable :twisted: