multiple rows to single row

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

multiple rows to single row

Post by pandeesh »

Hi,

This topic has already been discussed many times, but the scenario I explain is a little bit different.

Basically I want to achieve the functionality of Oracle's WM_CONCAT() function in DataStage.

My input file contains 2 columns as given below:

id|value
--------
1|10
1|20
1|30
2|10
2|20


The output I am expecting is:

id|values
---------
1|10,20,30
2|10,20


I am not able to find any user-defined column function in the Aggregator stage.
All the functions there are built in, including sum and standard deviation.

Please help me with this.
pandeeswaran
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

I have found some posts on exactly this topic.
But I am not able to find the Vertical Pivot stage in my DS 7.5 Designer.
Is there any other way to achieve that?

Thanks
pandeeswaran
samyamkrishna
Premium Member
Posts: 258
Joined: Tue Jul 04, 2006 10:35 pm
Location: Toronto

Post by samyamkrishna »

Hi,

This can be achieved in a Transformer stage using stage variables.

This has been discussed a lot of times:

viewtopic.php?t=140653&highlight=
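For reference, here is a rough sketch of that approach, assuming the input is sorted by id (the in.* link name and the sv* stage variable names are illustrative, not from the posts above):

Code: Select all

* Stage variables, evaluated top to bottom for every row.
* svPrevId should default to "" so the first row starts a group.
svIsNewId = (in.id <> svPrevId)
svValues  = If svIsNewId Then in.value Else svValues : "," : in.value
svPrevId  = in.id

The output columns would then be id = in.id and values = svValues, written to a hashed file keyed on id so that the last (complete) row for each id overwrites the earlier partial ones.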

Hope this helps
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: multiple rows to single row

Post by chulett »

pandeesh wrote:This topic has already been discussed many times, but the scenario I explain is a little bit different.
Sorry, don't see anything different about this, just a plain old ordinary vertical pivot. The fact that you want everything in one column instead of multiple columns is just a variation on that theme, one that also has been discussed before.

Would you entertain using a Server job? That would make an interesting exercise for you and be... trivial to roll up. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Re: multiple rows to single row

Post by pandeesh »

chulett wrote:Would you entertain using a Server job? That would make an interesting exercise for you and be... trivial to roll up. :wink:
Yes.

I remember we need to create a hashed file from the source file and use that hashed file for the lookup.

We will use id as the key field, but the lookup will find many matches for that key. How do we need to handle this to get the required output?

Thanks
pandeeswaran
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

On a lookup "miss" you put the value in the second field. For any subsequent lookup "hit" you append a comma and the new value to the existing value.
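In derivation form that logic would look something like this sketch (in for the input link and ref for the reference link are illustrative names):

Code: Select all

* Derivation for the values column written back to the hashed file:
If ref.NOTFOUND Then in.value Else ref.values : "," : in.value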
-craig

"You can never have too many knives" -- Logan Nine Fingers
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

chulett wrote:On a lookup "miss" you put the value in the second field. For any subsequent lookup "hit" you append a comma and the new value to the existing value.
Craig,

I am struggling to create the hashed file.

In the Sequential File stage I defined id and value as two integer columns.

My data is as given below:

1,10
1,20
1,30

In the Hashed File stage I set id as the key.
So it doesn't allow duplicates, and only 1,30 gets loaded into the hashed file.

We can't use a Hashed File stage without a key.

How do I make this work?

Thanks
pandeeswaran
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As noted, you (in essence) insert the first record and update all subsequent records. And you do the 'update' by looking up the current value and appending/concatenating any new values to it before doing the destructive overwrite back to the hashed file.
-craig

"You can never have too many knives" -- Logan Nine Fingers
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

chulett wrote:As noted, you (in essence) insert the first record and update all subsequent records. And you do the 'update' by looking up the current value and appending/concatenating any new values to it before doing the destructive overwrite back to the hashed file.
Sorry Craig!! I am not able to understand. How do I insert the first record and update the remaining ones in the hashed file?

Are you suggesting something like the below?

Code: Select all

seq.file ---------> Transformer ---------> seq.file
(id,value)               |
                         |
                         |
seq.file --------> hash file (id as key)
(id,value)

pandeeswaran
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Moved this over to the Server forum.

No, do not 'preload' the hashed file. You should have one input from the source and you should be doing a reference lookup and writing back to the same hashed file in a single transformer.

You'll need to pre-create the hashed file empty for this to work: use a transformer as the source for that and mark the hashed file to be cleared.

Code: Select all

seq.file ---------> Transformer ---------> hashed write ----> seq.file
(id,value)               |
                         |
                         |
tx --------------> hashed file (read)

Make sure you do not cache the lookup. Once the hashed file has been built, the last link selects from it to build the actual output file.
-craig

"You can never have too many knives" -- Logan Nine Fingers
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Thanks Craig.

I have tried to create an empty hashed file in a separate job.

Code: Select all

transformer ---------> hashed file

But it threw a "no primary link" error.

As per the job design you mentioned, the primary source seq file will contain 3 records and the reference hashed file contains no records.

So what condition do we need to give in the transformer?

Is it ref_link.NOTFOUND?

I need the values of the value column comma-separated, so I need to make that column a varchar.

Please help me.

Thanks
pandeeswaran
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

To source from a transformer, you need to do two things:

1) Define a stage variable. Note that I did not say "use" although you certainly could, it just needs to exist to allow the job to compile.

2) Set your output link constraint to let out only the number of records that you want; without one it will output rows... well, forever. Something like @OUTROWNUM=1 would work for one record, but here you don't actually need (or want) a record to flow down the link. You just need the "clear" to fire when the stage initializes, so a constraint of @FALSE would be appropriate here. And there's really no reason to separate that out into its own job.
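For example, a minimal sketch of that source transformer under those assumptions (the stage variable name is illustrative and its value is never used):

Code: Select all

* Stage variable -- exists only so the job compiles:
svDummy = 1

* Output link constraint -- no row ever flows down the link,
* but the hashed file is still created and cleared when the
* stage initializes:
@FALSE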

Yes, you could use the NOTFOUND link status variable to check the lookup: ref_link.NOTFOUND would be true when the lookup 'fails' and Not(ref_link.NOTFOUND) would be true when the lookup succeeds. I would typically set up both conditions as well-named boolean stage variables and then reference them in any dependent derivations.
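Something along these lines (the sv* names are illustrative):

Code: Select all

* Boolean stage variables wrapping the link status:
svLookupFailed = ref_link.NOTFOUND
svLookupHit    = Not(ref_link.NOTFOUND)

* values column derivation using them:
If svLookupFailed Then in.value Else ref_link.values : "," : in.value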
-craig

"You can never have too many knives" -- Logan Nine Fingers
maks475
Participant
Posts: 3
Joined: Wed Oct 28, 2009 12:59 am

Re: multiple rows to single row

Post by maks475 »

Hi, I guess you have only two fields, col1 and col2. Then do the following:

source -> Sort -> Transformer -> Remove Duplicates -> target
(in the Transformer, take stage variables and define them as below)

stg_var1 = col1
stg_var2 = col2
stg_var3 = stg_var4 (this is the final string to populate in the target)
stg_var4 = if stg_var1 = stg_var5 then stg_var3 : "," : stg_var2 else stg_var2
stg_var5 = stg_var1
Note: define all stage variables and their default values.

In the Remove Duplicates stage, retain the last record.

Correct me if I am wrong.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

OK... you are wrong. And welcome. :wink:

Keep in mind the Job Type being discussed when replying. This is the Server forum so no "Remove Duplicates" stage and hence the recommendations to use a Transformer.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

In Server jobs the Aggregator stage has a "keep first/last" capability.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.