Combine Records

Manfred Hagedorn · Post by **Manfred Hagedorn** » Wed Mar 25, 2009 4:51 am

Hello,
i have tried and searched for hours, and i don't find a solution!
I have the following type of data input:

Record-Id, Field-Name, Field-Value
1,Name,Hagedorn
1,Age,50
2,Name,Schmitz
2,City,Hamburg
2,Income,50000
3,Name,Meier
3,City,Berlin

You see, not all columns are always available for each id.

And i want to get the following output:
Record-Id, Name, City, Age, Income
1,Hagedorn,,50,
2,Schmitz,Hamburg,,50000
3,Meier,Berlin,,

All Input Records with the same ID should be in one Output record, and the values should be assigned to the Outputfields accordingly to the Fieldnames.

The real file has more than 300 possible Fieldnames!

The Combine Records Stage seems not to solve my problem.

Is it in general possible with Datastage?

Pagadrai · Post by **Pagadrai** » Wed Mar 25, 2009 5:16 am

Hi,
In this case, your target will have 300+ column names.
I can see a way using lookup.

Lets say your source file has all distinct ids.
Your lookup will be the different lookup file sets created for each field value like name etc.

But the might jobs look ugly with so many links & stages.

you can split the task into multiple jobs to make it simpler.

ray.wurlod · Post by **ray.wurlod** » Wed Mar 25, 2009 6:18 am

If the inport is sorted by Record_ID, then you can assemble the output record putting the correct value in the correct field via If..Then..Else statements in a Transformer stage (use stage variables initialized to null, and name the stage variable in the Else part). Then run the output through a Remove Duplicates stage keeping only the final row for each Record_ID value.

Manfred Hagedorn · Post by **Manfred Hagedorn** » Wed Mar 25, 2009 6:22 am

Hi Pagadrai,

thanks a lot for your feedback!
The main reason for posting this issue was, that i expected, that there is an easy solution within Datastage, but i simply don't find or don't know it.

Has anybody else any idea, how to solve this problem?
Best regards
Manfred

ray.wurlod · Post by **ray.wurlod** » Wed Mar 25, 2009 6:26 am

You were probably typing that while I was typing my solution.

Manfred Hagedorn · Post by **Manfred Hagedorn** » Thu Mar 26, 2009 3:37 am

I understand, that there is no smart solution within Datastage.
I think, i will use a SAS-Stage within Datastage and do a kind of pre-processing with SAS-coding and write an output file which will be used for further processing with real Datastage steps.

Thanks for your support/feedback.
Manfred

sbass1 · Post by **sbass1** » Thu Mar 26, 2009 6:27 am

Yeah I was going to suggest SAS since it supports first.key and last.key processing plus conditional output.

If you don't have SAS licensed (although it sounds like you do) you could also do this in perl, loading a hash (associative array with character keys) and outputing the hash (accumulated record) when the key changes.

DSXchange

Combine Records

Combine Records

Re: Combine Records

Re: Combine Records

Re: Combine Records