Page 1 of 1

Combine Records

Posted: Wed Mar 25, 2009 4:51 am
by Manfred Hagedorn
Hello,
i have tried and searched for hours, and i don't find a solution!
I have the following type of data input:

Record-Id, Field-Name, Field-Value
1,Name,Hagedorn
1,Age,50
2,Name,Schmitz
2,City,Hamburg
2,Income,50000
3,Name,Meier
3,City,Berlin

You see, not all columns are always available for each id.

And i want to get the following output:
Record-Id, Name, City, Age, Income
1,Hagedorn,,50,
2,Schmitz,Hamburg,,50000
3,Meier,Berlin,,

All Input Records with the same ID should be in one Output record, and the values should be assigned to the Outputfields accordingly to the Fieldnames.

The real file has more than 300 possible Fieldnames!

The Combine Records Stage seems not to solve my problem.

Is it in general possible with Datastage?

Re: Combine Records

Posted: Wed Mar 25, 2009 5:16 am
by Pagadrai
Hi,
In this case, your target will have 300+ column names.
I can see a way using lookup.

Lets say your source file has all distinct ids.
Your lookup will be the different lookup file sets created for each field value like name etc.

But the might jobs look ugly with so many links & stages. :(
you can split the task into multiple jobs to make it simpler.

Posted: Wed Mar 25, 2009 6:18 am
by ray.wurlod
If the inport is sorted by Record_ID, then you can assemble the output record putting the correct value in the correct field via If..Then..Else statements in a Transformer stage (use stage variables initialized to null, and name the stage variable in the Else part). Then run the output through a Remove Duplicates stage keeping only the final row for each Record_ID value.

Re: Combine Records

Posted: Wed Mar 25, 2009 6:22 am
by Manfred Hagedorn
Hi Pagadrai,

thanks a lot for your feedback!
The main reason for posting this issue was, that i expected, that there is an easy solution within Datastage, but i simply don't find or don't know it.

Has anybody else any idea, how to solve this problem?
Best regards
Manfred

Posted: Wed Mar 25, 2009 6:26 am
by ray.wurlod
You were probably typing that while I was typing my solution.

Re: Combine Records

Posted: Thu Mar 26, 2009 3:37 am
by Manfred Hagedorn
I understand, that there is no smart solution within Datastage.
I think, i will use a SAS-Stage within Datastage and do a kind of pre-processing with SAS-coding and write an output file which will be used for further processing with real Datastage steps.

Thanks for your support/feedback.
Manfred

Posted: Thu Mar 26, 2009 6:27 am
by sbass1
Yeah I was going to suggest SAS since it supports first.key and last.key processing plus conditional output.

If you don't have SAS licensed (although it sounds like you do) you could also do this in perl, loading a hash (associative array with character keys) and outputing the hash (accumulated record) when the key changes.